prometheus / blackbox_exporter

Blackbox prober exporter
https://prometheus.io
Apache License 2.0

"Error writing to socket" err="write ip 0.0.0.0->${DST_IP}: sendmsg: message too long" with the icmp module #1249

Open 0megam opened 1 month ago

0megam commented 1 month ago

Hello. We run blackbox_exporter as a Docker container (official image) to probe some of our infrastructure with the icmp module. After a few days or weeks of constant probing, blackbox_exporter starts returning the probe_success metric with value 0 for some targets. If I curl blackbox_exporter for one of these targets with the &debug=true parameter, I get the following output:

curl 'http://192.168.10.2:9115/probe?module=icmp&target=${DST_IP}&debug=true'
Logs for the probe:
ts=2024-05-30T08:08:07.411598137Z caller=main.go:181 module=icmp target=${DST_IP} level=info msg="Beginning probe" probe=icmp timeout_seconds=3
ts=2024-05-30T08:08:07.411699339Z caller=icmp.go:91 module=icmp target=${DST_IP} level=info msg="Resolving target address" target=${DST_IP} ip_protocol=ip6
ts=2024-05-30T08:08:07.411718022Z caller=icmp.go:91 module=icmp target=${DST_IP} level=info msg="Resolved target address" target=${DST_IP} ip=${DST_IP}
ts=2024-05-30T08:08:07.411733126Z caller=handler.go:120 module=icmp target=${DST_IP} level=info msg="Creating socket"
ts=2024-05-30T08:08:07.411798973Z caller=handler.go:120 module=icmp target=${DST_IP} level=info msg="Creating ICMP packet" seq=26515 id=60301
ts=2024-05-30T08:08:07.41186266Z caller=handler.go:120 module=icmp target=${DST_IP} level=info msg="Writing out packet"
ts=2024-05-30T08:08:07.411891262Z caller=handler.go:120 module=icmp target=${DST_IP} level=debug msg="Overriding TTL (raw IPv4)" ttl=64
ts=2024-05-30T08:08:07.412139679Z caller=handler.go:120 module=icmp target=${DST_IP} level=warn msg="Error writing to socket" err="write ip 0.0.0.0->${DST_IP}: sendmsg: message too long"
ts=2024-05-30T08:08:07.412223981Z caller=main.go:181 module=icmp target=${DST_IP} level=error msg="Probe failed" duration_seconds=0.000571897

Metrics that would have been returned:
# HELP probe_dns_lookup_time_seconds Returns the time taken for probe dns lookup in seconds
# TYPE probe_dns_lookup_time_seconds gauge
probe_dns_lookup_time_seconds 1.7268e-05
# HELP probe_duration_seconds Returns how long the probe took to complete in seconds
# TYPE probe_duration_seconds gauge
probe_duration_seconds 0.000571897
# HELP probe_icmp_duration_seconds Duration of icmp request by phase
# TYPE probe_icmp_duration_seconds gauge
probe_icmp_duration_seconds{phase="resolve"} 1.7268e-05
probe_icmp_duration_seconds{phase="rtt"} 0
probe_icmp_duration_seconds{phase="setup"} 0.000128335
# HELP probe_ip_addr_hash Specifies the hash of IP address. It's useful to detect if the IP address changes.
# TYPE probe_ip_addr_hash gauge
probe_ip_addr_hash 2.164536847e+09
# HELP probe_ip_protocol Specifies whether probe ip protocol is IP4 or IP6
# TYPE probe_ip_protocol gauge
probe_ip_protocol 4
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 0

Module configuration:
prober: icmp
timeout: 3s
http:
  ip_protocol_fallback: true
  follow_redirects: true
  enable_http2: true
tcp:
  ip_protocol_fallback: true
icmp:
  ip_protocol_fallback: true
  payload_size: 1472
  dont_fragment: true
  ttl: 64
dns:
  ip_protocol_fallback: true
  recursion_desired: true

The problem goes away after the blackbox_exporter container is restarted, but it resurfaces after a few days or weeks. It is worth mentioning that other destinations probed by the same blackbox_exporter with the same module work fine; only a few specific ones return sendmsg: message too long.
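
To tell which targets are affected without reading each debug page by hand, the output of /probe?...&debug=true can be checked by a small script. The following is only a sketch: the helper names debug_url and parse_probe_debug are made up for illustration, the base URL is the one from the curl command above, and SAMPLE is a shortened copy of the output pasted in this issue.

```python
import re
from urllib.parse import urlencode

# Shortened copy of the debug output shown in this issue.
SAMPLE = '''Logs for the probe:
ts=2024-05-30T08:08:07.412139679Z caller=handler.go:120 module=icmp target=${DST_IP} level=warn msg="Error writing to socket" err="write ip 0.0.0.0->${DST_IP}: sendmsg: message too long"
ts=2024-05-30T08:08:07.412223981Z caller=main.go:181 module=icmp target=${DST_IP} level=error msg="Probe failed" duration_seconds=0.000571897

Metrics that would have been returned:
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 0
'''


def debug_url(base: str, module: str, target: str) -> str:
    """Build the /probe URL with debug=true (hypothetical helper)."""
    query = urlencode({"module": module, "target": target, "debug": "true"})
    return f"{base}/probe?{query}"


def parse_probe_debug(text: str) -> dict:
    """Extract probe_success and any socket write error from debug output."""
    success = None
    for line in text.splitlines():
        line = line.strip()
        # Skip "# HELP" / "# TYPE" comments; only the sample line starts this way.
        if line.startswith("probe_success "):
            success = float(line.split()[1]) == 1.0
    match = re.search(r'msg="Error writing to socket" err="([^"]*)"', text)
    return {
        "success": success,
        "write_error": match.group(1) if match else None,
    }


result = parse_probe_debug(SAMPLE)
```

Fetching debug_url("http://192.168.10.2:9115", "icmp", target) for every target and passing the response body to parse_probe_debug would list the targets that currently fail with sendmsg: message too long.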

Host operating system: output of uname -a

Linux host-1 5.15.0-102-generic #112-Ubuntu SMP Tue Mar 5 16:50:32 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

blackbox_exporter version: output of blackbox_exporter --version

blackbox_exporter, version 0.24.0 (branch: HEAD, revision: 0b0467473916fd9e8526e2635c2a0b1c56011dff)
  build user:       root@e5bbfcc8184e
  build date:       20230516-11:07:25
  go version:       go1.20.4
  platform:         linux/amd64
  tags:             netgo

What is the blackbox.yml module config.

modules:
  icmp:
    icmp:
      dont_fragment: true
      payload_size: 1472
    prober: icmp
    timeout: 3s
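
For context on the config above, here is a rough calculation of the packet size it produces. With payload_size: 1472, an IPv4 ICMP echo packet is exactly 1500 bytes once the 8-byte ICMP header and the 20-byte IPv4 header (no options) are added, and with dont_fragment: true it cannot be fragmented. I have not confirmed the cause of the error; one possibility, stated here purely as an assumption, is that the MTU the kernel uses toward the failing targets is lower than 1500. The function names and the 1492 value below are illustrative only.

```python
# Standard minimum header sizes in bytes.
IPV4_HEADER = 20  # IPv4 header without options
IPV6_HEADER = 40  # fixed IPv6 header
ICMP_HEADER = 8   # ICMP / ICMPv6 echo header


def ip_header_size(ip_version: int) -> int:
    """Return the minimum IP header size for IPv4 or IPv6."""
    return IPV4_HEADER if ip_version == 4 else IPV6_HEADER


def packet_size(payload_size: int, ip_version: int = 4) -> int:
    """On-wire IP packet size for an ICMP echo with the given payload."""
    return ip_header_size(ip_version) + ICMP_HEADER + payload_size


def max_payload(mtu: int, ip_version: int = 4) -> int:
    """Largest ICMP payload that fits in one unfragmented packet."""
    return mtu - ip_header_size(ip_version) - ICMP_HEADER


def fits(payload_size: int, mtu: int, ip_version: int = 4) -> bool:
    """True if the packet can be sent without fragmentation."""
    return packet_size(payload_size, ip_version) <= mtu


size_from_config = packet_size(1472)      # payload_size from this issue
fits_default_mtu = fits(1472, 1500)       # typical Ethernet MTU
fits_smaller_mtu = fits(1472, 1492)       # 1492 is a hypothetical lower MTU
```

Under these assumptions the configured payload leaves no headroom: it fits a 1500-byte MTU exactly and fails for anything smaller, where max_payload(1492) would be 1464.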