Cacti / cacti

Cacti ™
http://www.cacti.net
GNU General Public License v2.0

SPINE Poller reports the devices down, however, the devices are up #5735

Closed MSS970 closed 2 months ago

MSS970 commented 2 months ago

Describe the bug

Two Errors:

  1. SPINE: Poller[Main Poller] Device[device-name] Hostname[ip_address] ERROR: HOST EVENT: Device is DOWN Message: ICMP: Ping timed out
  2. SPINE: Poller[Main Poller] Device[device-name] Hostname[ip_address] ERROR: HOST EVENT: Device is DOWN Message: Device did not respond to SNMP, ICMP: Device is Alive

However, when I click on the device (edit) page, both the SNMP and ICMP test results show the device is up. When I attempt to access the device, it is accessible, up, running, and operational.

To Reproduce

Steps to reproduce the behavior:

  1. Go to logs, search for SPINE logs with error.
  2. The two kinds of errors above are found.
  3. Click on the device, the device is found up and running, the SNMP and ICMP test results are displayed.
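To gauge how often each of the two messages occurs, the log can be filtered for these events. The following is a minimal Python sketch, assuming the log lines follow exactly the layout of the two lines quoted above; the pattern is inferred from those lines only, not from spine's source, and `find_down_events` is a hypothetical helper name.

```python
import re

# Pattern for the SPINE "HOST EVENT" lines quoted in this report.
# The field layout is an assumption inferred from those two lines.
# "\[+" tolerates a doubled opening bracket after "Poller".
DOWN_EVENT = re.compile(
    r"SPINE: Poller\[+(?P<poller>[^\]]+)\] "
    r"Device\[(?P<device>[^\]]+)\] "
    r"Hostname\[(?P<hostname>[^\]]+)\] "
    r"ERROR: HOST EVENT: Device is DOWN Message: (?P<message>.*)"
)


def find_down_events(lines):
    """Return one dict per line that reports a device as DOWN."""
    events = []
    for line in lines:
        match = DOWN_EVENT.search(line)
        if match:
            events.append(match.groupdict())
    return events


sample = [
    "SPINE: Poller[Main Poller] Device[device-name] Hostname[ip_address] "
    "ERROR: HOST EVENT: Device is DOWN Message: ICMP: Ping timed out",
    "SPINE: Poller[Main Poller] Device[device-name] Hostname[ip_address] "
    "ERROR: HOST EVENT: Device is DOWN Message: "
    "Device did not respond to SNMP, ICMP: Device is Alive",
]

events = find_down_events(sample)
for event in events:
    print(event["device"], "->", event["message"])
```

Grouping the results by `message` would show whether a given device fails on ICMP, on SNMP, or on both.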

Can you kindly extend your support to fix this problem?

bmfmancini commented 2 months ago

What availability check do you have set for the device?

Is it just ping or snmp and ping?

Also, can you run spine like this, where device_id is the ID of the device showing down:

./spine -R -v 5 -f device_id -l device_id

On Wed, Apr 24, 2024, 23:55 mohabuhtig @.***> wrote:

    • OS: Cacti 1.3 [dev] on Windows 2019 server
    • SPINE: 1.3
    • FPING: 4.2 for Windows
    • Net-SNMP 5.9.3 for Windows


MSS970 commented 2 months ago

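Hi Sean,

What availability check do you have set for the device? Is it just ping or snmp and ping?

  • For some devices, I've configured the availability check with ping only, as SNMP is not required in these cases.
  • For other devices, the availability check is configured with SNMP and ping.

However, it is the same issue for both of the above types of devices.

./spine -R -v 5 -f device_ID -l device_id

Below are the results:

D:>spine -R -v 5 - f 36 -l 36
SPINE 1.3.0 Copyright 2004-2023 by The Cacti Group

D:>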

bmfmancini commented 2 months ago

Sorry, it's -V.

Do you have fping installed?
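With the flag written as -V, the single-device run requested in this thread can be sketched as a small Python wrapper. This is only an illustration: it uses the flags quoted in the thread (-R, -V, -f, -l) and nothing else, and `spine_path`, `build_spine_command`, and `run_spine` are hypothetical names, not part of Cacti.

```python
import subprocess


def build_spine_command(device_id, verbosity=5, spine_path="./spine"):
    """Build the argument list for polling a single device.

    Only the flags quoted in this thread are used; spine_path is an
    assumed location of the binary.
    """
    return [
        spine_path,
        "-R",
        "-V", str(verbosity),
        "-f", str(device_id),
        "-l", str(device_id),
    ]


def run_spine(device_id, **kwargs):
    """Run spine once and return (exit code, combined stdout and stderr)."""
    result = subprocess.run(
        build_spine_command(device_id, **kwargs),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout + result.stderr


# Print the command instead of running it, so the sketch is safe to try.
print(" ".join(build_spine_command(36)))
```

Building the arguments as a list avoids stray spaces inside a flag, such as "- f" instead of "-f".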

On Thu, Apr 25, 2024, 07:42 mohabuhtig @.***> wrote:

Hi Sean, What availability check do you have set for the device. Is it just ping or snmp and ping?

  • for some devices, I've configured the availability check with ping only as the SNMP is not required in these cases.
  • other devices, the availability check is configured with snmp and ping.


MSS970 commented 2 months ago

Hi Sean, fping is installed, and the device is pingable using both Windows ping and fping; I am using fping. The ping method is ICMP, as the OS is Windows.

./spine -R -V 5 -f device_ID -l device_id

Below are the results:

D:\cacti\spine>spine -R -V 5 -f 36 -l 36
SPINE: Using spine config file [spine.conf]
0 [] spine 1527 cygwin_exception::open_stackdumpfile: Dumping stack trace to spine.exe.stackdump

D:\cacti\spine>

and below is the content of the spine.exe.stackdump:

Exception: STATUS_STACK_OVERFLOW at rip=0001004137B6
rax=000000000000E200 rbx=0000000100426060 rcx=00000007FFE03DD0
rdx=00000001004671CC rsi=0000000000000000 rdi=0000000100497E00
r8 =0000000A0005DF50 r9 =00000000FFFFFFFE r10=0000000800000000
r11=0000000100405462 r12=00000007FFF01440 r13=0000000100497E10
r14=0000000A0005E070 r15=000000010041B75D
rbp=00000007FFF00E40 rsp=00000007FFF00DB8
program=D:\cacti\spine\spine.exe, pid 1527, thread
cs=0033 ds=002B es=002B fs=0053 gs=002B ss=002B
Stack trace:
Frame        Function     Args
0007FFF00E40 0001004137B6 (000100402A75, 000100426060, 000000000000, 000100497E00) spine.exe+0x137B6
0007FFF00E40 00000010A200 (000100426060, 000000000000, 000100497E00, 0007FFF01440)
0007FFF00E40 00010041B780 (000000000000, 000100497E00, 0007FFF01440, 000100497E10) spine.exe+0x1B780
0007FFF00E40 000100402A75 (00010041B3CC, 00010041B6AB, 000A0005E070, 000000000000) spine.exe+0x2A75
0007FFF00E40 000100405EC3 (000000000000, 000000000000, 000100425040, 000100418DBB) spine.exe+0x5EC3
000100418DB6 000100414687 (0007FFFFCC60, 000000000000, 00000000000A, 0007FFFFCD30) spine.exe+0x14687
0007FFFFCD30 7FF9BA0F80C1 (000000000000, 000000000000, 000000000000, 000000000000) cygwin1.dll+0x80C1
0007FFFFFFF0 7FF9BA0F5C86 (000000000000, 000000000000, 000000000000, 000000000000) cygwin1.dll+0x5C86
0007FFFFFFF0 7FF9BA0F5D34 (000000000000, 000000000000, 000000000000, 000000000000) cygwin1.dll+0x5D34
End of stack trace
Loaded modules:
000100400000 spine.exe
7FFA0CBC0000 ntdll.dll
7FFA0C230000 KERNEL32.DLL
7FFA09390000 KERNELBASE.dll
0003FE3B0000 cygmariadb-3.dll
7FF9BA0F0000 cygwin1.dll
0003FE460000 cygnetsnmp-35.dll
0003FF980000 cygcrypto-1.1.dll
0003FE7C0000 cygiconv-2.dll
7FFA0B920000 ADVAPI32.dll
7FFA0C420000 msvcrt.dll
0003FCE60000 cygssl-1.1.dll
0003FC800000 cygz.dll
7FFA0BB20000 sechost.dll
7FFA09E60000 RPCRT4.dll
7FFA08EE0000 bcrypt.dll
7FFA08530000 CRYPTBASE.DLL
7FFA08FC0000 bcryptPrimitives.dll
7FF9FD9F0000 netapi32.dll
7FFA08190000 LOGONCLI.DLL
7FFA090E0000 ucrtbase.dll
7FFA08180000 NETUTILS.DLL
7FFA0A090000 wldap32.dll
7FFA0BAA0000 WS2_32.DLL
7FFA08360000 mswsock.dll
7FFA08AA0000 SspiCli.dll
7FFA028A0000 DSPARSE.dll
7FFA08420000 kerberos.DLL
7FFA08BF0000 MSASN1.dll
7FFA08E40000 msvcp_win.dll
7FFA083D0000 cryptdll.dll
7FFA03D90000 wshqos.dll
7FFA03BB0000 wshtcpip.DLL
7FFA03B20000 wship6.dll
7FFA080B0000 DNSAPI.dll
7FFA0BB10000 NSI.dll
7FFA08070000 IPHLPAPI.DLL
7FFA02E00000 rasadhlp.dll
7FFA04AA0000 fwpuclnt.dll
7FFA04C20000 SAMCLI.DLL
7FF9FFA50000 SAMLIB.dll
7FFA0A150000 user32.dll
7FFA090A0000 win32u.dll
7FFA0A100000 GDI32.dll
7FFA091E0000 gdi32full.dll
7FFA0CB60000 IMM32.DLL
7FF9FEA20000 napinsp.dll
7FF9FEA90000 winrnr.dll
7FFA03BC0000 NLAapi.dll
7FF9FEAC0000 wshbth.dll
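The frames that carry a "module+0xOFFSET" suffix are the ones that can later be matched against a build with debug symbols. As a rough sketch, assuming the dump keeps that suffix format, the following Python snippet pulls those pairs out of the text; the function names are hypothetical, and the excerpt abbreviates the argument lists as "(...)".

```python
import re

# Matches the "module+0xOFFSET" suffix on resolved frames,
# e.g. "spine.exe+0x137B6". It works whether the dump is one long
# line or has one frame per line.
FRAME = re.compile(
    r"(?P<module>[\w.\-]+\.(?:exe|dll))\+0x(?P<offset>[0-9A-Fa-f]+)"
)


def resolved_frames(stackdump_text):
    """Return (module, offset) pairs in dump order (innermost frame first)."""
    return [
        (m.group("module"), int(m.group("offset"), 16))
        for m in FRAME.finditer(stackdump_text)
    ]


def frames_for(stackdump_text, module):
    """Return the hex offsets of all frames that belong to one module."""
    return [
        hex(offset)
        for name, offset in resolved_frames(stackdump_text)
        if name == module
    ]


# Abbreviated excerpt of the dump above; "(...)" stands for the frame arguments.
excerpt = (
    "0007FFF00E40 0001004137B6 (...) spine.exe+0x137B6 "
    "0007FFF00E40 00010041B780 (...) spine.exe+0x1B780 "
    "0007FFFFCD30 7FF9BA0F80C1 (...) cygwin1.dll+0x80C1"
)

print(frames_for(excerpt, "spine.exe"))  # ['0x137b6', '0x1b780']
```

The offsets are only useful against the exact spine.exe build that crashed, so they complement a debugger backtrace rather than replace it.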

TheWitness commented 2 months ago

Run spine using gdb and run a backtrace after it fails. How is fping entered in the settings? Use forward slashes, not backslashes.
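Both suggestions can be sketched in Python, under some assumptions: gdb is available in the same Cygwin environment as spine, the spine arguments are the ones used earlier in this thread, and the fping path shown is purely hypothetical. The options -batch, -ex, and --args are standard gdb options; `to_forward_slashes` and `build_gdb_command` are illustrative helper names.

```python
from pathlib import PureWindowsPath


def to_forward_slashes(path):
    """Rewrite a Windows path with forward slashes instead of backslashes."""
    return PureWindowsPath(path).as_posix()


def build_gdb_command(program_args):
    """Build a gdb command that runs the program and prints a backtrace
    once it stops (for example, when it crashes)."""
    return ["gdb", "-batch", "-ex", "run", "-ex", "bt", "--args", *program_args]


# Hypothetical fping location, used only to show the slash conversion.
print(to_forward_slashes(r"C:\tools\fping\fping.exe"))

# Arguments taken from the spine run shown earlier in this thread.
spine_args = ["./spine", "-R", "-V", "5", "-f", "36", "-l", "36"]
print(" ".join(build_gdb_command(spine_args)))
```

The same steps can be done interactively: start gdb with --args followed by the spine command line, type run, and type bt after the crash.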

TheWitness commented 2 months ago

Duplicate of spine issue.