phaag / nfsen

Legacy NfSen code

nfsen Run nfdump failed: Exit: 1, Signal: 0, Coredump: 0 #15

Closed atbohmer closed 1 year ago

atbohmer commented 1 year ago

Hello,

For about a week and a half, nfsen 1.3.9 has stopped generating data. A snippet from the messages file:

Jun 5 15:10:16 nfsen[9373]: Update profile live in group .
Jun 5 15:10:16 nfsen[9373]: Run nfdump failed: Exit: 1, Signal: 0, Coredump: 0

nfdump and nfsen are up to date via git. Running on RHEL 7.

Please advise how to debug or, better, solve this ;-)

Thanks, Andre

atbohmer commented 1 year ago

More info:
Jun 5 15:25:16 kernel: nfdump[22725]: segfault at 2d0 ip 00007f9799b4cf81 sp 00007ffc3051ff50 error 4 in libpthread-2.17.so[7f9799b44000+17000]
Jun 5 15:25:16 nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 kernel: nfdump[22728]: segfault at 2d0 ip 00007fe51e0d0f81 sp 00007fff0789a5c0 error 4 in libpthread-2.17.so[7fe51e0c8000+17000]
Jun 5 15:25:16 nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 kernel: nfdump[22731]: segfault at 2d0 ip 00007ff74a38ff81 sp 00007ffda1cdc900 error 4 in libpthread-2.17.so[7ff74a387000+17000]
Jun 5 15:25:16 nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 kernel: nfdump[22734]: segfault at 2d0 ip 00007f4bca09ef81 sp 00007fff8408a6e0 error 4 in libpthread-2.17.so[7f4bca096000+17000]
Jun 5 15:25:16 server kernel: nfdump[22737]: segfault at 2d0 ip 00007f92bd9c5f81 sp 00007ffed68bca60 error 4 in libpthread-2.17.so[7f92bd9bd000+17000]
Jun 5 15:25:16 server nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 server nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 server kernel: nfdump[22740]: segfault at 2d0 ip 00007f03a9d87f81 sp 00007ffc4e35ee80 error 4 in libpthread-2.17.so[7f03a9d7f000+17000]
Jun 5 15:25:16 server nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 server nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 server nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 server nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 server kernel: nfdump[22743]: segfault at 2d0 ip 00007f3a0610af81 sp 00007ffc0417ae90 error 4 in libpthread-2.17.so[7f3a06102000+17000]
Jun 5 15:25:16 server nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 server nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 server kernel: nfdump[22746]: segfault at 2d0 ip 00007f8f2d88ff81 sp 00007fffd4d1e690 error 4 in libpthread-2.17.so[7f8f2d887000+17000]
Jun 5 15:25:16 server nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 server nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 server kernel: nfdump[22749]: segfault at 2d0 ip 00007fcb99c30f81 sp 00007ffc89e3d4a0 error 4 in libpthread-2.17.so[7fcb99c28000+17000]
Jun 5 15:25:16 server nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 server nfsen[20774]: Run nfdump failed: Exit: 0, Signal: 11, Coredump: 0
Jun 5 15:25:16 server kernel: nfdump[22752]: segfault at 2d0 ip 00007f4d39c07f81 sp 00007ffdb9bf2c00 error 4 in libpthread-2.17.so[7f4d39bff000+17000]
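
The kernel segfault lines above can be decoded by hand: the instruction pointer minus the mapping base gives the offset of the faulting instruction inside libpthread, and on x86 the low bits of the error code say whether the access was a read or write, from user or kernel mode, and whether the page was present. The following is a minimal C sketch of that arithmetic. The names `segv_info_t`, `parse_segv`, and `segv_offset` are hypothetical helpers invented for illustration, not part of nfdump or nfsen, and the field layout is assumed from the lines quoted in this thread.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields of one kernel segfault line. */
typedef struct {
    uint64_t fault_addr;   /* address whose access faulted ("segfault at") */
    uint64_t ip;           /* instruction pointer at the time of the fault */
    uint64_t map_base;     /* start of the mapping named after "in" */
    unsigned error_code;   /* x86 page-fault error code */
} segv_info_t;

/* Parse "... segfault at A ip I sp S error E in name[B+L]".
 * Returns 1 on success, 0 if the line does not match. */
static int parse_segv(const char *line, segv_info_t *out) {
    unsigned long long addr, ip, sp, base, len;
    unsigned err;
    char lib[128];
    const char *p = strstr(line, "segfault at ");
    if (p == NULL)
        return 0;
    if (sscanf(p, "segfault at %llx ip %llx sp %llx error %x in %127[^[][%llx+%llx]",
               &addr, &ip, &sp, &err, lib, &base, &len) != 7)
        return 0;
    out->fault_addr = addr;
    out->ip = ip;
    out->map_base = base;
    out->error_code = err;
    return 1;
}

/* Offset of the faulting instruction inside the mapped library. */
static uint64_t segv_offset(const segv_info_t *info) {
    return info->ip - info->map_base;
}

/* x86 page-fault error code bits. */
static int segv_page_present(const segv_info_t *i) { return (i->error_code & 1u) != 0; }
static int segv_is_write(const segv_info_t *i)     { return (i->error_code & 2u) != 0; }
static int segv_is_user(const segv_info_t *i)      { return (i->error_code & 4u) != 0; }
```

Applied to the first line of the log, this gives an offset of 0x8f81 into libpthread-2.17.so, and the same offset results for every crash in the list, so each run dies at the same instruction. Error code 4 decodes to a user-mode read of a page that is not mapped. A fault address as small as 0x2d0 often indicates a field read through a null or zeroed pointer, although the log alone does not prove that.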

atbohmer commented 1 year ago

With gdb:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff79adf81 in pthread_join () from /lib64/libpthread.so.0

Sorry, missing some debug packages at the moment: Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 glibc-2.17-326.el7_9.x86_64 libgcc-4.8.5-44.el7.x86_64

phaag commented 1 year ago

There is nothing you can do wrong with pthread_join(), so I have no idea. Does nfdump run on the command line? Try to recompile nfdump. Maybe there is a library issue.

atbohmer commented 1 year ago

The command line also fails, not immediately but during the run:

$ gdb --args nfdump -M /nflow/nfsen/profiles-data/live/router -T -R 2023/03/23/nfcapd.202303230835:2023/03/23/nfcapd.202303231030 -n 10 "IP 10.x.x.x"
[Thread 0x7ffff73c6700 (LWP 19714) exited]
[New Thread 0x7ffff69af700 (LWP 19716)]
2023-03-23 08:36:11.950 00:00:00.450 TCP 10.:49674 -> 10.x:80 9 1886 1
2023-03-23 08:36:15.538 00:00:02.000 TCP 10.:49738 -> 10.x:443 99 28173 1
2023-03-23 08:36:15.730 00:00:00.050 TCP 10.:49739 -> 10.x:389 8 2682 1
2023-03-23 08:36:15.730 00:00:00.050 TCP 10.:49740 -> 10.x88 6 2468 1
...... data ....
[Thread 0x7ffff69af700 (LWP 19716) exited]
2023-03-23 08:39:45.842 00:00:04.150 TCP 10.x:50366 -> 10.x:514 15 5062 1
2023-03-23 08:39:50.026 00:00:04.050 TCP 10.x:50368 -> 10.x:514 15 4926 1

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff79adf81 in pthread_join () from /lib64/libpthread.so.0
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 glibc-2.17-326.el7_9.x86_64 libgcc-4.8.5-44.el7.x86_64

atbohmer commented 1 year ago

$ rpm -qfi /lib64/libpthread.so.0
Name : glibc
Version : 2.17
Release : 326.el7_9
Architecture: x86_64
Install Date: Thu 09 Jun 2022 02:38:23 PM CEST

-rwxr-xr-x. 1 root root 142144 Mar 22 2022 /lib64/libpthread-2.17.so

atbohmer commented 1 year ago

So libpthread did not change recently. I will try to dig further.

atbohmer commented 1 year ago

For the record:
....
Date first seen Duration Proto Src IP Addr:Port Dst IP Addr:Port Packets Bytes Flows
[New Thread 0x7ffff69af700 (LWP 5313)]
.. data ..
[Thread 0x7ffff69af700 (LWP 5313) exited]

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff79adf81 in pthread_join () from /lib64/libpthread.so.0
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 glibc-2.17-326.el7_9.x86_64 libgcc-4.8.5-44.el7.x86_64
(gdb) bt
#0 0x00007ffff79adf81 in pthread_join () from /lib64/libpthread.so.0
#1 0x00007ffff7fca73a in SignalTerminate (nffile=0x477a50) at nffile.c:1481
#2 0x00007ffff7fc9910 in CloseFile (nffile=0x477a50) at nffile.c:1090
#3 0x00007ffff7fc9d34 in GetNextFile (nffile=0x477a50) at nffile.c:1177
#4 0x0000000000408b84 in process_data (wfile=0x0, element_stat=0, flow_stat=0, sort_flows=0, print_record=0x420d1f , timeWindow=0x0, limitRecords=0, outputParams=0x477010, compress=0) at nfdump.c:338
#5 0x000000000040ab80 in main (argc=9, argv=0x7fffffffe3c8) at nfdump.c:1093

phaag commented 1 year ago

Fixed! Although it seems to be a RH pthread library bug, it's good programming style to check the thread ID first.
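
The idea of checking the thread ID before joining can be sketched in C as follows. This is not the actual change made to nffile.c; `demo_file_t`, `demo_start`, and `demo_close` are hypothetical names used only to illustrate the guard. Because `pthread_t` has no portable "no thread" value, the sketch assumes an explicit flag next to the handle instead of comparing the handle against zero.

```c
#include <pthread.h>

/* Hypothetical stand-in for a file handle that may own a worker thread. */
typedef struct {
    pthread_t worker;   /* meaningful only while has_worker is nonzero */
    int has_worker;     /* set when a worker was started and not yet joined */
    int result;         /* written by the worker, read after the join */
} demo_file_t;

/* Worker body: does a trivial piece of work on the handle. */
static void *demo_worker(void *arg) {
    demo_file_t *f = arg;
    f->result = 42;
    return NULL;
}

/* Start the worker and remember that there is something to join.
 * Returns 0 on success or the pthread_create error number. */
static int demo_start(demo_file_t *f) {
    int err = pthread_create(&f->worker, NULL, demo_worker, f);
    if (err == 0)
        f->has_worker = 1;
    return err;
}

/* Join the worker only if one is outstanding, so that pthread_join is
 * never called with an unset or already joined thread ID.
 * Returns 0 when there was nothing to do or the join succeeded. */
static int demo_close(demo_file_t *f) {
    if (!f->has_worker)
        return 0;
    int err = pthread_join(f->worker, NULL);
    f->has_worker = 0;
    return err;
}
```

With this guard, calling `demo_close` on a zero-initialized handle, or calling it a second time after the worker has been joined, returns immediately instead of passing a stale thread ID to `pthread_join`, whose behavior is undefined in that case.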