phaag / nfsen

Legacy NfSen code

Segmentation Fault when starting NfSen collectors #2

Closed mkoelewijn closed 1 year ago

mkoelewijn commented 1 year ago

I installed the latest nfsen and nfdump from git on Debian 11, and starting the collectors results in a segmentation fault:

Starting nfcapd:(charly): collector did not start - see logfile

nfcapd[31021]: segfault at 0 ip 00007fdf3a6c5518 sp 00007ffd45b8b1f0 error 4 in libc-2.31.so[7fdf3a6aa000+15a000]
Dec 13 11:13:06 netflow kernel: [60445.744457] Code: 41 54 45 31 e4 55 53 48 83 ec 28 48 89 74 24 08 85 c9 0f 85 c2 02 00 00 83 ff 01 0f 84 81 01 00 00 83 ff 24 0f 87 78 01 00 00 <49> 0f be 55 00 49 8b 48 68 4c 89 eb 48 89 d0 f6 44 51 01 20 74 15

I also tried the libc from the unstable version of Debian, but hit the same problem:

2022-12-13T11:38:43.964787+01:00 netflow kernel: [ 255.874090] nfcapd[1316]: segfault at 0 ip 00007f5593efab17 sp 00007ffcf3c89fa0 error 4 in libc.so.6[7f5593ed8000+155000]
2022-12-13T11:38:43.964791+01:00 netflow kernel: [ 255.874097] Code: 89 d7 48 89 74 24 10 85 c9 0f 85 b4 02 00 00 83 ff 01 0f 84 9b 00 00 00 83 ff 24 0f 87 92 00 00 00 48 8b 5c 24 08 49 8b 48 68 <48> 0f be 13 48 89 d0 f6 44 51 01 20 74 16 0f 1f 00 48 0f be 53 01

Earlier versions of nfdump and nfsen worked fine on the current server, which has an AMD EPYC 7443P 24-Core Processor.

tim427 commented 1 year ago

In addition, the versions used: Debian Bullseye with glibc (2.31-13+deb11u4) and Debian Testing with glibc (2.36-6).

phaag commented 1 year ago

I have all nfdump tools running on Debian Bullseye without any problem. Are you sure you have installed everything correctly? If so, please try to generate a core dump and send it to me.

mkoelewijn commented 1 year ago

The same install on another machine seems to be working fine. See the attached core dump.

phaag commented 1 year ago

What version of nfdump tools are you using?

phaag commented 1 year ago

The core dump does not seem to be usable. If it works on other machines, could it be a library problem? Have you tried to run gdb manually? If it crashes, run bt.

mkoelewijn commented 1 year ago

I think I found what is causing the segmentation fault. I am using the following source config:

'test' => { 'port' => '29002', 'col' => '#F08800', 'type' => 'netflow', 'optarg' => '-s -100' },

When I remove the " 'optarg' => '-s -100' " option, it works fine. This worked with older versions of nfsen. Do I need to do something else to define the sampling rate?

mkoelewijn commented 1 year ago

Nothing wrong with using that in the nfsen conf. The sampling (-s) option is causing the crash.

nfcapd -p 29004 -u netflow -g www-data -B 200000 -S 1 -w /data/nfsen/profiles-data/ -s -100

phaag commented 1 year ago

So can you do a debug run with gdb?

gdb /usr/local/bin/nfcapd
gdb> run -p 29999 -u netflow -g www-data -B 200000 -S 1 -w /tmp -s -100

When it crashes:

gdb> bt

and send me the output.

mkoelewijn commented 1 year ago

(gdb) run -p 29999 -u netflow -g www-data -B 200000 -S 1 -w /tmp -s -100
Starting program: /usr/local/bin/nfcapd -p 29999 -u netflow -g www-data -B 200000 -S 1 -w /tmp -s -100
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7d74b17 in __GI_____strtol_l_internal (nptr=0x0, endptr=endptr@entry=0x0, base=base@entry=10, group=group@entry=0, loc=0x7ffff7eff560 <_nl_global_locale>) at ../stdlib/strtol_l.c:291
291 ../stdlib/strtol_l.c: No such file or directory.

(gdb) bt

#0 0x00007ffff7d74b17 in __GI_____strtol_l_internal (nptr=0x0, endptr=endptr@entry=0x0, base=base@entry=10, group=group@entry=0, loc=0x7ffff7eff560 <_nl_global_locale>) at ../stdlib/strtol_l.c:291

#1 0x00007ffff7d74ac2 in __strtol (nptr=<optimized out>, endptr=endptr@entry=0x0, base=base@entry=10) at ../stdlib/strtol.c:106

#2 0x0000555555558b94 in main (argc=15, argv=0x7fffffffe4a8) at nfcapd.c:756

phaag commented 1 year ago

In nfcapd.c you will find the line:

while ((c = getopt(argc, argv, "46B:b:C:DeEf:g:hI:i:jJ:l:m:M:n:p:P:rRs::S:t:T:u:vVw:x:yzZ")) != EOF) {

In this option string you see s::S. Remove one ':' so it reads s:S.

Recompile nfcapd and test.

phaag commented 1 year ago

... or check out the latest master repo.

mkoelewijn commented 1 year ago

Thank you, it works now!