sdnfv / openNetVM

A high performance container-based NFV platform from GW and UCR.
http://sdnfv.github.io/onvm/

onvm_nflib_init() segmentation fault #163

Closed archit-p closed 4 years ago

archit-p commented 4 years ago

Hello team,

I'm working on porting onvm-snort to v19.07 of openNetVM. After updating the daq_netvm.c implementation to match the current API, daq-2.0.6 compiles fine but runs into errors at runtime.

Link to repo

Weirdly, snort returns a segmentation fault when run. The backtrace when run under gdb is:

#0  0x0000000000000000 in ?? ()
#1  0x00005555556a4215 in onvm_nflib_init_nf_init_cfg ()
#2  0x00005555556a462e in onvm_nflib_init ()
#3  0x000055555569ea51 in netvm_daq_initialize (config=0x7fffffffe3f0, ctxt_ptr=0x5555560eafb0 <daq_hand>, errbuf=0x7fffffffe2c0 "", errlen=256)
    at daq_netvm.c:220
#4  0x00005555555bd764 in DAQ_Config (cfg=0x7fffffffe3f0) at sfdaq.c:515
#5  0x00005555555bd89b in DAQ_New (sc=0x55555695a960, intf=0x55555704ef50 "dpdk0") at sfdaq.c:553
#6  0x0000555555595a4e in SnortMain (argc=12, argv=0x7fffffffe578) at snort.c:875
#7  0x00005555555959a4 in main (argc=12, argv=0x7fffffffe578) at snort.c:836

I have been unable to pin down the cause, since onvm_nflib_init_nf_init_cfg() takes only the NF_TAG as input.

I've tried to run the scaling example, provided with openNetVM, which runs as expected.

Any tips or insights would be greatly appreciated!

Edit: Added link to repo

koolzz commented 4 years ago

Could you possibly enable debugging flags and disable compiler optimization so that we can check what specifically segfaults?

We should have instructions in the readme here https://github.com/sdnfv/openNetVM/blob/master/docs/Debug.md

koolzz commented 4 years ago

Possibly the NF tag isn't passed in properly, so it segfaults.

archit-p commented 4 years ago

Thanks for the suggestion! I recompiled onvm and dpdk with debug flags on, and the backtrace is more detailed now:

(gdb) backtrace
#0  0x0000000000000000 in ?? ()
#1  0x00005555556a1abc in rte_mempool_ops_dequeue_bulk (mp=0x100334300, obj_table=0x7fffffffdee0, n=1)
    at /home/student/sources/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:657
#2  0x00005555556a9ed9 in __mempool_generic_get (cache=0x0, n=1, obj_table=0x7fffffffdee0, mp=0x100334300)
    at /home/student/sources/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1391
#3  rte_mempool_generic_get (cache=0x0, n=1, obj_table=0x7fffffffdee0, mp=0x100334300)
    at /home/student/sources/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1426
#4  rte_mempool_get_bulk (n=1, obj_table=0x7fffffffdee0, mp=0x100334300) at /home/student/sources/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1459
#5  rte_mempool_get (obj_p=0x7fffffffdee0, mp=0x100334300) at /home/student/sources/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1485
#6  onvm_nflib_init_nf_init_cfg (tag=0x555555752aa0 "snort") at /home/student/sources/openNetVM/onvm/onvm_nflib/onvm_nflib.c:756
#7  0x00005555556a373d in onvm_nflib_init (argc=6, argv=0x7fffffffe040, nf_tag=0x555555752aa0 "snort", nf_local_ctx=0x555556bbaec0, nf_function_table=0x0)
    at /home/student/sources/openNetVM/onvm/onvm_nflib/onvm_nflib.c:348
#8  0x000055555569e4ed in netvm_daq_initialize (config=0x7fffffffe3c0, ctxt_ptr=0x5555561010b0 <daq_hand>, errbuf=0x7fffffffe290 "", errlen=256) at daq_netvm.c:232
#9  0x00005555555bd1f4 in DAQ_Config (cfg=0x7fffffffe3c0) at sfdaq.c:515
#10 0x00005555555bd32b in DAQ_New (sc=0x555556971960, intf=0x555557063e40 "dpdk0") at sfdaq.c:553
#11 0x00005555555954de in SnortMain (argc=12, argv=0x7fffffffe548) at snort.c:875
#12 0x0000555555595434 in main (argc=12, argv=0x7fffffffe548) at snort.c:836

I'm passing NF_TAG the same way the other example NFs do (by defining it at the top). Code

Could it be an issue that the rte_mempool is not initialized? I suspect so because the value of rte_mempool_ops_table.num_ops is 0. If this is the case, how could I go about initializing the mempool?

Edit: Could it be an issue with DPDK libraries not getting linked correctly?

koolzz commented 4 years ago

Thanks for enabling those. This is indeed a strange case. The code looks fine; we check creation success here.

I'm wondering if it's, as you said, a DPDK library related issue. Are you using the same DPDK version? We do have a -a 0x7f000000000 flag for onvm_mgr; could you try using that? Additionally, try using the DPDK script for setting hugepage NUMA memory (try to give it more memory).

archit-p commented 4 years ago

Thanks again for the input! I manually set up DPDK v18.11, as described in the installation guide, so that is unlikely to be the issue.

I tried using both -a 0x7f000000000 and -v 0x7f000000000 when running onvm_mgr, as described in the debug guide, but neither changed the error.

It continues to perplex me that daq_netvm cannot access the mempool allocated by onvm_mgr. I'm fairly convinced this is the issue since, as mentioned before, rte_mempool_ops_table.num_ops is 0 when accessed from the code. On the other hand, I added a check for rte_mempool_ops_table.num_ops to the scaling_example, and its value was non-zero there.
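For anyone reading along, here is a rough sketch of why an empty ops table would show up as a jump to 0x0 in frame #0. It assumes the dequeue call dispatches through a function pointer stored in a global table, which is how I understand rte_mempool_ops_dequeue_bulk to work. Every name below (demo_ops_table, demo_register, demo_dequeue_checked) is hypothetical and simplified; none of these are DPDK types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for illustration; these are NOT DPDK types. */
typedef int (*demo_dequeue_fn)(void **obj_table, unsigned int n);

struct demo_ops {
    const char *name;
    demo_dequeue_fn dequeue;
};

#define DEMO_MAX_OPS 4

struct demo_ops_table {
    unsigned int num_ops;               /* stays 0 if nothing ever registers */
    struct demo_ops ops[DEMO_MAX_OPS];  /* zero-filled: every callback is NULL */
};

/* A global with static storage is zero-initialised, like an empty handler table. */
struct demo_ops_table demo_table;

/* Toy handler: hands out the same dummy object n times. */
int demo_ring_dequeue(void **obj_table, unsigned int n)
{
    static int dummy_object;
    for (unsigned int i = 0; i < n; i++)
        obj_table[i] = &dummy_object;
    return 0;
}

/* Append a handler; returns its index, or -1 if the table is full. */
int demo_register(struct demo_ops_table *t, const char *name, demo_dequeue_fn fn)
{
    assert(t != NULL && fn != NULL);
    if (t->num_ops >= DEMO_MAX_OPS)
        return -1;
    t->ops[t->num_ops].name = name;
    t->ops[t->num_ops].dequeue = fn;
    return (int)t->num_ops++;
}

/* Checked dispatch: returns -1 for an empty slot rather than calling address 0. */
int demo_dequeue_checked(const struct demo_ops_table *t, unsigned int idx,
                         void **obj_table, unsigned int n)
{
    assert(t != NULL);
    if (idx >= t->num_ops || t->ops[idx].dequeue == NULL)
        return -1;
    return t->ops[idx].dequeue(obj_table, n);
}
```

Without the check in demo_dequeue_checked, calling through an unregistered slot would call a NULL pointer, which would match a backtrace whose top frame is at 0x0000000000000000.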

koolzz commented 4 years ago

Hmm, is the daq compiled with the same version of dpdk as the onvm mgr? I'll ask in our dev slack channel to see if anyone has seen this error.

archit-p commented 4 years ago

Thanks a lot for passing the question along.

I managed to solve the issue after some digging online. It turns out it was in fact an issue with the DPDK libraries being linked incorrectly. DAQ was linked against only libdpdk.a and not the other libraries under ${RTE_SDK}/${RTE_TARGET}/lib/. This allowed onvm-snort to build, but it would segfault whenever functions such as rte_mempool_get were called.

After I updated DPDK_LDFLAGS in configure.ac to include all the libraries and set Snort to build only static libraries, it ran fine.

Any suggestions on how to test the code? I'll create a pull request on https://github.com/sdnfv/onvm-snort once I'm confident it works as expected.
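For anyone who hits the same symptom, here is a minimal sketch of the mechanism I believe is involved. DPDK registers its mempool handlers from constructor functions that run before main(). If the object file holding such a constructor never makes it into the final binary, the handler table stays empty even though the build succeeds. That this is exactly what happened here is an assumption. The registry and all names below are hypothetical, and `__attribute__((constructor))` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry for illustration; these names are not DPDK's. */
struct demo_handler {
    const char *name;
};

#define DEMO_MAX_HANDLERS 8

struct demo_registry {
    unsigned int count;
    const struct demo_handler *items[DEMO_MAX_HANDLERS];
};

/* Zero-initialised global: count is 0 until a constructor adds something. */
struct demo_registry demo_registry;

/* Add a handler; returns 0 on success, -1 if the registry is full. */
int demo_registry_add(const struct demo_handler *h)
{
    assert(h != NULL);
    if (demo_registry.count >= DEMO_MAX_HANDLERS)
        return -1;
    demo_registry.items[demo_registry.count++] = h;
    return 0;
}

/* Look a handler up by name; returns NULL if it was never registered. */
const struct demo_handler *demo_registry_find(const char *name)
{
    assert(name != NULL);
    for (unsigned int i = 0; i < demo_registry.count; i++)
        if (strcmp(demo_registry.items[i]->name, name) == 0)
            return demo_registry.items[i];
    return NULL;
}

static const struct demo_handler demo_ring_handler = { "ring_mp_mc" };

/*
 * Runs before main() when this object file is part of the final binary.
 * If the object sits in a static archive and nothing references its
 * symbols, the linker may leave it out, and this function never runs.
 */
__attribute__((constructor))
static void demo_register_ring_handler(void)
{
    demo_registry_add(&demo_ring_handler);
}
```

In a single-file build like this one the constructor always runs, so demo_registry.count is 1 by the time main() starts. The failure only appears when the registering object is dropped at link time, which is why linker flags can change runtime behaviour without any source change.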

@koolzz I really appreciate the help you offered!

koolzz commented 4 years ago

Great to hear @archit-p!

Regarding testing, @twood02 might have better suggestions, but I think sending a special PCAP file (does Snort provide any?) should work to show that Snort is working.

twood02 commented 4 years ago

@archit-p - glad you fixed the issue. Testing with a pcap would be easiest if you can easily generate one that will trigger your snort rules. Otherwise there are some tools you can try like https://github.com/SavSanta/idswakeup/blob/master/README which are supposed to help generate traffic to meet a specific snort ruleset. We haven't used that, but it seems the best way to validate it is working.
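If no suitable capture is at hand, a pcap file can also be written by hand, since the classic format is just a 24-byte file header followed by one 16-byte record header per packet. The sketch below writes a single-packet file. The function name demo_write_pcap and the struct names are made up for illustration, and the frame bytes are whatever the caller supplies; crafting a frame that actually matches a given Snort rule is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Classic pcap file header (24 bytes). Struct names are illustrative. */
struct demo_pcap_file_header {
    uint32_t magic;          /* 0xa1b2c3d4, written in host byte order */
    uint16_t version_major;  /* 2 */
    uint16_t version_minor;  /* 4 */
    int32_t  thiszone;       /* GMT offset, normally 0 */
    uint32_t sigfigs;        /* timestamp accuracy, normally 0 */
    uint32_t snaplen;        /* max bytes captured per packet */
    uint32_t linktype;       /* 1 = Ethernet */
};

/* Per-packet record header (16 bytes). */
struct demo_pcap_record_header {
    uint32_t ts_sec;
    uint32_t ts_usec;
    uint32_t incl_len;       /* bytes stored in the file */
    uint32_t orig_len;       /* bytes on the wire */
};

/* Write one frame to a new pcap file; returns 0 on success, -1 on error. */
int demo_write_pcap(const char *path, const unsigned char *frame, uint32_t len)
{
    struct demo_pcap_file_header fh = { 0xa1b2c3d4u, 2, 4, 0, 0, 65535, 1 };
    struct demo_pcap_record_header rh = { 0, 0, len, len };
    FILE *f;
    int ok;

    /* The layout relies on these structs having no padding. */
    assert(sizeof fh == 24 && sizeof rh == 16);

    if (path == NULL || frame == NULL || len == 0)
        return -1;
    f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    ok = fwrite(&fh, sizeof fh, 1, f) == 1 &&
         fwrite(&rh, sizeof rh, 1, f) == 1 &&
         fwrite(frame, 1, len, f) == len;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Readers detect the file's byte order from the magic number, so writing the headers in host order is fine.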

Let us know if you hit any other issues.

archit-p commented 4 years ago

Hello @twood02, I performed some basic tests to validate that Snort works.

I have created a pull request on the onvm-snort repository (https://github.com/sdnfv/onvm-snort/pull/8). Please let me know if any changes are required.