tbarbette / fastclick

FastClick - A faster version of the Click Modular Router featuring batching, advanced multi-processing and improved Netmap and DPDK support (ANCS'15). Check the metron branch for Metron specificities (NSDI'18). PacketMill modifications (ASPLOS'21) as well as MiddleClick(ToN, 2021) are merged in main.

rte_mempool segmentation fault. #359

Closed ali64mohammad6464 closed 2 months ago

ali64mohammad6464 commented 2 years ago

Hi, I have a primary DPDK process that creates a mempool, and I want to use that mempool from FastClick running as a secondary process. The primary creates the mempool like this: rte_mempool_create("test_pool", 1000000, 50, 0, 0, NULL, NULL, NULL, NULL, rte_socket_id(), 0);

If I write a sample secondary process:

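    if ((retval = rte_eal_init(argc, argv)) < 0)
        return -1;
    argc -= retval;
    argv += retval;

    struct rte_mempool *testm = rte_mempool_lookup("test_pool");
    if (testm == NULL)  /* the lookup returns NULL if the pool is not found */
        return -1;
    /* both count functions return unsigned int, hence %u */
    printf("click %p use_count %u  count %u \n", (void *)testm, rte_mempool_in_use_count(testm), rte_mempool_avail_count(testm));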

it prints the mempool counts correctly.

Now I want to do the same in Click/FastClick. I added these lines to click.cc in userlevel (just as a test):

# endif //HAVE_DPDK
    {
        for (int t = 1; t < click_nthreads; ++t) {
            pthread_t p;
            pthread_create(&p, 0, thread_driver, click_master->thread(t));
            other_threads.push_back(p);
            do_set_affinity(p, t, args);
        }
        do_set_affinity(pthread_self(), 0, args);
    }
#endif
    struct rte_mempool *testm = rte_mempool_lookup("test_pool");
    printf("click %p use_count %u   \n", (void *)testm, rte_mempool_in_use_count(testm));

I added the last two lines. Output:


EAL: Detected 2 NUMA nodes
EAL: Detected static linkage of DPDK
[New Thread 0x7ffff7a60700 (LWP 71707)]
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket_71703_4cb04a0762802
[New Thread 0x7ffff725f700 (LWP 71708)]
EAL: Selected IOVA mode 'PA'
EAL: Probing VFIO support...
EAL: Probe PCI driver: net_ixgbe (8086:10fb) device: 0000:12:00.0 (socket 0)
EAL: Probe PCI driver: net_ixgbe (8086:10fb) device: 0000:12:00.1 (socket 0)
EAL: Probe PCI driver: net_e1000_igb (8086:1521) device: 0000:5d:00.2 (socket 0)
EAL: Probe PCI driver: net_e1000_igb (8086:1521) device: 0000:5d:00.3 (socket 0)
[New Thread 0x7ffff6a5e700 (LWP 71709)]
EAL: No legacy callbacks, legacy socket not created

Thread 1 "click" received signal SIGSEGV, Segmentation fault.
0x0000555555a72c8b in bucket_get_count ()
(gdb) bt
#0  0x0000555555a72c8b in bucket_get_count ()
#1  0x0000555555c15e51 in rte_mempool_avail_count ()
#2  0x0000555555c16439 in rte_mempool_in_use_count ()
#3  0x00005555557fc058 in main ()

I tried both Click and FastClick, and both show this error, while a sample DPDK secondary process works. When I use other DPDK facilities such as DPDK rings, they work fine, but the rte_mempool functions crash as shown above.

tbarbette commented 2 years ago

Mhh, I wonder if it's interfering with Click's packet pool. Try adding the test code at the end of https://github.com/tbarbette/fastclick/blob/d1233fdeb558788af9b45c03b50f6645618d7c50/lib/dpdkdevice.cc#L271

Click's DPDK code is a bit deprecated and less feature rich. If using DPDK, stay in FastClick :)

ali64mohammad6464 commented 2 years ago

I added the code to the end of that function; still a segmentation fault.

#0  0x0000555555a72c5b in bucket_get_count ()
#1  0x0000555555c15e21 in rte_mempool_avail_count ()
#2  0x0000555555c16409 in rte_mempool_in_use_count ()
#3  0x000055555576c3cd in DPDKDevice::alloc_pktmbufs(ErrorHandler*) ()
#4  0x000055555576c44b in DPDKDevice::static_initialize(ErrorHandler*) ()
#5  0x000055555576c4b4 in DPDKDevice::initialize(ErrorHandler*) [clone .cold.91] ()
#6  0x0000555555756ba5 in FromDPDKRing::initialize(ErrorHandler*) ()
#7  0x00005555559f8238 in Router::initialize(ErrorHandler*) ()
#8  0x000055555599c603 in parse_configuration(String const&, bool, bool, click_args_t&, ErrorHandler*) ()
#9  0x00005555557faf47 in main ()

tbarbette commented 2 years ago

Maybe this is due to different DPDK linking. Is your "test" secondary also statically linked? Is there another globally installed DPDK left over, or a dangling old $RTE_SDK?

ali64mohammad6464 commented 2 years ago

No, it is dynamically linked. Can I link Click/FastClick dynamically?

tbarbette commented 2 years ago

Configure FastClick with --enable-dynamic-linking

ali64mohammad6464 commented 2 years ago

Yes, now it works. Thanks a lot.
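
A possible explanation for why matching the linking mode helped, offered as an assumption rather than something confirmed in this thread: a DPDK mempool kept in shared memory records its driver as an index into a table of mempool ops, and each process fills its own copy of that table at startup. If the primary and the secondary register a different set of drivers, or register them in a different order (which can happen when one is linked statically and the other dynamically), the same index can resolve to a different driver in the secondary. That would be consistent with bucket_get_count appearing in the backtraces above. The sketch below is a toy model of that mechanism in plain C. It does not use DPDK, and every name in it (toy_ops, toy_registry, toy_pool, toy_register, toy_find, toy_resolve, toy_secondary_view) is hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_OPS 8

/* Toy stand-in for one mempool driver: only its name matters here. */
struct toy_ops {
    const char *name;
};

/* Toy per-process driver table; each process fills its own copy at startup. */
struct toy_registry {
    struct toy_ops ops[TOY_MAX_OPS];
    int num_ops;
};

/* Toy stand-in for the pool header kept in shared memory. */
struct toy_pool {
    int ops_index;  /* index into the creating process's driver table */
};

/* Append a driver and return its index, or -1 if the table is full. */
static int toy_register(struct toy_registry *r, const char *name)
{
    if (r->num_ops >= TOY_MAX_OPS)
        return -1;
    r->ops[r->num_ops].name = name;
    return r->num_ops++;
}

/* Return the index of a driver by name, or -1 if it is not registered. */
static int toy_find(const struct toy_registry *r, const char *name)
{
    for (int i = 0; i < r->num_ops; i++)
        if (strcmp(r->ops[i].name, name) == 0)
            return i;
    return -1;
}

/* Resolve the pool's stored index against one process's own table. */
static const char *toy_resolve(const struct toy_registry *r,
                               const struct toy_pool *p)
{
    if (p->ops_index < 0 || p->ops_index >= r->num_ops)
        return NULL;
    return r->ops[p->ops_index].name;
}

/*
 * Build two tables with different registration orders and return the
 * driver name the "secondary" sees for a pool that the "primary"
 * created with the "ring" driver.
 */
const char *toy_secondary_view(void)
{
    struct toy_registry primary = { .num_ops = 0 };
    struct toy_registry secondary = { .num_ops = 0 };
    struct toy_pool pool;

    /* Primary: "ring" is registered first, so it gets index 0. */
    toy_register(&primary, "ring");
    toy_register(&primary, "bucket");

    /* Secondary: same drivers, opposite order. */
    toy_register(&secondary, "bucket");
    toy_register(&secondary, "ring");

    /* The pool stores the index as seen by the primary. */
    pool.ops_index = toy_find(&primary, "ring");

    /* The secondary resolves that index against its own table. */
    return toy_resolve(&secondary, &pool);
}
```

In this model, toy_secondary_view() returns "bucket" even though the pool was created with "ring": the stored index is only meaningful if both processes build identical tables, which is what linking both against the same DPDK in the same way is meant to ensure.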

ali64mohammad6464 commented 2 years ago

Hi, this problem is fixed in FastClick, but I need to fix it in Click too. I updated dpdk.mk in Click as below:

ifeq ($(shell [ -n "$(RTE_VER_YEAR)" ] && ( [ "$(RTE_VER_YEAR)" -ge 20 ] && [ "$(RTE_VER_MONTH)" -ge 11 ] ) && echo true),true)
$(info $$RTE_SDK_BIN is [${RTE_SDK_BIN}])
$(info $$PKG_CONFIG_PATH is [${PKG_CONFIG_PATH}])
$(info $$RTE_VER_YEAR is [${RTE_VER_YEAR}])
$(info $$RTE_VER_MONTH is [${RTE_VER_MONTH}])
PKG_CONFIG_PATH = $(RTE_SDK_BIN)/meson-uninstalled
$(info $$PKG_CONFIG_PATH is [${PKG_CONFIG_PATH}])

PKGCONF ?= pkg-config
CXXFLAGS += -O3 $(shell export PKG_CONFIG_PATH='$(PKG_CONFIG_PATH)'; $(PKGCONF) --cflags libdpdk) 
LDFLAGS += $(shell export PKG_CONFIG_PATH='$(PKG_CONFIG_PATH)'; $(PKGCONF) --libs --static libdpdk)

else

ifeq ($(RTE_SDK),)
$(error "Please define RTE_SDK environment variable")
endif
...

With these lines added to dpdk.mk, Click links against DPDK statically and builds without problems. Now I want to link dynamically, so I removed --static from LDFLAGS, but then the rte libraries are not linked (I get undefined references, and ldd shows that no rte libs are linked). I figured out that the output of the $(shell export PKG_CONFIG_PATH='$(PKG_CONFIG_PATH)'; $(PKGCONF) --libs libdpdk) command contains -Wl,--as-needed, and this flag prevents the dynamic link from working: if I copy the output of the command and remove -Wl,--as-needed, linking succeeds. Can you tell me a way to fix this in Click too? Thanks.

tbarbette commented 2 months ago

All the dpdk.mk stuff is now deprecated, so closing this.