ansyun / dpdk-ans

ANS(Accelerated Network Stack) on DPDK, DPDK native TCP/IP stack.
https://ansyun.com
BSD 3-Clause "New" or "Revised" License

Unable to run ANS-startup example #96

Closed mhabhishek closed 5 years ago

mhabhishek commented 5 years ago

Hi, I am trying to run ANS on SLES 12 SP4 with dpdk-stable-18.11.1-rc1. When I run the ANS startup example with the command ./ans -c 0x4 -n 1 -- -p 0x1 --config="(0,0,2)" I get the following output:

EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Probing VFIO support...
EAL: PCI device 0000:03:00.0 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 15ad:7b0 net_vmxnet3
EAL: PCI device 0000:0b:00.0 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 15ad:7b0 net_vmxnet3

Start to Init port
port 0:
port name net_vmxnet3:
max_rx_queues 16: max_tx_queues:8
rx_offload_capa 0x281f: tx_offload_capa:0x802f
Creating queues: rx queue number=1 tx queue number=1...
MAC Address: xx:xx:xx:xx:19:59
Deault-- tx pthresh:0, tx hthresh:0, tx wthresh:0, tx offloads:0x0
lcore id:2, tx queue id:0, socket id:0
Conf-- tx pthresh:0, tx hthresh:0, tx wthresh:0, tx offloads:0x2e

Allocated mbuf pool on socket 0, mbuf number: 16384

Initializing rx queues on lcore 2 ...
Default-- rx pthresh:0, rx hthresh:0, rx wthresh:0, rx offloads:0x0
Conf-- rx pthresh:0, rx hthresh:0, rx wthresh:0, rx offloads:0xf
port id:0, rx queue id: 0, socket id:0

core mask: 4, sockets number:1, lcore number:1
start to init ans
USER8: LCORE[2] lcore mask 0x4
USER8: LCORE[2] lcore id 2 is enable
USER8: LCORE[2] lcore number 1
USER8: LCORE[2] lcore(2)'s sockets(0) hasn't mbuf pool
EAL: Error - exiting with code: 1
Cause: Init ans failed

How do I overcome this error?
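For readers following the command line above: -c is the EAL core mask and -p is the application's port mask, both hexadecimal bitmasks, and the log lines "lcore id 2 is enable" and "lcoreid=2 portid=0 rxqueueid=0" suggest that --config takes (port,queue,lcore) tuples. The sketch below only illustrates that decoding. The helper names mask_to_ids and parse_config_tuple are hypothetical and are not part of ANS or DPDK.

```c
#include <stdint.h>
#include <stdio.h>

/* Collect the indexes of the set bits in a hex mask such as -c 0x4.
 * Returns how many ids were written to `ids` (at most `max_ids`). */
int mask_to_ids(uint64_t mask, int *ids, int max_ids)
{
    int count = 0;
    for (int bit = 0; bit < 64 && count < max_ids; bit++) {
        if (mask & (UINT64_C(1) << bit))
            ids[count++] = bit;
    }
    return count;
}

/* Read the three integers of one "(port,queue,lcore)" tuple.
 * Returns 1 when all three were read, 0 otherwise. */
int parse_config_tuple(const char *s, int *port, int *queue, int *lcore)
{
    return sscanf(s, "(%d,%d,%d)", port, queue, lcore) == 3;
}
```

With these helpers, mask 0x4 yields the single id 2 and mask 0x1 yields the single id 0, which lines up with the tuple (0,0,2): port 0, rx queue 0, lcore 2.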

bluenet13 commented 5 years ago

Did you change the ans_main.c code? Print the value of init_conf.pktmbuf_pool[i] in ans_main.c and check whether init_conf.pktmbuf_pool[0] is NULL.

    for (i = 0; i < MAX_NB_SOCKETS; i++) {
        init_conf.pktmbuf_pool[i] = ans_pktmbuf_pool[i];
    }

mhabhishek commented 5 years ago

I modified ans_main.c to print init_conf.pktmbuf_pool[i], using %u as the format specifier, and got the following output:

Allocated mbuf pool on socket 0, mbuf number: 16384

Initializing rx queues on lcore 2 ...
Default-- rx pthresh:0, rx hthresh:0, rx wthresh:0, rx offloads:0x0
Conf-- rx pthresh:0, rx hthresh:0, rx wthresh:0, rx offloads:0xf
port id:0, rx queue id: 0, socket id:0

core mask: 4, sockets number:1, lcore number:1
start to init ans
init_conf.pktmbuf_pool[0]= 4710144
init_conf.pktmbuf_pool[1]= 0
init_conf.pktmbuf_pool[2]= 0
init_conf.pktmbuf_pool[3]= 0
init_conf.pktmbuf_pool[4]= 0
init_conf.pktmbuf_pool[5]= 0
init_conf.pktmbuf_pool[6]= 0
init_conf.pktmbuf_pool[7]= 0
USER8: LCORE[2] lcore mask 0x4
USER8: LCORE[2] lcore id 2 is enable
USER8: LCORE[2] lcore number 1
USER8: LCORE[2] lcore(2)'s sockets(0) hasn't mbuf pool
EAL: Error - exiting with code: 1
Cause: Init ans failed

init_conf.pktmbuf_pool[0] is not NULL but subsequent elements of that array are zero.
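A side note on the debug print: %u expects an unsigned int, so passing a pointer to it is undefined behavior in C, and on a 64-bit build the printed number may not be the full address. A minimal sketch of the same check using %p follows. The helper names first_null_pool and format_pool_entry are hypothetical, and plain void pointers stand in for the mempool pointers that ans_main.c stores.

```c
#include <stddef.h>
#include <stdio.h>

/* Return the index of the first NULL entry in `pools`, or -1 if none. */
int first_null_pool(void *const pools[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (pools[i] == NULL)
            return (int)i;
    }
    return -1;
}

/* Format one entry with %p, which is the conversion meant for pointers.
 * Returns the snprintf result (number of characters, excluding the NUL). */
int format_pool_entry(char *buf, size_t len, size_t i, void *pool)
{
    return snprintf(buf, len, "pktmbuf_pool[%zu] = %p", i, pool);
}
```

Calling format_pool_entry for each index and printing the buffer gives output that can be compared directly with addresses printed elsewhere. The exact text produced by %p is implementation-defined.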

bluenet13 commented 5 years ago

init_conf.pktmbuf_pool[0] is not NULL, so why is this error printed? It is the same memory address.

USER8: LCORE[2] lcore(2)'s sockets(0) hasn't mbuf pool

Please use the dpdk-18.11 version.

mhabhishek commented 5 years ago

I am already using the dpdk-stable-18.11.1 version.

bluenet13 commented 5 years ago

Try the dpdk-18.11 version, because the ANS libs are compiled against dpdk-18.11.

By the way, how did you choose the ANS libs?

mhabhishek commented 5 years ago

I ran ./install_deps.sh and got the message "Generate librte_ans.a/librte_anssock.a/librte_anscli.a for sandybridge successfully."

mhabhishek commented 5 years ago

Using the dpdk-18.11 version, I am still getting the same error.

bluenet13 commented 5 years ago

I don't know the root cause. The memory address is the same, but a different value is printed. You can try another Linux system on bare metal.

mhabhishek commented 5 years ago

Ok Thank you.

bluenet13 commented 5 years ago

Any update?

mhabhishek commented 5 years ago

Now I am trying to run ANS on Ubuntu 18.04 in a VM with dpdk-18.11. When I run the ANS startup example with the command ./ans -c 0x4 -n 1 -- -p 0x1 --config="(0,0,2)" I get the following output:

EAL: Detected 6 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device 0000:03:00.0 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 15ad:7b0 net_vmxnet3
EAL: PCI device 0000:0b:00.0 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 15ad:7b0 net_vmxnet3
Start to Init port
port 0:
port name net_vmxnet3:
max_rx_queues 16: max_tx_queues:8
rx_offload_capa 0x281f: tx_offload_capa:0x802f
Creating queues: rx queue number=1 tx queue number=1...
MAC Address: xx:xx:xx:xx:14:A6
Deault-- tx pthresh:0, tx hthresh:0, tx wthresh:0, tx offloads:0x0
lcore id:2, tx queue id:0, socket id:0
Conf-- tx pthresh:0, tx hthresh:0, tx wthresh:0, tx offloads:0x2e
Allocated mbuf pool on socket 0, mbuf number: 16384
Initializing rx queues on lcore 2 ...
Default-- rx pthresh:0, rx hthresh:0, rx wthresh:0, rx offloads:0x0
Conf-- rx pthresh:0, rx hthresh:0, rx wthresh:0, rx offloads:0xf
port id:0, rx queue id: 0, socket id:0
core mask: 4, sockets number:1, lcore number:1
start to init ans
USER8: LCORE[2] lcore mask 0x4
USER8: LCORE[2] lcore id 2 is enable
USER8: LCORE[2] lcore number 1
USER1: rte_ip_frag_table_create: allocated of 25165952 bytes at socket 0
add veth0 device, kni id 0
USER8: LCORE[2] Interface veth0 if_capabilities: 0x802f
add IP a000002 on device veth0
show all IPs:
veth0: mtu 1500
link/ether xx:xx:xx:xx:14:a6
inet addr: 10.0.0.2/24
add static route
ANS IP routing table
10.0.0.0/24 via dev veth0 src 10.0.0.2
10.10.0.0/24 via 10.0.0.5 dev veth0
Checking link status done
Port 0 Link Up - speed 10000 Mbps - full-duplex
USER8: main loop on lcore 2
USER8: -- lcoreid=2 portid=0 rxqueueid=0
hz: 2100001375

I am wondering whether the output is supposed to stop at hz: 2100001375 or whether there should be more. I have added an IP address to veth0 using anscli. The output of ip addr show is:

veth0: mtu 1500
link/ether xx:xx:xx:xx:14:a6
inet addr: 10.0.0.2/24
inet addr: xx.xx.xx.76/23

I am not able to ping veth0 at xx.xx.xx.76, either from another machine or from within the VM. What should be done to correct this?

Thanks.
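The startup log prints the default address as "add IP a000002" and then shows it as 10.0.0.2/24, which suggests the value is the IPv4 address in hexadecimal with the first octet in the highest byte. The sketch below shows that conversion, plus the prefix comparison that decides whether two addresses sit on the same subnet (the /24 and /23 in the ip addr show output). The helper names ipv4_to_dotted and same_subnet are hypothetical and are not ANS functions.

```c
#include <stdint.h>
#include <stdio.h>

/* Write the dotted-quad form of an IPv4 value whose first octet is in
 * the highest byte. Returns the snprintf result. */
int ipv4_to_dotted(uint32_t addr, char *buf, size_t len)
{
    return snprintf(buf, len, "%u.%u.%u.%u",
                    (unsigned)((addr >> 24) & 0xff),
                    (unsigned)((addr >> 16) & 0xff),
                    (unsigned)((addr >> 8) & 0xff),
                    (unsigned)(addr & 0xff));
}

/* Return 1 if `a` and `b` share the same network for `prefix_len` bits. */
int same_subnet(uint32_t a, uint32_t b, unsigned prefix_len)
{
    uint32_t mask;

    if (prefix_len == 0)
        mask = 0;                               /* /0 matches everything */
    else if (prefix_len >= 32)
        mask = UINT32_MAX;                      /* /32 is an exact match */
    else
        mask = UINT32_MAX << (32 - prefix_len); /* keep the top bits */

    return (a & mask) == (b & mask);
}
```

For example, 0x0a000002 formats as 10.0.0.2, and with a /24 prefix it shares a subnet with 10.0.0.5 (the next hop of the static route in the log) but not with an address in 10.10.0.0/24.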

bluenet13 commented 5 years ago

ANS started up OK. You can use "ip link show" and "ip neigh show" to check it, or run tcpdump on another machine.

bluenet13 commented 5 years ago

Check your network by yourself.

mhabhishek commented 5 years ago

Hi, this is the output of ip neigh show in anscli:

ans> ip neigh show
ANS IP neigh table
xx.xx.xx.254 dev veth0 lladdr xx:xx:xx:xx:76:00 REACHABLE
xx.xx.xx.213 dev veth0 lladdr xx:xx:xx:xx:19:4f REACHABLE
xx.xx.xx.74 dev veth0 lladdr xx:xx:xx:xx:14:9c REACHABLE
xx.xx.xx.255 dev veth0 lladdr 0:0:0:0:0:0 INCOMPLETE

It shows that IP xx.xx.xx.213 is reachable from veth0. I am running linux_tcp_server on xx.xx.xx.213 and dpdk_tcp_client on xx.xx.xx.76 (veth0), but I am not able to connect to the server. I get the following output:

EAL: Detected 6 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket_9828_e2385d38774
EAL: Probing VFIO support...
EAL: VFIO support initialized
USER8: LCORE[-1] anssock any lcore id 0xffffffff
USER8: LCORE[0] anssock app id: 9828
USER8: LCORE[0] anssock app name: dpdk_tcp_client
USER8: LCORE[0] anssock app lcoreId: 0
USER8: LCORE[0] mp ops number 9, mp ops index: 0
fd(28) connect to remote server failed
fd(37) connect to remote server failed
fd(36) connect to remote server failed
fd(35) connect to remote server failed
fd(34) connect to remote server failed
fd(33) connect to remote server failed
fd(32) connect to remote server failed
fd(31) connect to remote server failed
fd(30) connect to remote server failed
fd(29) connect to remote server failed
all fd connect to server failed

Do you know what the problem could be? Thanks.

bluenet13 commented 5 years ago

According to "tx_offload_capa:0x802f", net_vmxnet3 supports checksum offload. But according to the pcap file, all the checksums are wrong. You can disable checksum offload by modifying ans_set_port_offload() in ans_main.c.
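To make the bit arithmetic behind this advice concrete, here is a minimal sketch that tests and clears the TX checksum bits in an offload mask. The EX_ constants are hypothetical names whose values are assumed to mirror the DEV_TX_OFFLOAD_* flags in DPDK 18.11's rte_ethdev.h, so verify them against your headers. The body of ans_set_port_offload() is not shown in this thread, and this sketch does not reproduce it.

```c
#include <stdint.h>

/* Assumed values, mirroring DEV_TX_OFFLOAD_* in DPDK 18.11 (unverified
 * here); the EX_ names are hypothetical. */
#define EX_TX_OFFLOAD_VLAN_INSERT UINT64_C(0x0001)
#define EX_TX_OFFLOAD_IPV4_CKSUM  UINT64_C(0x0002)
#define EX_TX_OFFLOAD_UDP_CKSUM   UINT64_C(0x0004)
#define EX_TX_OFFLOAD_TCP_CKSUM   UINT64_C(0x0008)
#define EX_TX_OFFLOAD_TCP_TSO     UINT64_C(0x0020)
#define EX_TX_OFFLOAD_MULTI_SEGS  UINT64_C(0x8000)

/* All three TX checksum offload bits together. */
#define EX_TX_CKSUM_MASK (EX_TX_OFFLOAD_IPV4_CKSUM | \
                          EX_TX_OFFLOAD_UDP_CKSUM |  \
                          EX_TX_OFFLOAD_TCP_CKSUM)

/* Return 1 if any TX checksum offload bit is set in `offloads`. */
int tx_has_cksum_offload(uint64_t offloads)
{
    return (offloads & EX_TX_CKSUM_MASK) != 0;
}

/* Return `offloads` with the TX checksum offload bits cleared. */
uint64_t tx_without_cksum(uint64_t offloads)
{
    return offloads & ~EX_TX_CKSUM_MASK;
}
```

Under these assumed values, the capability mask 0x802f from the log has checksum bits set, and clearing them from the configured "tx offloads:0x2e" leaves 0x20. Whether the remaining bit works without checksum offload on net_vmxnet3 is not something this thread confirms.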