sdnfv / openNetVM

A high performance container-based NFV platform from GW and UCR.
http://sdnfv.github.io/onvm/

Communication between network functions present in two servers is failing after few packet exchanges #95

Closed madhura-a closed 5 years ago

madhura-a commented 5 years ago

Bug Report

Current Behavior: I am using two servers with 10G ports, connected to each other. I have one network function on Server 1 and two network functions on Server 2, and they need to exchange a lot of packets. The current behavior is as follows:

  1. Server 1 sending Message 1 to Server 2 --> working fine
  2. Server 2 sending Message 2 to Server 1 --> working fine
  3. Server 1 sending Message 3 to Server 2 --> working fine
  4. Server 2 sending Message 4 to Server 1 --> not working. The manager's statistics display says Message 4 has been transmitted to Server 1, but the manager on Server 1 never receives Message 4.

I am using the latest onvm code. What could be the problem? Can you please help me resolve this issue?

koolzz commented 5 years ago

Hi @madhura-a,

By messages you mean just custom packets, correct?

This is indeed strange behavior. If packets were successfully sent from server 2 to server 1, then there shouldn't be any issues with further communication. You're seeing no RX packets on the port that the packet is supposed to arrive at? And the Server 2 port TX shows successful transmission?

You can also confirm whether the packet was actually received on the port by using pdump.

To use dpdk-pdump set CONFIG_RTE_LIBRTE_PMD_PCAP=y in dpdk/config/common_base and then recompile dpdk.
Then execute dpdk-pdump as a secondary application while the manager is running (adjust the port if needed):

cd dpdk/x86_64-native-linuxapp-gcc
sudo ./build/app/pdump/dpdk-pdump -- --pdump 'port=0,queue=*,rx-dev=/tmp/rx.pcap,tx-dev=/tmp/tx.pcap'

The full set of options and configurations for dpdk-pdump can be found in the DPDK documentation for the dpdk-pdump tool.
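Once the two capture files exist, comparing them by hand gets tedious. The following is a minimal sketch, not part of openNetVM or DPDK, that counts the frames in a capture and lists each frame's destination and source MAC address. It assumes the files are in the classic pcap format (not pcapng) and that the link type is Ethernet; the function names `read_pcap_frames`, `format_mac`, and `summarize` are made up for this example.

```python
import struct

# Magic numbers of the classic pcap format as they appear on disk.
LITTLE_ENDIAN_MAGICS = (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1")
BIG_ENDIAN_MAGICS = (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d")


def read_pcap_frames(data: bytes):
    """Yield the raw bytes of every frame in a classic pcap byte string."""
    magic = data[:4]
    if magic in LITTLE_ENDIAN_MAGICS:
        endian = "<"
    elif magic in BIG_ENDIAN_MAGICS:
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    offset = 24  # the global header is 24 bytes long
    while offset + 16 <= len(data):
        # Per-record header: ts_sec, ts_usec/nsec, captured length, original length.
        _sec, _frac, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", data, offset
        )
        offset += 16
        yield data[offset:offset + incl_len]
        offset += incl_len


def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def summarize(data: bytes):
    """Return one (dst_mac, src_mac) pair per captured Ethernet frame."""
    pairs = []
    for frame in read_pcap_frames(data):
        if len(frame) >= 14:  # shorter than an Ethernet header: skip
            pairs.append((format_mac(frame[0:6]), format_mac(frame[6:12])))
    return pairs
```

For example, `summarize(open("/tmp/tx.pcap", "rb").read())` on the sender and the same call on `/tmp/rx.pcap` on the receiver give two lists whose lengths can be compared directly, and the destination MACs of the frames that never arrive can be read off the sender's list.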

madhura-a commented 5 years ago

Yes, by messages I mean custom packets. Server 1's statistics:

[screenshot: server1]

Server 2's statistics:

[screenshot: server2]

As we can see, port 0 of server 2 says it has transmitted 2 packets, and port 0 of server 1 has received only one.

I will check with dpdk-pdump and confirm whether the packet is actually received or not.

madhura-a commented 5 years ago

I have collected rx and tx pcap files. The pcap files also show that server 2 has transmitted 2 packets and server 1 has received only one packet. The network functions I have written are in C++. For C++ support, I applied the DPDK C++ patch from https://patches.dpdk.org/patch/15103/. Is this causing the issue?

koolzz commented 5 years ago

Well, if pdump doesn't report 2 RX packets on server 1, that means it's definitely not the onvm_mgr losing the packets. I'm not sure why the packet is being dropped in transit from server 2 -> server 1, especially as they are directly connected.

We haven't tested any C++ NFs yet, but since you see the packet being sent from server 2, that is probably not the problem.

Can you try sending more messages? Does this behavior continue?

madhura-a commented 5 years ago

Yes. If I send 20 packets from server 2, server 1 receives only 10 packets. I have one basic question: when the servers are directly connected with each other, do the source MAC address and destination MAC address have any significance?

koolzz commented 5 years ago

Yeah, it might be dropping them if you're not setting the MAC addresses properly.
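To illustrate what "setting the MAC properly" involves, here is a small sketch that builds the 14-byte Ethernet header (destination MAC, source MAC, EtherType) and models the address check a port typically applies when it is not in promiscuous mode. This is an illustration only: the thread does not say which mode the ports were in or what filter the NIC actually applied, the MAC addresses are made up, and the helper names `parse_mac`, `build_ether_header`, and `passes_unicast_filter` are hypothetical rather than part of DPDK or openNetVM.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
BROADCAST = b"\xff" * 6


def parse_mac(text: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError(f"expected 6 octets, got {len(parts)}")
    return bytes(int(p, 16) for p in parts)


def build_ether_header(dst_mac: str, src_mac: str,
                       ethertype: int = ETHERTYPE_IPV4) -> bytes:
    """Return the Ethernet header: dst (6 bytes), src (6 bytes), type (2 bytes)."""
    return parse_mac(dst_mac) + parse_mac(src_mac) + struct.pack("!H", ethertype)


def passes_unicast_filter(frame: bytes, port_mac: str) -> bool:
    """Simplified model of a non-promiscuous port.

    Accepts frames addressed to the port's own MAC or to broadcast.
    Multicast filtering is deliberately left out of this model.
    """
    dst = frame[:6]
    return dst == parse_mac(port_mac) or dst == BROADCAST


# Hypothetical addresses for the two directly connected ports.
SERVER1_PORT = "aa:bb:cc:dd:ee:01"
SERVER2_PORT = "aa:bb:cc:dd:ee:02"

good = build_ether_header(dst_mac=SERVER1_PORT, src_mac=SERVER2_PORT)
bad = build_ether_header(dst_mac="00:00:00:00:00:00", src_mac=SERVER2_PORT)
print(passes_unicast_filter(good, SERVER1_PORT))
print(passes_unicast_filter(bad, SERVER1_PORT))
```

Under this model the frame with a zeroed destination is rejected even on a direct link, because the filter looks only at the destination field and not at how the cable is wired. Inside an NF, the same three fields are what has to be written into the packet's Ethernet header before it is sent out of a port.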

madhura-a commented 5 years ago

It is working fine now. I was not setting the MAC address properly. But the question is, how was it able to transmit the packets initially without any MAC address? Also, I have one more question: what does ret in the NF statistics mean?

koolzz commented 5 years ago

@madhura-a Sorry for the late reply, didn't see the notification for your message. In case you still need this:

The dropping is weird; I think it is done by the NIC, but I can't tell why exactly it drops some MACs and allows others. Regarding the ret action: an NF can use it to send packets. Whenever we call onvm_nflib_return_pkt or onvm_nflib_return_pkt_bulk, that stat is incremented. The call places the packets onto the current NF's TX queue, and they are then sent to the specified target (which can be another NF or a NIC port).

madhura-a commented 5 years ago

Thank you for your reply. With proper MAC headers, inter-server packet transmission is working fine. So closing this issue.