bobzhuyb / ns3-rdma

NS3 simulator for RDMA over Converged Ethernet v2 (RoCEv2), including the implementation of DCQCN, TIMELY, PFC, ECN and shared buffer switch
GNU General Public License v2.0
260 stars 119 forks

WARNING: Drop because egress Q buffer full #28

Open kg12 opened 5 years ago

kg12 commented 5 years ago

I use BCube as my network topology. It has 8 switches and 16 nodes. This warning appeared when I ran the simulation. Could you tell me why it happens and how I can solve it? Thank you very much.

bobzhuyb commented 5 years ago

Packet drops usually mean that PFC is not working, or the buffer threshold is not properly configured (if you ever modified it.)

If you didn't modify any code, would you please upload your topo and flow configuration files?
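To illustrate how a badly chosen threshold leads to drops even when PFC is enabled, here is a minimal toy model in Python. It is not taken from the simulator: the function name `dropped_bytes`, its parameters, and the one-queue, fixed-rate setup are all assumptions made for illustration. The point it shows is that the pause threshold must leave enough headroom for the traffic that still arrives while the pause is in flight.

```python
from collections import deque


def dropped_bytes(buffer_bytes, xoff_bytes, xon_bytes,
                  in_rate, out_rate, pause_delay, ticks):
    """Toy single-queue PFC model (hypothetical, not the simulator's code).

    Returns the total number of bytes dropped because the buffer was full.
    """
    occupancy = 0
    dropped = 0
    want_pause = False
    # Pause/resume decisions still "on the wire": the sender acts on the
    # decision that the switch made pause_delay ticks earlier.
    in_flight = deque([False] * pause_delay)

    for _ in range(ticks):
        in_flight.append(want_pause)
        sender_paused = in_flight.popleft()

        # Enqueue what arrives this tick; anything beyond the buffer is dropped.
        arriving = 0 if sender_paused else in_rate
        accepted = min(arriving, buffer_bytes - occupancy)
        dropped += arriving - accepted
        occupancy += accepted

        # Drain the queue at the egress rate.
        occupancy -= min(occupancy, out_rate)

        # Hysteresis: pause at or above XOFF, resume at or below XON.
        if occupancy >= xoff_bytes:
            want_pause = True
        elif occupancy <= xon_bytes:
            want_pause = False

    return dropped


# Threshold leaves headroom for the 4 ticks of in-flight traffic: no drops.
print(dropped_bytes(100, 60, 20, in_rate=10, out_rate=5, pause_delay=4, ticks=1000))
# Threshold too close to the buffer size: the pause arrives too late.
print(dropped_bytes(100, 95, 20, in_rate=10, out_rate=5, pause_delay=4, ticks=1000))
```

In this sketch the first call prints 0, while the second prints a positive number, because the queue fills up before the sender reacts to the pause.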

kg12 commented 5 years ago

topo flow.txt topology.txt

I only changed the flow and topology configuration files. The topology is shown in the picture. Thank you for your patience.

bobzhuyb commented 5 years ago

In the current implementation, I didn't expect that a server could have multiple NICs and forward traffic. It could end up in some weird states. But in general, packet drops are caused by incorrect PFC settings.
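To see why BCube runs into this limitation, the sketch below builds the link list of a two-level BCube with 4-port switches, which matches the 8 switches and 16 nodes described above, and counts the links per server. The helper names `bcube1_links` and `multi_homed_servers` and the switch naming are assumptions for illustration. The link list is not the simulator's topology file format.

```python
from collections import Counter


def bcube1_links(n):
    """Links of a two-level BCube (BCube_1) built from n-port switches.

    Servers are numbered 0 .. n*n - 1; switches are named ("sw", level, index).
    """
    links = []
    for server in range(n * n):
        high, low = divmod(server, n)
        # Level 0 switch: connects the n servers that share the high digit.
        links.append((server, ("sw", 0, high)))
        # Level 1 switch: connects the n servers that share the low digit.
        links.append((server, ("sw", 1, low)))
    return links


def multi_homed_servers(links):
    """Return {server: link_count} for servers attached to more than one link."""
    degree = Counter(server for server, _switch in links)
    return {s: d for s, d in degree.items() if d > 1}


links = bcube1_links(4)
switches = {sw for _server, sw in links}
multi = multi_homed_servers(links)
print(len(switches), "switches,", len(multi), "multi-homed servers")
# prints: 8 switches, 16 multi-homed servers
```

Every server in this topology has two NICs, and BCube routing relies on servers relaying traffic between them, which is exactly the case the current implementation does not expect.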

kg12 commented 5 years ago

Thank you for your reply. If I want the server to support multiple NICs, where should I modify the code? And if I change these servers into switches and then connect each of these switches to only one node, is that OK?

bobzhuyb commented 5 years ago

It's not clear to me how the servers would manage the packet buffer and trigger PFC/ECN. I think what you propose is a quick way to work around the issue and make the simulator work, though I am not sure whether the BCube properties will still hold.

kg12 commented 5 years ago

Thank you for your answer. In real applications, can a server have multiple RDMA NICs? If the server has more than one NIC, which part of the code should I modify?

bobzhuyb commented 5 years ago

In real applications, it's okay to have multiple RDMA NICs. The real question is whether you'll forward packets between different RDMA NICs, i.e., receive packets on one NIC and then use another NIC to send those packets out again.

If the answer is no, then it's equivalent to multiple hosts that each have just one NIC (from a networking point of view). If the answer is yes, I am sorry, I have never seen people use RDMA NICs like that in real life, and it's unclear to me whether commodity RDMA solutions can do this without some dirty hacking.

ninh006 commented 3 years ago

> Packet drops usually mean that PFC is not working, or the buffer threshold is not properly configured (if you ever modified it.)
>
> If you didn't modify any code, would you please upload your topo and flow configuration files?

I use the default config file, but it still shows "WARNING: Drop because egress Q buffer full" when I run TIMELY. Do I need to change some configuration when running TIMELY?