sdnfv / openNetVM

A high performance container-based NFV platform from GW and UCR.
http://sdnfv.github.io/onvm/

When Congestion occurs with shared core, number of pkt processing is abnormal #275

Closed JackKuo-tw closed 3 years ago

JackKuo-tw commented 3 years ago

Bug Report

Current Behavior

When 2 NFs share a single CPU core and the total CPU usage reaches 100%, some packet drops are reasonable.

But in my test, tx_pps and rx_pps drop sharply:

image

Expected behavior/code

In this scenario, CPU usage is only about 70%, controlled by limiting pktgen's packets-per-second rate.

image

Steps to reproduce

Environment

Possible Solution

I cannot find the core problem...

Additional context/Screenshots

dennisafa commented 3 years ago

Thanks for the bug report. I think one way to solve this is to increase the RX queue size for the NFs. Since CPU utilization is high, the NF cannot process the current batch of packets fast enough before the next batch arrives from pktgen. The RX queue fills up, and packets are dropped.
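The mechanism described above can be sketched with a toy model. This is not openNetVM code: `simulate`, `struct sim_result`, and all the rates and capacities below are hypothetical names and numbers chosen for illustration. The model tracks only the occupancy of a bounded queue, adds a fixed number of arriving packets per tick, and lets the NF drain a fixed number per tick.

```c
/* Toy model of a bounded RX queue (hypothetical, not openNetVM code). */
struct sim_result {
    unsigned long processed; /* packets the NF managed to dequeue */
    unsigned long dropped;   /* packets rejected because the queue was full */
};

/*
 * Each tick, `arrive` packets show up and the NF drains at most `serve`.
 * Packets that do not fit into the remaining free slots are dropped.
 */
static struct sim_result simulate(unsigned capacity, unsigned arrive,
                                  unsigned serve, unsigned ticks)
{
    struct sim_result r = {0, 0};
    unsigned occupancy = 0;

    for (unsigned t = 0; t < ticks; t++) {
        /* Enqueue side: accept only what fits, count the rest as drops. */
        unsigned free_slots = capacity - occupancy;
        unsigned accepted = arrive < free_slots ? arrive : free_slots;
        occupancy += accepted;
        r.dropped += arrive - accepted;

        /* Dequeue side: the NF processes up to `serve` packets. */
        unsigned drained = occupancy < serve ? occupancy : serve;
        occupancy -= drained;
        r.processed += drained;
    }
    return r;
}
```

In this model, `simulate(128, 32, 40, 1000)` drops nothing, while `simulate(128, 32, 24, 1000)` drops 7896 packets once the queue fills. Raising the capacity to 16384 with the same rates gives zero drops over 1000 ticks but 63640 drops over 10000 ticks, so under sustained overload a larger queue only delays the point at which drops begin.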

dennisafa commented 3 years ago

Were you able to solve this issue, or do you require further assistance?

JackKuo-tw commented 3 years ago

@dennisafa I tried changing the value of NF_MSG_QUEUE_SIZE from 128 to 65536 in onvm_mgr/onvm_init.h, but nothing seemed to happen.

Maybe the traffic rate is still too high, and this change can only absorb bursts (not tested yet).

JackKuo-tw commented 3 years ago

Is there any fundamental way to solve this issue?

dennisafa commented 3 years ago

> @dennisafa I tried to change the value of NF_MSG_QUEUE_SIZE from 128 to 65536 in onvm_mgr/onvm_init.h, but it seemed nothing happened.
>
> Maybe the flow is still too large, this change can only afford the burst (not test yet)

That's a different queue size constant. It's for inter-NF messages, not packets. There is a separate constant for the RX queue size. Here it is: https://github.com/sdnfv/openNetVM/blob/master/onvm/onvm_nflib/onvm_common.h#L71 @JackKuo-tw

JackKuo-tw commented 3 years ago

@dennisafa Thanks, but it doesn't work for me. BTW, the ring size is limited to an unsigned int (4 bytes), while the traffic is millions of packets per second. Does that matter?
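On the unsigned ring size question, a rough sketch may help, assuming the RX queue is a DPDK `rte_ring`. The ring size is a capacity (a number of slots), not a running packet count, so a 32-bit value does not limit how many packets pass through over time. By default `rte_ring_create` expects the count to be a power of two. The helper names and the traffic rates below are hypothetical; 65536 is the value tried earlier in this thread.

```c
#include <stdint.h>

/* Returns 1 if n is a nonzero power of two, 0 otherwise. */
static int is_power_of_two(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Seconds until an initially empty ring with `slots` entries fills when
 * packets arrive faster than they are processed. Returns -1.0 when the
 * arrival rate does not exceed the service rate, since the ring never
 * fills in this simplified model.
 */
static double seconds_until_full(uint32_t slots, double arrival_pps,
                                 double service_pps)
{
    if (arrival_pps <= service_pps)
        return -1.0;
    return (double)slots / (arrival_pps - service_pps);
}
```

With a hypothetical 2 Mpps arriving and 1 Mpps processed, `seconds_until_full(65536, 2e6, 1e6)` is about 0.066 seconds, and even a ring of `UINT32_MAX` slots would fill in roughly 4295 seconds. Under those assumptions the width of the size field is not the bottleneck; the sustained gap between arrival and processing rates is.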