amzn / amzn-drivers

Official AWS drivers repository for Elastic Network Adapter (ENA) and Elastic Fabric Adapter (EFA)

prepare_ctx_err is observed on ENA DPDK interfaces on c5n.4xlarge instances when NIC is configured with more than 12 Rx/Tx queues #149

Closed sasingam closed 3 years ago

sasingam commented 3 years ago

Hi Team, We are running a custom application based on DPDK 20.08 that receives packets on one ENA interface and forwards them out over the other ENA interface (similar to l3fwd). We are pumping traffic into this application at 4.2 million packets per second, i.e. each interface receives 2.1 Mpps and forwards that traffic out over the other interface. Both ENA interfaces have LLQ enabled.
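For context, a minimal sketch of this kind of forwarding loop (ports, queues, and mempools are assumed to be initialized elsewhere; the port and queue IDs are placeholders, not our actual configuration):

```c
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32

/* Poll one RX queue on rx_port and forward each burst out of tx_port. */
static void
forward_loop(uint16_t rx_port, uint16_t tx_port, uint16_t queue_id)
{
    struct rte_mbuf *bufs[BURST_SIZE];

    for (;;) {
        uint16_t nb_rx = rte_eth_rx_burst(rx_port, queue_id,
                                          bufs, BURST_SIZE);
        if (nb_rx == 0)
            continue;

        uint16_t nb_tx = rte_eth_tx_burst(tx_port, queue_id,
                                          bufs, nb_rx);

        /* Drop whatever the TX queue could not accept; this is the
         * path we hit when no free TX descriptors are available. */
        for (uint16_t i = nb_tx; i < nb_rx; i++)
            rte_pktmbuf_free(bufs[i]);
    }
}
```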
During the test, no errors are seen and the ENA interfaces process the RX and TX traffic without drops when they are configured with 12 RX/TX queues. When the NIC is configured with more than 12 RX queues, prepare_ctx_err increments and packets are dropped.
With 13 RX/TX queues, prepare_ctx_err is seen on queue 0 and queue 12. With 14 RX/TX queues, prepare_ctx_err is seen on queue 0, queue 1, queue 12 and queue 13. When the NIC is configured with 8 queues or fewer, prepare_ctx_err is seen on all queues and packets are dropped. In brief, the working configuration is 9 to 12 RX/TX queues. RSS is working fine in our case, with fair distribution of packets among the RX/TX queues.
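Our queue setup follows the standard ethdev pattern; a simplified sketch, assuming mbuf_pool is created elsewhere and using 1024 descriptors per queue as noted below:

```c
#include <rte_ethdev.h>

/* Configure nb_queues RX/TX queue pairs with 1024 descriptors each. */
static int
setup_port(uint16_t port_id, uint16_t nb_queues, struct rte_mempool *mbuf_pool)
{
    struct rte_eth_conf port_conf = {0};
    int ret = rte_eth_dev_configure(port_id, nb_queues, nb_queues, &port_conf);
    if (ret != 0)
        return ret;

    for (uint16_t q = 0; q < nb_queues; q++) {
        ret = rte_eth_rx_queue_setup(port_id, q, 1024,
                                     rte_eth_dev_socket_id(port_id),
                                     NULL, mbuf_pool);
        if (ret < 0)
            return ret;
        ret = rte_eth_tx_queue_setup(port_id, q, 1024,
                                     rte_eth_dev_socket_id(port_id),
                                     NULL);
        if (ret < 0)
            return ret;
    }
    return rte_eth_dev_start(port_id);
}
```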

On further debugging, we observed that packets are dropped during enqueue into the TX queue because there is not enough space, i.e. there are no free descriptors in the TX queue. The TX descriptors are not freed even after ena_tx_cleanup is called. Each TX queue uses 1024 descriptors in our case.
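The prepare_ctx_err counters above are exposed through the ethdev extended statistics. A minimal sketch of how we read them, dumping every non-zero xstat rather than assuming exact counter names, since those are driver-defined:

```c
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <rte_ethdev.h>

/* Print every non-zero extended statistic for a port. */
static void
dump_xstats(uint16_t port_id)
{
    int n = rte_eth_xstats_get(port_id, NULL, 0);
    if (n <= 0)
        return;

    struct rte_eth_xstat *stats = calloc(n, sizeof(*stats));
    struct rte_eth_xstat_name *names = calloc(n, sizeof(*names));

    if (rte_eth_xstats_get_names(port_id, names, n) == n &&
        rte_eth_xstats_get(port_id, stats, n) == n) {
        for (int i = 0; i < n; i++)
            if (stats[i].value != 0)
                printf("%s: %" PRIu64 "\n", names[i].name, stats[i].value);
    }

    free(stats);
    free(names);
}
```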

Is there any limitation on ENA interfaces regarding the number of RX/TX queues that can be configured?

I-gor-C commented 3 years ago

Hi @sasingam Thanks for reaching out. I'll check this and update.

sasingam commented 3 years ago

Thanks @I-gor-C

I-gor-C commented 3 years ago

What kind of traffic are you sending? Please indicate the protocol and MTU. Are you trying to achieve higher PPS by increasing the number of queues?

sasingam commented 3 years ago

Hi @I-gor-C , I am running UDP traffic with the ENA DPDK interfaces configured to receive jumbo frames (9000 bytes), but I am sending packets smaller than 256 bytes (a sketch of that receive configuration follows my questions below). My questions are:

  1. If 2.1 Mpps can be achieved with 12 queues, should it not work without errors when more queues are used?
  2. Does PPS depend on the number of queues configured? If yes, what is the ideal queue count to achieve maximum PPS without errors/drops?
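For completeness, the jumbo-frame receive configuration mentioned above, roughly as it looks under the DPDK 20.08-era API (this sketch assumes 20.08; the DEV_RX_OFFLOAD_JUMBO_FRAME flag and max_rx_pkt_len field were removed from later releases in favor of a plain MTU setting):

```c
#include <rte_ethdev.h>
#include <rte_ether.h>

/* Enable 9000-byte jumbo-frame receive on a DPDK 20.08 port config. */
static void
enable_jumbo_rx(struct rte_eth_conf *port_conf)
{
    port_conf->rxmode.offloads |= DEV_RX_OFFLOAD_JUMBO_FRAME;
    /* Max frame length: 9000-byte MTU plus Ethernet header and CRC. */
    port_conf->rxmode.max_rx_pkt_len = 9000 + RTE_ETHER_HDR_LEN +
                                       RTE_ETHER_CRC_LEN;
}
```
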
I-gor-C commented 3 years ago

Hi @sasingam Let's proceed via email. Could you please send the instance ID to igorch@amazon.com?

Thanks

I-gor-C commented 3 years ago

Fixed in DPDK 21.02