amzn / amzn-drivers

Official AWS drivers repository for Elastic Network Adapter (ENA) and Elastic Fabric Adapter (EFA)

[Support]: DPDK issues on AWS #293

Closed: AnkitR4 closed this issue 8 months ago

AnkitR4 commented 10 months ago

Preliminary Actions

Driver Type

Linux kernel driver for Elastic Network Adapter (ENA)

Driver Tag/Commit

ena_linux_2.8.5-1-g22fca73

Custom Code

No

OS Platform and Distribution

Linux 5.10.201-191.748.amzn2.x86_64

Support request

Hey there, I have installed DPDK version 24.03.0 on a c5.4xlarge instance and attached two additional interfaces to it (eth1 and eth2). The two interfaces are in different subnets: eth1 in 10.0.1.0/24 (subnet A) and eth2 in 10.0.2.0/24 (subnet B). I am currently running l3fwd-acl on this instance with the rules set to forward all traffic from eth1 to eth2 and vice versa. I also have two t2.micro instances, one running in subnet A (10.0.1.0/24) and the other in subnet B (10.0.2.0/24). I've added a route on subnet A specifying that if the destination is subnet B the target is eth1, and a route on subnet B specifying that if the destination is subnet A the target is eth2.

My problem: with l3fwd-acl already running, if I ping the instance in subnet B from the instance in subnet A, I do not get a response. Also, how do you send traffic to a DPDK-controlled interface in AWS? I have bound my interfaces to the igb_uio driver.

ALSO, when the l3fwd-acl application is running, I get the following link status:

Checking link statusdone
Port 0 Link up at None FDX Fixed
Port 1 Link up at None FDX Fixed

Contact Details

ankith@xtennetworks.com

shaibran commented 10 months ago

Hi AnkitR4, we have not tested l3fwd-acl, although it should work. We will consider adding it to our test coverage.

If the ping is not passing, you need to check on which link it fails: t2 with ENI on subnet 1 => c5 with ENI on subnet 1 => l3fwd app => ENI on subnet 2 => t2 with ENI on subnet 2.

It might be an issue with security groups. The EC2 configuration includes security groups (SGs) and access control lists (ACLs), both of which let you control access to AWS resources within your VPC. SGs, specifically, control inbound and outbound traffic at the instance level. Try using the default ACL configuration, and in the SGs open all inbound/outbound traffic for the three instances.

Please check our README; it contains a quick-start section, best practices, and instructions on how to enable the ENA logger.

BR, Shai

shaibran commented 9 months ago

Hi AnkitR4, any update on debugging your setup?

AnkitR4 commented 9 months ago

Hi @shaibran, sorry for the late reply; I was trying to debug my problem. I have configured the SGs to allow all traffic, so I don't think that was the issue. However, I realized that the default l3fwd-acl sample application drops ARP packets, and hence the ping was not working.

Now I am trying the Cisco TRex traffic generator to pump traffic. I've added an image with a modified topology. In this new topology, I'm running TRex on instance B and pumping traffic from 10.0.1.22, and I've added a route so that the traffic is directed to 10.0.1.21 (an interface of instance A, which is running DPDK l3fwd-acl). In theory, the packets should be forwarded to 10.0.2.21 (the other interface of instance A), and a route table entry has been added to direct the traffic to 10.0.2.22 (an interface of instance B, which is running TRex).

The same problem persists: I'm not able to receive packets on the other end of TRex. The SGs allow all traffic.

[Image: AWS (2019) horizontal framework]
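Editor's illustration of the ARP observation above: a minimal Python sketch, not taken from the DPDK sources, of why an IPv4-only forwarding path never answers ARP. The `forwards` function is a hypothetical stand-in for the sample application's packet filter, and the frames are synthetic; only the EtherType values (0x0800 for IPv4, 0x0806 for ARP) are standard.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806


def ethertype(frame: bytes) -> int:
    """Return the EtherType of an untagged Ethernet II frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # Bytes 12-13 hold the EtherType in network byte order.
    (etype,) = struct.unpack("!H", frame[12:14])
    return etype


def forwards(frame: bytes) -> bool:
    """Hypothetical stand-in for a forwarding path that handles IPv4 only."""
    return ethertype(frame) == ETHERTYPE_IPV4


def build_frame(etype: int) -> bytes:
    """Build a synthetic 60-byte frame with a zeroed payload."""
    dst = b"\xff" * 6                      # broadcast destination MAC
    src = bytes.fromhex("020000000001")    # made-up locally administered MAC
    return dst + src + struct.pack("!H", etype) + b"\x00" * 46


arp_request = build_frame(ETHERTYPE_ARP)
ipv4_packet = build_frame(ETHERTYPE_IPV4)

print("ARP forwarded: ", forwards(arp_request))
print("IPv4 forwarded:", forwards(ipv4_packet))
```

If ARP requests are never answered or forwarded, a sender that must resolve a next-hop MAC address cannot build its ICMP echo frame at all, which would match the failed ping described above.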

shaibran commented 9 months ago
  1. Please share the region, instance IDs, and relevant ENI IDs, and we will look into the EC2 logs to verify there are no clear drops.
  2. You can query the stats/xstats, which should show you the queues' status and whether any drops occurred.
  3. Verify that the AWS ACLs allow all traffic.
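Editor's illustration of step 2: a small Python sketch of scanning a set of xstats counters for non-zero drop or error values. The counter values are invented sample data, the marker list is an assumption, and `suspicious_counters` is a hypothetical helper; real counter names depend on the driver and DPDK version.

```python
# Substrings that often mark a problem counter (an assumption, not a DPDK rule).
SUSPECT_MARKERS = ("err", "drop", "miss")


def suspicious_counters(xstats: dict) -> dict:
    """Return the non-zero counters whose names suggest drops or errors."""
    return {
        name: value
        for name, value in xstats.items()
        if value != 0 and any(marker in name.lower() for marker in SUSPECT_MARKERS)
    }


# Invented sample values for one port, for illustration only.
sample_xstats = {
    "rx_good_packets": 120000,
    "tx_good_packets": 0,
    "rx_missed_errors": 37,
    "rx_errors": 0,
    "tx_errors": 0,
}

flagged = suspicious_counters(sample_xstats)
print(flagged)
```

In a reading like this sample, a non-zero receive count next to a zero transmit count would suggest that packets reach the port but the application does not forward them, which narrows the search to the application rather than the network.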
AnkitR4 commented 9 months ago

@shaibran
Region: US East (N. Virginia), us-east-1
Instance IDs: i-00dd74d166bf5e5df (instance running DPDK), i-06281bbe77d16fd18 (instance running TRex)
ENI IDs for the instance running DPDK: eni-06996f6975640864f, eni-0dd105cb467ad7bbd, eni-06996f6975640864f
ENI IDs for the instance running TRex: eni-0036c9a403b773119, eni-0d11b6fbda61709b7, eni-085e2175e8bd0e83e

shaibran commented 9 months ago

AnkitR4, can you please share a 24-hour time frame that I can focus on?