amzn / amzn-drivers

Official AWS drivers repository for Elastic Network Adapter (ENA) and Elastic Fabric Adapter (EFA)

Feature request: add support for n-tuple filtering #206

Closed: aneagoe closed this issue 4 weeks ago

aneagoe commented 2 years ago

Hello,

We're looking for an easier way to use kernel bypass together with the ENA driver. For that, openonload would be a perfect fit. However, at the moment, it doesn't work due to ENA's lack of support for n-tuple filtering. How feasible would it be to add such a feature? For more background, please see https://github.com/Xilinx-CNS/onload/issues/28.

akiyano commented 2 years ago

Hi @aneagoe, Thanks for your request. n-tuple filtering is in our long term plans, but there isn't a concrete timeline yet.

nirvana-msu commented 1 year ago

There are other use-cases for n-tuple filtering support besides AF_XDP/kernel bypass with OpenOnload.

Accelerated RFS (aRFS) also requires it. aRFS lets you achieve the best latency while still using the kernel TCP stack. It does so by routing each packet directly to the NIC queue / CPU on which your application needs it, rather than assigning a queue based on a hash of the packet headers (which would normally mean an extra IPI, even if you use regular RFS).
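To make the difference concrete, here is a minimal sketch of the two queue-selection strategies described above. It is a toy model only, not how the kernel or the ENA device implements them: the `Flow` type, `hash_queue`, and `SteeringTable` are hypothetical names, and CRC32 is an assumed stand-in for the NIC's real hash function.

```python
import zlib
from typing import Dict, NamedTuple


class Flow(NamedTuple):
    """A TCP/UDP flow identified by its 4-tuple (illustrative type)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def hash_queue(flow: Flow, num_queues: int) -> int:
    """Hash-based spreading: the queue depends only on the packet headers.

    CRC32 is an assumption here, chosen to keep the sketch deterministic;
    it is not the hash a real NIC uses.
    """
    key = f"{flow.src_ip}:{flow.src_port}>{flow.dst_ip}:{flow.dst_port}"
    return zlib.crc32(key.encode()) % num_queues


class SteeringTable:
    """Toy model of per-flow steering rules with a hash fallback."""

    def __init__(self, num_queues: int) -> None:
        self.num_queues = num_queues
        self.rules: Dict[Flow, int] = {}

    def steer(self, flow: Flow, queue: int) -> None:
        """Pin a flow to the queue served by the application's CPU."""
        if not 0 <= queue < self.num_queues:
            raise ValueError(f"queue {queue} out of range")
        self.rules[flow] = queue

    def select_queue(self, flow: Flow) -> int:
        # A matching rule wins; any other flow falls back to the hash.
        if flow in self.rules:
            return self.rules[flow]
        return hash_queue(flow, self.num_queues)


table = SteeringTable(num_queues=8)
app_flow = Flow("10.0.0.5", 40000, "10.0.0.9", 5001)
table.steer(app_flow, queue=3)
print("steered flow lands on queue", table.select_queue(app_flow))
```

With hash-based selection the application has no say in where a flow lands, so the packet may be processed on a different CPU from the one running the consumer. A steering rule removes that mismatch for the flows that matter.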

Btw, another requirement for aRFS is that the NIC driver must implement the ndo_rx_flow_steer() net_device operation. Is this documented anywhere for AWS EC2 ENA (or is there any way to check it from a running instance)? I'm specifically interested in this for c6g instances.

@davidarinzon / @akiyano I'm wondering if n-tuple filtering support could be prioritized for the ENA driver?

Thanks.

davidarinzon commented 1 year ago

Hi @nirvana-msu, thank you for sharing this information and the additional use-cases. You're welcome to reach out to me at darinzon@amazon.com for further discussion.

Regarding ndo_rx_flow_steer(), this option is not implemented by the driver at this point.

ShakesB33r commented 1 year ago

+1 to being able to benefit from both aRFS and openonload. It would be nice to see the enhancements mentioned in this thread implemented.

FrothyB commented 1 year ago

Are there any plans to implement these features or should we assume that they will not be landing anytime soon?

davidarinzon commented 12 months ago

Hi @FrothyB

We are looking into the option of adding n-tuple support, but I can't provide a timeframe at this point. Stay tuned for ENA Linux driver releases and release notes.

davidarinzon commented 2 months ago

Hi,

We've added AF_XDP support as well as flow steering support in the latest release (2.13.0g). You're welcome to try them and see whether this unblocks your efforts.
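For readers who want to try flow steering, the sketch below builds the argument list for an `ethtool -N` rule without running it. The `-N ... flow-type ... action ...` syntax is standard ethtool; the helper name `build_ntuple_rule`, the device name, and the port and queue numbers are illustrative assumptions. Which match fields the ENA device accepts is not stated in this thread, so check the driver documentation before relying on any particular combination.

```python
import shlex
from typing import List, Optional

# Flow types taken from the ethtool man page; this sketch covers only
# the TCP/UDP ones.
SUPPORTED_FLOW_TYPES = {"tcp4", "udp4", "tcp6", "udp6"}


def build_ntuple_rule(
    dev: str,
    flow_type: str,
    action: int,
    src_ip: Optional[str] = None,
    dst_ip: Optional[str] = None,
    src_port: Optional[int] = None,
    dst_port: Optional[int] = None,
) -> List[str]:
    """Return the argv for an `ethtool -N` rule that steers matching
    packets to RX queue `action`. Nothing is executed here."""
    if flow_type not in SUPPORTED_FLOW_TYPES:
        raise ValueError(f"unsupported flow type: {flow_type}")
    if action < 0:
        raise ValueError("action must be a non-negative RX queue index")

    args = ["ethtool", "-N", dev, "flow-type", flow_type]
    fields = (
        ("src-ip", src_ip),
        ("dst-ip", dst_ip),
        ("src-port", src_port),
        ("dst-port", dst_port),
    )
    for name, value in fields:
        if value is not None:
            args += [name, str(value)]
    args += ["action", str(action)]
    return args


# Hypothetical example: steer TCP traffic for port 5001 to RX queue 3.
rule = build_ntuple_rule("eth2", "tcp4", action=3, dst_port=5001)
print(shlex.join(rule))
```

The resulting list can be passed to `subprocess.run` as root once the rule looks right. Building the list separately keeps the rule easy to inspect and log first.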

aneagoe commented 1 month ago

@davidarinzon @ShayAgros thanks for adding the support. I was playing around with this, but failed to get it to work. I managed to install the DKMS driver on two instances in ap-northeast-1; however, n-tuple filtering shows off [fixed] on both (c7g, c7a). Any idea why that would be?

[root@jtestaz403 ~]# ethtool -i eth2 | grep ^vers
version: 2.13.0g
[root@jtestaz403 ~]# ethtool -k eth2 | grep tuple
ntuple-filters: off [fixed]
[root@jtestaz403 ~]# modinfo ena
filename:       /lib/modules/5.14.0-508.el9.x86_64/extra/ena.ko.xz
version:        2.13.0g
license:        GPL
description:    Elastic Network Adapter (ENA)
author:         Amazon.com, Inc. or its affiliates
rhelversion:    9.6
srcversion:     2166FB2E66C072B0B42093C
alias:          pci:v00001D0Fd0000EC21sv*sd*bc*sc*i*
alias:          pci:v00001D0Fd0000EC20sv*sd*bc*sc*i*
alias:          pci:v00001D0Fd00001EC2sv*sd*bc*sc*i*
alias:          pci:v00001D0Fd00000EC2sv*sd*bc*sc*i*
alias:          pci:v00001D0Fd00000051sv*sd*bc*sc*i*
depends:        
retpoline:      Y
name:           ena
vermagic:       5.14.0-508.el9.x86_64 SMP preempt mod_unload modversions
sig_id:         PKCS#7
signer:         DKMS module signing key
sig_key:        2B:E2:57:22:EF:30:91:AA:B1:F2:E0:53:8C:14:6D:66:AE:C7:1A:45
sig_hashalgo:   sha512
signature:      0C:FE:F9:14:1C:46:ED:5F:47:17:85:8F:45:B8:AA:CE:91:DB:AF:50:
                10:0D:F2:C5:B8:57:8D:97:B1:9C:AF:73:CD:53:59:D9:E8:EA:7C:F0:
                F2:D1:F1:4B:C3:B4:CE:5B:E0:A6:28:E7:9E:5C:1C:90:A0:55:04:13:
                7A:1A:2D:15:E2:A4:6E:4A:38:8D:7C:16:B2:71:71:CF:4F:AB:73:0B:
                E4:4C:7E:F7:E9:7C:31:12:A9:BA:32:61:98:31:C7:95:9B:62:85:AF:
                EF:5A:4D:2A:44:2C:0D:15:B8:AA:AF:10:B4:AE:57:37:94:D4:6E:9E:
                D8:25:D3:6B:64:27:7F:36:2E:4B:3D:22:29:64:4E:94:D7:70:88:8A:
                1D:42:B3:CE:EA:8D:5A:8E:94:56:72:07:6B:15:18:46:E1:3B:A3:9A:
                44:87:96:D7:2C:9F:39:B1:F2:B7:17:4F:DF:85:AD:82:AE:A6:F4:C6:
                BE:74:DB:2F:D8:97:91:6D:BC:C9:AD:E1:26:86:5C:20:56:04:D8:58:
                1F:99:81:3F:76:F6:5D:AD:9E:C2:FC:37:57:F7:F1:7E:E1:3B:92:32:
                36:D2:3D:47:97:81:B8:9C:67:56:04:03:BE:3D:F0:05:6C:91:A9:03:
                A2:4E:A1:17:6E:B4:4C:7B:8A:DF:CB:91:8E:00:C3:85
parm:           debug:Debug level (-1=default,0=none,...,16=all) (int)
parm:           rx_queue_size:Rx queue size. The size should be a power of 2. Depending on instance type, max value can be up to 16K
 (int)
parm:           force_large_llq_header:Increases maximum supported header size in LLQ mode to 224 bytes, while reducing the maximum TX queue size by half.
 (int)
parm:           num_io_queues:Sets number of RX/TX queues to allocate to device. The maximum value depends on the device and number of online CPUs.
 (int)
parm:           enable_bql:Enable BQL.
 (int)
parm:           lpc_size:Each local page cache (lpc) holds N * 1024 pages. This parameter sets N which is rounded up to a multiplier of 2. If zero, the page cache is disabled. Max: 32
 (int)

Some dmesg output when driver is loaded:

[    1.739787] ena: loading out-of-tree module taints kernel.
[    1.739806] ena: module verification failed: signature and/or required key missing - tainting kernel
[    1.823706] Key type psk registered
[    1.858825] ACPI: \_SB_.PCI0.GSI1: Enabled at IRQ 36
[    1.859661] ena 0000:00:05.0: Elastic Network Adapter (ENA) v2.13.0g
[    1.859694] ena 0000:00:05.0: enabling device (0010 -> 0012)
[    1.871127] ACPI: \_SB_.PCI0.GSI0: Enabled at IRQ 35
[    1.872018] nvme nvme0: pci function 0000:00:04.0
[    1.876460] nvme nvme0: 2/0/0 default/read/poll queues
[    1.877802] ena 0000:00:05.0: ENA device version: 0.10
[    1.877816] ena 0000:00:05.0: ENA controller version: 0.0.1 implementation version 1
[    1.881145] ACPI: \_SB_.PCI0.GSI3: Enabled at IRQ 38
[    1.882050] nvme nvme1: pci function 0000:00:1f.0
[    1.885244]  nvme0n1: p1 p2 p3
[    1.886860] nvme nvme1: 2/0/0 default/read/poll queues
[    1.972221] ena 0000:00:05.0: Forcing large headers and decreasing maximum TX queue size to 512
[    1.972241] ena 0000:00:05.0: ENA Large LLQ is enabled
[    2.068320] ena 0000:00:05.0: Elastic Network Adapter (ENA) found at mem 80308000, mac addr 06:b6:ce:6d:73:1d
[    2.071045] ena 0000:00:06.0: enabling device (0010 -> 0012)
[    2.086978] ena 0000:00:06.0: ENA device version: 0.10
[    2.086991] ena 0000:00:06.0: ENA controller version: 0.0.1 implementation version 1
[    2.203858] ena 0000:00:06.0: Forcing large headers and decreasing maximum TX queue size to 512
[    2.203875] ena 0000:00:06.0: ENA Large LLQ is enabled
[    2.213890] ena 0000:00:06.0: Elastic Network Adapter (ENA) found at mem 8030c000, mac addr 06:b4:f4:35:32:83
[    2.216459] ena 0000:00:07.0: enabling device (0010 -> 0012)
[    2.224039] ena 0000:00:07.0: ENA device version: 0.10
[    2.224050] ena 0000:00:07.0: ENA controller version: 0.0.1 implementation version 1
[    2.322214] ena 0000:00:07.0: Forcing large headers and decreasing maximum TX queue size to 512
[    2.322227] ena 0000:00:07.0: ENA Large LLQ is enabled
[    2.333640] ena 0000:00:07.0: Elastic Network Adapter (ENA) found at mem 80310000, mac addr 06:bd:64:5c:05:f3

davidarinzon commented 1 month ago

Hi @aneagoe

The documentation (https://github.com/amzn/amzn-drivers/tree/master/kernel/linux/ena#flow-steering-ntuple) states that the feature is supported starting from 7th generation instance types. In practice, this translates to Nitro v5 (https://docs.aws.amazon.com/ec2/latest/instancetypes/ec2-nitro-instances.html).

Can you please try it on c7gn or r8g and see if it works for you?

We will improve the documentation to better correlate the instance type and nitro families.
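A quick way to check a given instance is to look at the `ntuple-filters` line of `ethtool -k`, as in the output earlier in this thread. ethtool marks a feature `[fixed]` when it cannot be changed on that device, which is why it could not be switched on for the c7g and c7a instances. The helper below is a minimal sketch that parses captured `ethtool -k` text. The names `FeatureState` and `feature_state` are hypothetical, and the sample text is abbreviated from the output above (the `rx-checksumming` line is illustrative).

```python
import re
from typing import NamedTuple, Optional


class FeatureState(NamedTuple):
    enabled: bool  # True when the feature is reported as "on"
    fixed: bool    # True when ethtool marks the state as "[fixed]"


# Matches lines such as "ntuple-filters: off [fixed]".
_FEATURE_LINE = re.compile(
    r"^\s*(?P<name>[\w-]+):\s+(?P<state>on|off)(?P<fixed>\s+\[fixed\])?"
)


def feature_state(ethtool_k_output: str, feature: str) -> Optional[FeatureState]:
    """Return the state of `feature` in `ethtool -k` text, or None
    if the feature is not listed."""
    for line in ethtool_k_output.splitlines():
        match = _FEATURE_LINE.match(line)
        if match and match.group("name") == feature:
            return FeatureState(
                enabled=match.group("state") == "on",
                fixed=match.group("fixed") is not None,
            )
    return None


sample = """Features for eth2:
rx-checksumming: on
ntuple-filters: off [fixed]
"""

state = feature_state(sample, "ntuple-filters")
if state is not None and not state.enabled and state.fixed:
    print("ntuple-filters cannot be enabled on this device")
```

On a live system the input could come from `subprocess.run(["ethtool", "-k", "eth2"], capture_output=True, text=True).stdout`, with the interface name adjusted to match the instance.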

aneagoe commented 1 month ago

@davidarinzon thanks, now I understand. I will check on an hpc7g instance in that case. A side question, is there a possibility that c7a or c7g instances will be migrated to Nitro v5 or that this feature would trickle down to Nitro v4? Otherwise, it feels a bit restrictive atm...

davidarinzon commented 1 month ago

Hi @aneagoe

The feature is available from Nitro v5 and on.

You may also use the recently made available c8g and m8g (https://aws.amazon.com/blogs/aws/run-your-compute-intensive-and-general-purpose-workloads-sustainably-with-the-new-amazon-ec2-c8g-m8g-instances/).

davidarinzon commented 4 weeks ago

Resolving this ticket. Please feel free to re-open it if there are further questions or issues.