amzn / amzn-drivers

Official AWS drivers repository for Elastic Network Adapter (ENA) and Elastic Fabric Adapter (EFA)

[Bug]: Does not build on Kernel 6.10 #313

Open arianvp opened 1 month ago

arianvp commented 1 month ago

Driver Type

Linux kernel driver for Elastic Network Adapter (ENA)

Driver Tag/Commit

ena_linux_2.12.3

Custom Code

No

OS Platform and Distribution

Linux 6.10.1

Bug description

The driver does not compile on Kernel 6.10.

This seems to be caused by the removal of the __napi_alloc_skb function in kernel 6.10: https://github.com/torvalds/linux/commit/6e9b01909a811555ff3326cf80a5847169c57806
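One common way to bridge this kind of removal in an out-of-tree driver is a small version-gated shim that maps the old three-argument call onto the remaining two-argument `napi_alloc_skb`. The sketch below is only an illustration of that idea, not the patch discussed in this thread and not the fix the maintainers shipped. The kernel types, the GFP flag values, the `KERNEL_VERSION` stand-in, and the `rx_alloc_example` call site are all stubs or hypothetical names so that the sketch compiles as ordinary userspace C. The 6.10.0 cutoff is an assumption; distribution kernels with backports may need a feature check instead of a version check.

```c
/*
 * Sketch of a compat shim for the removal of __napi_alloc_skb().
 * The stand-ins below exist only so this compiles in userspace;
 * in a driver they come from <linux/version.h> and <linux/skbuff.h>.
 */
#include <stdlib.h>

/* ---- stand-ins for kernel definitions (assumptions) ---- */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE KERNEL_VERSION(6, 10, 1) /* kernel from the report */
#endif

#define GFP_ATOMIC   0x1u /* placeholder values, not the kernel's */
#define __GFP_NOWARN 0x2u

struct napi_struct { int unused; };
struct sk_buff { unsigned int len; };

#if LINUX_VERSION_CODE >= KERNEL_VERSION(6, 10, 0)
/* Stub of the two-argument allocator that remains in 6.10. */
static inline struct sk_buff *napi_alloc_skb(struct napi_struct *napi,
					     unsigned int len)
{
	struct sk_buff *skb = malloc(sizeof(*skb));

	(void)napi;
	if (skb)
		skb->len = len;
	return skb;
}

/* ---- the shim: old call sites keep compiling, gfp is dropped ---- */
#define __napi_alloc_skb(napi, len, gfp) napi_alloc_skb((napi), (len))
#endif
/* On older kernels the real __napi_alloc_skb() would be used as-is. */

/* Hypothetical call site, written in the pre-6.10 style. */
static inline struct sk_buff *rx_alloc_example(struct napi_struct *napi,
					       unsigned int len)
{
	return __napi_alloc_skb(napi, len, GFP_ATOMIC | __GFP_NOWARN);
}
```

The macro discards the `gfp` argument on new kernels, which matches the direction of the linked upstream commit: the caller no longer chooses the allocation flags.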

Reproduction steps

Build the driver with the 6.10 kernel

Expected Behavior

It compiles

Actual Behavior

It doesn't compile

Additional Data

No response

Relevant log output

make -C /nix/store/2yg16mg5cgcckxl0hjzvhz7sw3yrkwjm-linux-6.10.1-dev/lib/modules/6.10.1/build M=/build/source/kernel/linux/ena modules
make[1]: Entering directory '/nix/store/2yg16mg5cgcckxl0hjzvhz7sw3yrkwjm-linux-6.10.1-dev/lib/modules/6.10.1/build'
  CC [M]  /build/source/kernel/linux/ena/ena_netdev.o
  CC [M]  /build/source/kernel/linux/ena/ena_ethtool.o
  CC [M]  /build/source/kernel/linux/ena/ena_lpc.o
  CC [M]  /build/source/kernel/linux/ena/ena_phc.o
  CC [M]  /build/source/kernel/linux/ena/ena_xdp.o
/build/source/kernel/linux/ena/ena_xdp.c: In function 'ena_xdp_rx_skb_zc':
/build/source/kernel/linux/ena/ena_xdp.c:749:15: error: implicit declaration of function '__napi_alloc_skb'; did you mean 'napi_alloc_skb'? [-Werror=implicit-function-declaration]
  749 |         skb = __napi_alloc_skb(rx_ring->napi,
      |               ^~~~~~~~~~~~~~~~
      |               napi_alloc_skb
compilation terminated due to -Wfatal-errors.
cc1: some warnings being treated as errors

Contact Details

No response

davidarinzon commented 1 month ago

Hi @arianvp, thank you for raising this issue. We are aware of it, and it will be addressed in an upcoming release of the driver. We will provide a slightly more generic solution that will resolve this issue for future OS distributions.

As you have proposed a patch, I assume that the issue is not blocking you from continuing your AF_XDP development.

arianvp commented 1 month ago

Nope all good. The patch unblocks me for now. Thanks

arianvp commented 1 month ago

Ah, after applying my patch I run into the next compiler error:

error: builder for '/nix/store/lxmbs6ya60x68wf5a8mpk5njg3f0xrwh-ena-2.12.3-6.10.2.drv' failed with exit code 2;
       last 10 log lines:
       > /build/source/kernel/linux/ena/ena_xdp.c: In function 'ena_xdp_clean_rx_irq_zc':
       > /build/source/kernel/linux/ena/ena_xdp.c:816:17: error: too many arguments to function 'xsk_buff_dma_sync_for_cpu'
       >   816 |                 xsk_buff_dma_sync_for_cpu(xdp, rx_ring->xsk_pool);
       >       |                 ^~~~~~~~~~~~~~~~~~~~~~~~~
       > compilation terminated due to -Wfatal-errors.
       > make[3]: *** [/nix/store/216k8yzq05s5r3lvnh7jd9hbric4cmln-linux-libre-6.10.2-dev/lib/modules/6.10.2-gnu/source/scripts/Makefile.build:244: /build/source/kernel/linux/ena/ena_xdp.o] Error 1
       > make[2]: *** [/nix/store/216k8yzq05s5r3lvnh7jd9hbric4cmln-linux-libre-6.10.2-dev/lib/modules/6.10.2-gnu/source/Makefile:1934: /build/source/kernel/linux/ena] Error 2
       > make[1]: *** [/nix/store/216k8yzq05s5r3lvnh7jd9hbric4cmln-linux-libre-6.10.2-dev/lib/modules/6.10.2-gnu/source/Makefile:240: __sub-make] Error 2
       > make[1]: Leaving directory '/nix/store/216k8yzq05s5r3lvnh7jd9hbric4cmln-linux-libre-6.10.2-dev/lib/modules/6.10.2-gnu/build'
       > make: *** [Makefile:102: ena.ko] Error 2
       For full logs, run 'nix log /nix/store/lxmbs6ya60x68wf5a8mpk5njg3f0xrwh-ena-2.12.3-6.10.2.drv'.
error: builder for '/nix/store/j23sjz0wq8gq3y9p8a6082r0nvg9qnyi-ena-2.12.3-6.10.2.drv' failed with exit code 2;
       last 10 log lines:
       > /build/source/kernel/linux/ena/ena_xdp.c: In function 'ena_xdp_clean_rx_irq_zc':
       > /build/source/kernel/linux/ena/ena_xdp.c:816:17: error: too many arguments to function 'xsk_buff_dma_sync_for_cpu'
       >   816 |                 xsk_buff_dma_sync_for_cpu(xdp, rx_ring->xsk_pool);
       >       |                 ^~~~~~~~~~~~~~~~~~~~~~~~~
       > compilation terminated due to -Wfatal-errors.
       > make[3]: *** [/nix/store/4pi3spm0m1w4hyr659wkpxmf9cj7b6zr-linux-6.10.2-dev/lib/modules/6.10.2/source/scripts/Makefile.build:244: /build/source/kernel/linux/ena/ena_xdp.o] Error 1
       > make[2]: *** [/nix/store/4pi3spm0m1w4hyr659wkpxmf9cj7b6zr-linux-6.10.2-dev/lib/modules/6.10.2/source/Makefile:1934: /build/source/kernel/linux/ena] Error 2
       > make[1]: *** [/nix/store/4pi3spm0m1w4hyr659wkpxmf9cj7b6zr-linux-6.10.2-dev/lib/modules/6.10.2/source/Makefile:240: __sub-make] Error 2
       > make[1]: Leaving directory '/nix/store/4pi3spm0m1w4hyr659wkpxmf9cj7b6zr-linux-6.10.2-dev/lib/modules/6.10.2/build'
       > make: *** [Makefile:102: ena.ko] Error 2
       For full logs, run 'nix log /nix/store/j23sjz0wq8gq3y9p8a6082r0nvg9qnyi-ena-2.12.3-6.10.2.drv'.
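The "too many arguments" error suggests that on this kernel `xsk_buff_dma_sync_for_cpu` no longer accepts the pool argument and takes only the buffer. A version-gated wrapper is one way to keep a single call site working on both sides of such a change. The sketch below is an assumption-laden illustration, not the contents of any patch in this thread: the wrapper name `ena_compat_xsk_sync_for_cpu` is hypothetical, the structs and the `xsk_buff_dma_sync_for_cpu` bodies are userspace stubs, and the 6.10.0 cutoff is assumed from the failing builds above.

```c
/*
 * Sketch of a wrapper that hides the argument-count change of
 * xsk_buff_dma_sync_for_cpu(). Everything in the stand-ins section is a
 * userspace stub; a driver would get it from <net/xdp_sock_drv.h>.
 */

/* ---- stand-ins for kernel definitions (assumptions) ---- */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE KERNEL_VERSION(6, 10, 2) /* kernel from the log */
#endif

struct xdp_buff { int synced; };
struct xsk_buff_pool { int unused; };

#if LINUX_VERSION_CODE >= KERNEL_VERSION(6, 10, 0)
/* Stub of the newer form: the buffer is the only argument. */
static inline void xsk_buff_dma_sync_for_cpu(struct xdp_buff *xdp)
{
	xdp->synced = 1;
}
#else
/* Stub of the older form: buffer plus pool. */
static inline void xsk_buff_dma_sync_for_cpu(struct xdp_buff *xdp,
					     struct xsk_buff_pool *pool)
{
	(void)pool;
	xdp->synced = 1;
}
#endif

/* ---- the wrapper: one call signature for the driver ---- */
static inline void ena_compat_xsk_sync_for_cpu(struct xdp_buff *xdp,
					       struct xsk_buff_pool *pool)
{
#if LINUX_VERSION_CODE >= KERNEL_VERSION(6, 10, 0)
	(void)pool; /* no longer needed by the newer API */
	xsk_buff_dma_sync_for_cpu(xdp);
#else
	xsk_buff_dma_sync_for_cpu(xdp, pool);
#endif
}
```

With such a wrapper, the call at `ena_xdp.c:816` would pass the same two arguments on every kernel and the wrapper would decide which ones to forward.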

akiyano commented 1 month ago

Hi @arianvp, please try this patch: 0001-workaround-patch-for-kernel-6.10.patch

lano1106 commented 1 month ago

I am just curious. What is the benefit of, or the reason for, using the GitHub version of the driver rather than the in-tree one?

Are they different?

The one in drivers/net/ethernet/amazon/ena works out of the box with 6.10.2 for me...

akiyano commented 1 month ago

Hi @lano1106,

The driver that is upstream works perfectly well. But if you wish to have the latest features available in ENA, they are first pushed to the github driver and arrive in the upstream driver later on.

lano1106 commented 1 month ago

oh cool... How can someone know what is in github version not yet in tree?

the commit history?

davidarinzon commented 1 month ago

> oh cool... How can someone know what is in github version not yet in tree?
>
> the commit history?

Hi @lano1106, there is no direct way to correlate the two, because the upstreaming process and the Linux kernel requirements are different. ENA driver changes are usually pushed to the upstream kernel as a bundle of commits; you can look at the commit history of https://github.com/torvalds/linux/tree/master/drivers/net/ethernet/amazon/ena. The upstream driver carries no version number, so the tracking would have to be manual.

JonKohler commented 1 week ago

Heads up @arianvp - more breakage inbound for 6.11 in case you're looking to go up to date: https://github.com/amzn/amzn-drivers/issues/321

davidarinzon commented 1 day ago

Hi @arianvp

This issue is expected to be resolved with https://github.com/amzn/amzn-drivers/releases/tag/ena_linux_2.13.0. Please let us know if you experience more issues.