Xilinx-CNS / onload

OpenOnload high performance user-level network stack

eflatency crash with AF_XDP on #170

Open wdebruij opened 10 months ago

wdebruij commented 10 months ago

commit 150c1507e354 ("ON-11184: add support for generic network devices with AF_XDP - PoC") removed special src/tests/ef_vi logic for AF_XDP.

First a small nit: the commit removed MODE_XDP ("-m x"), but the mode is still reported in usage.

The main issue we're running into is a NULL pointer dereference. Stack trace:

efxdp_vi_mmap_bytes
ef_vi_xdp_init_qs
ef_vi_init_qs
__ef_vi_alloc
ef_vi_alloc_from_pd
do_init
main

vi->evq_base is NULL at this point.

It is initialized from qmem, from mem_mmap_ptr in __ef_vi_alloc.

This pointer is only initialized if( ra.u.vi_out.mem_mmap_bytes ).

This variable is received from the kernel with ioctl CI_RESOURCE_ALLOC. It appears to be zero for AF_XDP.
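The failure chain described above can be modelled in a few lines of C. This is only a sketch, not Onload code: every name in it (`model_alloc_reply`, `model_vi`, `model_alloc`, `model_init_qs`) is hypothetical, and the choice of `-EOPNOTSUPP` as the error is an assumption. It shows how a zero mapping size leaves the queue base pointer NULL, and how a guard in the init step could turn the crash into an error return.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's reply to the allocation ioctl. */
struct model_alloc_reply {
  size_t mem_mmap_bytes;   /* 0 models what the report observes for AF_XDP */
};

/* Hypothetical stand-in for the user-level VI state. */
struct model_vi {
  char* evq_base;          /* stays NULL unless a mapping was set up */
};

/* Models the allocation step: the queue memory pointer is only set when
 * the reply carries a non-zero mapping size. `backing` stands in for the
 * region a real mmap() call would return. */
static void model_alloc(struct model_vi* vi,
                        const struct model_alloc_reply* reply,
                        char* backing)
{
  assert(vi != NULL && reply != NULL);
  vi->evq_base = NULL;
  if( reply->mem_mmap_bytes )
    vi->evq_base = backing;
}

/* Models queue initialisation with a guard added (an assumption, not the
 * existing behaviour): rather than dereferencing a NULL base, report that
 * the configuration is unsupported. */
static int model_init_qs(struct model_vi* vi)
{
  if( vi->evq_base == NULL )
    return -EOPNOTSUPP;
  vi->evq_base[0] = 0;     /* first touch of queue memory */
  return 0;
}
```

With `mem_mmap_bytes` set to zero, `model_init_qs` returns an error instead of faulting. With a non-zero size and a valid backing buffer it returns 0.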

Is this perhaps because AF_XDP uses host buffers rather than device buffers, allocated in deferred_vis -> af_xdp_init?

I have not yet fully traced the kernel path taken by this ioctl, or where AF_XDP initialization fits in.

sianj-xilinx commented 10 months ago

As you note, we (incompletely) removed the AF_XDP logic for ef_vi. We don't currently have plans to support ef_vi with AF_XDP, as we didn't expect this to be that useful compared to Onload support. I would expect that improving our Onload AF_XDP support to product quality is likely to remain a higher priority than extending the support to include ef_vi. Let's call the eflatency crash when you try to use it a reflection of the alpha quality of our AF_XDP support at the moment. We will need to avoid such ungraceful failures, though, as we improve our AF_XDP support to product quality.

wdebruij commented 10 months ago

Thanks for the clarification. That makes total sense. Then I won't spend time to make it work either.

EF_VI support with AF_XDP would only be a nice-to-have: it would give us an intermediate level at which to run (performance) tests, in between TCP/IP on Onload and raw AF_XDP sockets. But agreed that AF_XDP-related efforts are better directed towards the Onload TCP/IP stack.