cyanide-burnout opened this issue 4 years ago
Same problem for me. I fixed it in ixgbe.c; the changes must be applied to both the RX and TX structures:

- define `MAX_RX_QUEUE_ENTRIES` through a `#define` rather than a `const int`
- the `*virtual_addresses[]` field becomes `void* virtual_addresses[MAX_RX_QUEUE_ENTRIES]`
- remove the `+ sizeof(void*) * ...` part from the queue allocation

There are also other ways to solve this, but this is probably the most immediate one. BTW, shout-out to the authors for the nice work! A sketch of the change is below.
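For reference, here is a minimal sketch of what the change might look like. It assumes the queue struct in ixy's ixgbe.c roughly resembles the layout below; the descriptor/mempool types, the field names other than `virtual_addresses`, and the value 512 are just illustrative stand-ins:

```c
#include <stdint.h>
#include <stdlib.h>

// before: const int MAX_RX_QUEUE_ENTRIES = 512;
#define MAX_RX_QUEUE_ENTRIES 512

// stand-ins for the real descriptor/mempool types
union ixgbe_adv_rx_desc;
struct mempool;

struct ixgbe_rx_queue {
	volatile union ixgbe_adv_rx_desc* descriptors;
	struct mempool* mempool;
	uint16_t num_entries;
	uint16_t rx_index;
	// before: void* virtual_addresses[];
	// (a flexible array member, which is NOT counted by sizeof(struct ixgbe_rx_queue))
	void* virtual_addresses[MAX_RX_QUEUE_ENTRIES];
};

struct ixgbe_rx_queue* alloc_rx_queues(uint16_t num_queues) {
	// before:
	//   calloc(num_queues, sizeof(struct ixgbe_rx_queue)
	//                      + sizeof(void*) * MAX_RX_QUEUE_ENTRIES);
	// the array is now part of sizeof(struct ixgbe_rx_queue), so the extra
	// term has to go, otherwise the per-queue stride and the allocation
	// size would disagree again
	return calloc(num_queues, sizeof(struct ixgbe_rx_queue));
}

// the same change applies to the TX queue struct and its allocation
```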
Yes, this patch works very well. By using it we reached twice the performance of DPDK's ixgbe with fewer CPU resources.
> By using it we reached twice the performance of DPDK's ixgbe with fewer CPU resources.
That seems unlikely, can you elaborate on the exact comparison? The patch from the comment above basically turns a flexible array member into a fixed-size array, which, since we don't have any bounds checks anyway, should not make a difference for performance...
The only thing that I can think of is false sharing, but
After applying the fix I rewrote my code. It now transmits in multiple threads, with 2-4 queues per thread. It's a production system which can use different methods to accelerate transmission/reception of UDP. Since I wrote several backends providing transmission via optimised sockets, PACKET_MMAP, XDP, DPDK and ixy, we did some benchmark tests. @stefansaraev did the testing, so I would like to ask him to publish the results. But anyway, the performance of ixy is quite surprising, probably due to Intel's buggy ixgbe implementation and unnecessarily complicated code. At least I have found some bugs in their XDP implementation (https://github.com/xdp-project/xdp-tutorial/issues/273).
Your bug in two words: wrong interpretation of array access. Just imagine which address `rx_queue[1]` gets, for example: the size of `struct ixgbe_rx_queue` doesn't include the actual length of `virtual_addresses[]`, and `virtual_addresses` is not a pointer to an array somewhere else. Typical overlapping, where `rx_queue[1]` overlaps `rx_queue[0].virtual_addresses`.
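To make the overlap visible, here is a small standalone program (hypothetical, simplified field layout) that prints where `rx_queue[1]` ends up relative to `rx_queue[0].virtual_addresses` when the flexible-array layout is combined with an oversized per-queue allocation:

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define ENTRIES 512  // illustrative; the real constant is MAX_RX_QUEUE_ENTRIES

// flexible-array version of the struct (field layout is illustrative)
struct rx_queue {
	void* descriptors;
	uint16_t num_entries;
	uint16_t rx_index;
	void* virtual_addresses[];  // not counted by sizeof(struct rx_queue)
};

int main(void) {
	// the allocation reserves room for the flexible arrays...
	struct rx_queue* queues =
		calloc(2, sizeof(struct rx_queue) + sizeof(void*) * ENTRIES);

	// ...but indexing strides by sizeof(struct rx_queue) only, so queues[1]
	// starts right after queues[0]'s header, i.e. inside queues[0]'s
	// virtual_addresses region
	printf("queues[0].virtual_addresses starts at %p\n",
	       (void*) queues[0].virtual_addresses);
	printf("queues[1]                   starts at %p\n",
	       (void*) &queues[1]);
	printf("end of queues[0]'s reserved region:   %p\n",
	       (void*) ((char*) queues
	                + sizeof(struct rx_queue) + sizeof(void*) * ENTRIES));

	free(queues);
	return 0;
}
```

On common 64-bit platforms the first two addresses coincide: queue 1's header sits inside queue 0's address table, which is exactly why using more than one queue corrupts the other queues' state.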
I've fixed my comment; the correct explanation is there now. It's 01:30 AM here, I have to be in bed ;)
We have an issue when using more than 1 queue on ixgbe.
Stack trace: #1 `pkt_buf_free`, #2 `ixgbe_tx_batch`
Both queues are processed in a single thread; we tried a single shared mempool as well as a mempool per queue. The result is the same: the crash is caught on any queue other than #0.