luigirizzo / netmap

Automatically exported from code.google.com/p/netmap
BSD 2-Clause "Simplified" License

nm_open with range of queues #941

Closed cygnus2048 closed 9 months ago

cygnus2048 commented 11 months ago

Imagine I have 8 TX and 8 RX queues and I want to process them with 2 TX and 2 RX threads. The first thread should handle queues 0-3 and the second thread queues 4-7. It doesn't look like nm_open() will allow this with the current syntax, but can I just create my own version of nm_open() that populates the nm_desc structure with the appropriate first_tx_ring and last_tx_ring (and likewise for RX) values?

In essence, if nm_open supported a range, I would like to do this:

thread1_nm_d = nm_open("netmap:ens4-0-3", ...);
thread2_nm_d = nm_open("netmap:ens4-4-7", ...);

Thanks.

vmaffione commented 11 months ago

Hi,
It would not be enough to modify nm_open or nmport_open to support such a syntax. In the end, netmap open requests are submitted to the kernel by means of a struct nmreq_register, and there is no way to set up that struct to specify a range of rings. Furthermore, there is no kernel support for ranges (see netmap_interp_ringid).

However, you could achieve your goal (and potentially more efficiently than using ranges) with the following scheme:

cygnus2048 commented 11 months ago

I see. However, I guess I'm not understanding as much as I thought, since what you say is not possible seems to be exactly what happens now when I open the device in "all queue" mode with no suffix, as "netmap:ens4". That is a single fd with multiple rings, as well as a single ring_id, and it gets access to all the queues. The descriptor has "first ring" and "last ring" elements for both TX and RX, so with this in mind, how is ring_id even used?

My concern with using multiple fds is that in my benchmarking, most of the time is spent in the NIOCTXSYNC and NIOCRXSYNC ioctl() calls, and now my thread will need to loop those calls over multiple fds. Hopefully the total time spent will be roughly the same, since in the end it is the same number of packets being processed, but I don't know this for a fact.

Also, is there any benefit to having separate fds for TX and RX? I currently use a single one for all queues, shared by both the TX and RX threads, and I have not seen any contention. In other words, using this as opposed to your suggestion above...

Thanks.

vmaffione commented 11 months ago

Maybe the misunderstanding is about the "first ring" and "last ring" fields within the userspace library (libnetmap or the netmap_user.h structs). Those are just helpers that allow the programmer to write more general programs by scanning all the rings bound to a given netmap file descriptor. While those fields may lead you to think that they can assume any range of values, only some combinations are supported by the kernel.

Things you can do:

  1. open all the hardware rings (TX+RX, or TX only or RX only)
  2. open only the host rings (TX+RX, or TX only or RX only)
  3. open both hardware and host rings (TX+RX, or TX only or RX only)
  4. open just a single ring (TX+RX, or TX only or RX only)

See the NR_REG_* macros in sys/net/netmap.h.

Having different fds for different threads is useful to avoid unnecessary wakeups, and to selectively sync only a subset of the rings. If you open netmap:ens4 and have multiple threads using the same fd but accessing different rings, any TX or RX interrupt will wake up all the threads polling that fd, even if they only care about TX or RX wakeups. Also keep in mind that, by using poll(), a single thread can sync multiple file descriptors with a single system call.

cygnus2048 commented 9 months ago

Sorry, I forgot to reply, but thanks for your explanation and information. I was able to open fds for each TX and RX queue and just have my threads loop over them when I have fewer cores than queues. Thanks again. Closing this.