lukego / easynic

EasyNIC: an easy-to-use host interface for network cards

Avoid individual descriptors #4

Open lukego opened 6 years ago

lukego commented 6 years ago

One possible consequence of optimizing PCIe efficiency (#3) is to avoid using individual descriptors for each packet that is transmitted and received.

If consecutive packets are streamed to and from large memory buffers, then two bytes of per-packet metadata may be sufficient, e.g. to indicate the packet length and the Ethernet FCS validity.
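To make the idea concrete, here is a minimal sketch of how a host could walk such a buffer. The layout is purely an assumption for illustration: a little-endian 16-bit word in front of each packet, with the top bit as the FCS-valid flag and the low 15 bits as the length, and a zero length marking the end of the written data. The names `pack_packet`, `iter_packets`, `FCS_OK_BIT` and `LENGTH_MASK` are hypothetical and not part of any EasyNIC specification.

```python
import struct

# Hypothetical metadata layout (an assumption, not an EasyNIC definition):
FCS_OK_BIT = 0x8000   # top bit: Ethernet FCS was valid
LENGTH_MASK = 0x7FFF  # low 15 bits: packet length in bytes


def pack_packet(payload: bytes, fcs_ok: bool) -> bytes:
    """Prefix a payload with the two-byte metadata word (device side)."""
    if len(payload) > LENGTH_MASK:
        raise ValueError("payload too long for a 15-bit length field")
    word = len(payload) | (FCS_OK_BIT if fcs_ok else 0)
    return struct.pack("<H", word) + payload


def iter_packets(buffer: bytes):
    """Yield (payload, fcs_ok) for each packet stored back-to-back (host side)."""
    offset = 0
    while offset + 2 <= len(buffer):
        (word,) = struct.unpack_from("<H", buffer, offset)
        length = word & LENGTH_MASK
        if length == 0:
            break  # assumed convention: zero length means no more packets
        start = offset + 2
        end = start + length
        if end > len(buffer):
            raise ValueError("truncated packet at offset %d" % offset)
        yield buffer[start:end], bool(word & FCS_OK_BIT)
        offset = end


if __name__ == "__main__":
    buf = pack_packet(b"abc", True) + pack_packet(b"defgh", False)
    for payload, ok in iter_packets(buf):
        print(len(payload), ok)
```

Note that no packet address appears anywhere: each packet's position follows from the lengths of the packets before it.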

This would be considerably more streamlined than the Intel and Mellanox approaches that typically require between 16 and 48 bytes of metadata for each packet. These scatter-gather designs are burdened with transferring the 64-bit address of each packet's individual buffer(s) and often with other non-essential metadata.
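As a rough back-of-the-envelope check, the share of transferred bytes spent on metadata can be computed from the figures above (2 bytes versus 16 to 48 bytes). This sketch assumes 64-byte packets as the worst case and deliberately ignores PCIe transaction overhead, descriptor write-backs and doorbells, so it only illustrates the trend. The helper name `metadata_share` is made up for this example.

```python
def metadata_share(metadata_bytes: int, packet_bytes: int) -> float:
    """Fraction of the bytes moved per packet that are metadata, not payload."""
    return metadata_bytes / (metadata_bytes + packet_bytes)


if __name__ == "__main__":
    packet_bytes = 64  # assumed small-packet worst case
    for metadata_bytes in (2, 16, 48):
        share = metadata_share(metadata_bytes, packet_bytes)
        print(f"{metadata_bytes:2d} B metadata: {share:.1%} of bytes moved")
```

For larger packets the gap narrows, so the benefit is mostly in small-packet workloads.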

emmericp commented 6 years ago

Did you have a look at the old NetFPGA DMA engine?

I believe it worked similarly to what you are suggesting (for receiving packets), but I don't recall the details, as it has been a few years since I last did anything with it.

lukego commented 6 years ago

I had a quick look at the 1G NetFPGA. My impression is that they have a separate ASIC handling the DMA descriptors (the "CPCI chip"), and it pushes/pulls 32 bits of data to/from the NetFPGA on each cycle without revealing which memory it represents. So the host looks like a "parallel port" to the FPGA.

Is that right? Is there a better place to look than the netfpga/netfpga repo?

Compatibility with off-the-shelf IP is potentially interesting/tricky. I'm curious what (if any) constraints the Xilinx DMA engine puts on DMA in terms of descriptor formats, etc. There can be some tension between simplicity of interface (avoiding unnecessary complexity) and simplicity of implementation (reusing IP).

emmericp commented 6 years ago

I was specifically referring to the DMA core on the NetFPGA 10G; the newer SUME works more like your typical NIC. No idea about the old 1G NIC.

I've only done some very high-level work with that NIC (capturing packets with OSNT) without ever touching the internals. But the framework basically mapped the DMA region directly into user space, and that region was just a large ring buffer with back-to-back packets, IIRC.
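For illustration, a ring buffer with back-to-back packets could behave roughly like the toy model below. This is not the NetFPGA layout, which I don't remember; the 2-byte little-endian length prefix, the `PacketRing` class and its `head`/`tail` counters are all assumptions made up for this sketch. The point is only that the consumer finds each packet from the previous packet's length, including across the wrap-around.

```python
import struct


class PacketRing:
    """Toy byte ring of back-to-back packets, each with a 2-byte length prefix."""

    def __init__(self, size: int):
        self.buf = bytearray(size)
        self.size = size
        self.head = 0  # total bytes written by the producer (the device)
        self.tail = 0  # total bytes consumed by the consumer (the host)

    def _write(self, data: bytes) -> None:
        for byte in data:
            self.buf[self.head % self.size] = byte  # wrap at the end of the ring
            self.head += 1

    def _read(self, count: int) -> bytes:
        out = bytes(self.buf[(self.tail + i) % self.size] for i in range(count))
        self.tail += count
        return out

    def push(self, packet: bytes) -> bool:
        """Append one packet; return False if the ring lacks free space."""
        if len(packet) > 0xFFFF:
            raise ValueError("packet too long for a 16-bit length prefix")
        free = self.size - (self.head - self.tail)
        if 2 + len(packet) > free:
            return False  # a real device would have to drop or stall here
        self._write(struct.pack("<H", len(packet)) + packet)
        return True

    def pop(self):
        """Return the oldest packet, or None if the ring is empty."""
        if self.head == self.tail:
            return None
        (length,) = struct.unpack("<H", self._read(2))
        return self._read(length)


if __name__ == "__main__":
    ring = PacketRing(16)
    ring.push(b"abcdef")
    ring.push(b"ghijkl")
    print(ring.pop(), ring.pop())
```

In a real design the device would own `head` and the host would own `tail`, with some mechanism for each side to learn the other's value.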

The relevant driver for the reference NIC is here: https://github.com/NetFPGA-NewNIC/linux-driver