lukego / blog

Luke Gorrie's blog

Celebrating Intel and Mellanox for their open driver interfaces #22

Open lukego opened 6 years ago

Intel and Mellanox are leading the industry by openly documenting how to write device drivers for their network cards. Here we are taking a moment to appreciate the fine work that people at these companies have done to make the networking world a better place.

(This follows from our FOSDEM'18 talk where we were asked a question about how the community can encourage more vendors to publish their specifications. I am thinking a lot about this now! I figure that a good start is to celebrate the companies who are already doing this.)

Intel

The best thing about Intel NICs is that Intel publishes extremely thorough documentation on its website for everybody to see. Everybody has complete access to the same information as Intel's own engineers. This lets independent developers like us build up our own confident mastery of the hardware.

I don't know exactly when and why Intel decided to publish this information but I am very grateful that they did. Snabb and many other projects were made possible because Intel had already published complete specifications for their hardware at a time when nobody else in the industry did so.

The highest compliment that I can pay to Intel is to say that I wrote several drivers for their NICs without them even knowing that I exist. I never contacted them because I could always solve my problems using the documentation provided. This makes their hardware very attractive to anyone who wants to be able to solve problems themselves without depending on vendor support organizations ("hardcore debugging heaven" vs "conference call hell"?)

Intel are also very explicit about the performance characteristics of their cards. 10G cards should do line-rate with 64B packets, 40G cards with 128B packets, and 100G cards with 256B packets. If you reach this level then you know that your driver is working correctly. If you don't then you know there is a problem that you can fix to improve performance. Being able to confidently reason about hardware performance is absolutely priceless.
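To make those targets concrete, here is a rough sketch of the packet rates they imply. It assumes standard Ethernet framing, where every frame costs an extra 20 bytes on the wire (preamble, start-of-frame delimiter, and inter-frame gap) and the quoted packet size already includes the CRC. The function and constant names are made up for illustration and do not come from any Intel data sheet.

```python
# Per-frame cost on the wire that is not counted in the frame size:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def line_rate_pps(link_bits_per_sec, frame_bytes):
    """Packets per second needed to saturate a link with frames of this size."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bits_per_sec / bits_per_frame


# (link speed in bits per second, packet size in bytes) for the three targets.
targets = [(10e9, 64), (40e9, 128), (100e9, 256)]

for rate, size in targets:
    mpps = line_rate_pps(rate, size) / 1e6
    print(f"{rate / 1e9:.0f}G @ {size}B: {mpps:.2f} Mpps")

# Prints:
# 10G @ 64B: 14.88 Mpps
# 40G @ 128B: 33.78 Mpps
# 100G @ 256B: 45.29 Mpps
```

If a driver benchmark lands well below the number for its card, that is the signal that there is something left to fix.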

Here is a list with some of Intel's famous NIC data sheets. These are the gold standard for describing the interface between a host and a network card:

Each data sheet is about one thousand pages long and completely describes the driver interface for one Ethernet controller, including all optional features (as far as I know). Some of the specifications are very similar and others less so. Intel maintain many separate drivers to cover this family (igb, ixgb, igbvf, i40e, i40evf, fm10k) while in Snabb we are incrementally adding support for all cards in a single unified driver (intel_mp.lua).

(Aside: I would love to work with Intel on defining a "lite" driver interface that could make a simple driver work consistently across all cards. If you work at Intel and like that idea then drop me a line and let's make that happen!)

Mellanox

Mellanox recently worked with Snabb and Deutsche Telekom to make their ConnectX network cards completely accessible to independent driver developers everywhere. I am very impressed to see how quickly and decisively Mellanox acted once they appreciated the position of independent developers who are striving to create self-sufficient applications.

(Special credit is also due to Normen Kowalewski, Rainer Schatzmayer, and their colleagues at Deutsche Telekom for demonstrating that the needs of small independent developers are also closely aligned with the needs of large network operators. The small fish, the big fish, and the vendors are all working in the same ecosystem and we are mutually invested in each other's success.)

The best thing about Mellanox NICs is that they define a consistent driver interface for all of their ConnectX products. The same driver can be used for 1G/10G/25G/40G/50G/100G and for ConnectX-4 and ConnectX-5 and future families. This simple design choice is amazing for application developers. We only have to develop one device driver and so we save a tremendous amount of effort. We can also bring support for new hardware to the market much more quickly by building on the support that we already have. (Great job, Mellanox designers! 👍)

The driver interface is specified in the Programming Reference Manual (PRM) and this can be used to write a short and sweet driver that is completely independent of the kernel, OFED, and DPDK.

The ConnectX card has more features beyond those described in the public edition of the PRM. The public subset does however include everything that we need for general purpose packet forwarding applications like Snabb. It also includes several details that I find especially clever and I will take this opportunity to appreciate one of them.

UARs

User Access Regions (UARs) are a simple and practical mechanism for sharing one NIC between many unprivileged applications. They accomplish this without depending on heavy-weight hardware features like SR-IOV and the IOMMU. The model is to simply place all of the registers for a given transmit or receive queue on a dedicated 4KB page of address space. Getting this page mapped into your process then serves as a capability to perform I/O on the queue: if you have the mapping you can do I/O and if you don't then you can't.
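The "one page, one capability" idea can be sketched in a few lines. This is only an illustration and not code from Snabb or the PRM: it assumes a hypothetical layout where page N of some mappable file holds the registers for UAR N, and `map_uar_page` is a name I made up. On Linux the file could be a PCI resource file under sysfs, but the exact BAR and offsets are things to look up in the PRM rather than take from this sketch.

```python
import mmap
import os

UAR_PAGE_SIZE = 4096  # one dedicated 4KB page per UAR


def map_uar_page(path, uar_index, page_size=UAR_PAGE_SIZE):
    """Map exactly one page of the file at `path` into this process.

    Hypothetical layout: page N of the file holds the registers for UAR N.
    Only the requested page becomes visible; neighbouring pages, and the
    queues behind them, stay out of reach of the caller.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        return mmap.mmap(fd, page_size, access=mmap.ACCESS_WRITE,
                         offset=uar_index * page_size)
    finally:
        # Python's mmap keeps its own duplicate of the descriptor,
        # so the mapping stays valid after this close.
        os.close(fd)
```

A process that holds the returned mapping can read and write that one page and nothing else, which is the whole trick: whoever is allowed to create the mapping decides who gets to do I/O on the queue. Note that the offset passed to `mmap` has to be a multiple of `mmap.ALLOCATIONGRANULARITY`, so the 4KB default only works as written on systems with 4KB pages.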

The design allows a privileged driver (e.g. kernel) to securely delegate direct DMA I/O access to specific queues to other smaller drivers running in unprivileged processes. This makes it easy to share the NIC between multiple independent applications and it only requires one process to have a "real" driver for initialization and provisioning.

This model provides options to application developers. You can write a complete driver from the ground up (that's what we did for Snabb) or you can write a thin driver that accepts queues delegated by the kernel (that's how DPDK and OFED work). This is excellent -- the best compromise is the one that you get to make yourself.

Check out the PRM for more details!