Open lukego opened 6 years ago
Also, the exact effects of DDIO are just a big mystery. All we know is that 10% of the cache is somehow special/reserved for DMA, based on something presumably similar to CAT. But the lack of PMU counters and documentation makes it really hard to say exactly how that feature can be modeled.
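To make the 10% figure concrete, here is a back-of-the-envelope sketch in C. It assumes the DMA share is a plain fraction of total last-level cache size, which is a naive model and not documented hardware behaviour. The function names and the sizes in the usage note are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes of last-level cache that DMA writes could land in, ASSUMING the
 * ~10% figure from the discussion applies as a simple fraction of the
 * total LLC size. Naive model only. */
static size_t ddio_bytes(size_t llc_bytes)
{
    return llc_bytes / 10;
}

/* Number of fixed-size packet buffers that fit in that share. */
static size_t ddio_buffers(size_t llc_bytes, size_t buffer_bytes)
{
    assert(buffer_bytes > 0);
    return ddio_bytes(llc_bytes) / buffer_bytes;
}
```

For a hypothetical 20 MiB LLC and 2 KiB packet buffers, `ddio_buffers` returns 1024. Under this model, a receive ring with more buffers in flight than that would start spilling to DRAM, but whether real hardware behaves this way is exactly what is unknown.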
Perhaps @qmonnet could comment on the feasibility of implementing easynic on NFP...
Hi there, implementing this on one of our smartNICs seems completely doable to us (/Cc @edwin-peer @kuba-moo). Apparently we have some firmware code not too far from this, but I don't believe any of it is publicly available at this time.
Is this just a theoretical question, or is someone actually considering an NFP implementation?
@qmonnet Are you coming to FOSDEM? I'll give a talk on the SDN track about NIC host-device interfaces using ConnectX as a case-study. Just after that might be a good time for like-minded people to meet up for a coffee and compare notes?
This is a very practical matter for Snabb hackers. There is no "hackers' favourite" NIC on the market for us to use anymore. Back in the 10G era we had the Intel Niantic/82599 but in the 25G-100G era it's a mess. Somebody needs to produce the new hackers' favourite and it's not clear who will do that yet.
I'll explain the long version in my talk in Belgium and also highlight the positive aspects of ConnectX. (We previously worked with Mellanox to get the ConnectX documentation published without NDA and then we used that to write our own device driver for Snabb. ConnectX has a nice consistent host-device interface. However, ConnectX has some limitations in silicon that are deal-breakers for many applications, see deep dives at snabbco/snabb#1007 and snabbco/snabb#1013, so I don't see it as the end of the story.)
I am sure that many NICs are being passed over for projects due to secret and/or lousy host-device interfaces. This has absolutely been the case on the Deutsche Telekom TeraStream project but also in many less visible projects.
[Sorry for the delay...] Thanks for those details. And yes, I'll be at FOSDEM, I also have a presentation in that devroom. Looking forward to catching up with you in February, then :).
@qmonnet I'm thinking a lot about the proposition of vendors making their hardware able to support the EasyNIC interface.
Like: Is this going to benefit the vendor? Will it help them sell hardware? Will it make support easier or harder?
And: How should a vendor do this? Implement EasyNIC themselves? Or release firmware-development documentation to help the community do it themselves? Or just open source their existing firmware with no support and hope for an "OpenWRT" situation where the community forks and extends it?
I don't know the answer. I think it would be really awesome to be able to buy Netronome NICs and load EasyNIC-compatible firmware onto them but I don't know whether supporting that is worth the trouble for Netronome (or whether the existing hardware/firmware makes a good substrate for this anyway.)
Just now the idea that's captured my imagination is to implement EasyNIC in Verilog starting with #15 and start with 10G/PCIe2 on FPGA. But it would be a long road for this to catch up with leading-edge NICs which are already moving towards PCIe4 and 200GbE and won't stop there. So if there were a good way to piggy-back on vendor silicon like the Netronome that might be much better.
@qmonnet One other reflection after FOSDEM (sorry we had so little time to chat!) is that it would not make sense for Intel or Mellanox to support EasyNIC. I have the strong impression that these companies are not in the NIC business at all. Their real goal is to lock users into a complex software ecosystem that they control (starting with DPDK and moving up from there into application-specific vertical integrations) and for this purpose it is important to make NICs as complex as possible to prevent independent developers from making things simpler.
I know they wouldn't choose these words but it's a model that fits their actions very well.
IMHO I see a few potential ways to draw market attention to EasyNIC:
Get an OS community to support it. It's probably easier to get the BSD community excited about this vs. Linux, as BSD dev culture is a lot more concerned with openness, code correctness, & simplicity. (again, IMHO...) There are plenty of Linux people that are concerned about openness, & others concerned w/code correctness, but I've met very few Linux folks that champion both the way most BSD people usually do. An open-source Windows driver is not out of the question either. Network-focused Linux distros, e.g. OpenWRT, might be more receptive vs. the general Linux community.
Get a cloud provider to support it. If using this is as easy as using e.g. AWS/GCP/VMware virtual NICs & hosting shops start integrating it then you're in pretty good shape.
Get a system integrator to support it. Anything from the Adafruits & Raspberry Pis of the world up to the big guys Lenovo/HP/Dell etc. Probably best to concentrate on corps either already present in the open networking space, or open-source & builder friendly companies.
Just my $0.02...
> It's probably easier to get the *BSD community excited about this
This is a really great point!
The push to make drivers simpler is not urgent for projects like Linux or DPDK because the vendors themselves write all the software to paper over their complex hardware. Smaller projects like Snabb and NetBSD and OpenBSD are doing more of the work ourselves and so we have a common need to make things simple and easy to support and maintain.
Hi, so just in case you haven't read it yet on Twitter, Netronome open-sourced their firmware (including WIP documentation) yesterday :).
@qmonnet That's huge! Looking forward to checking this out in great detail :)
We will need a realistic and accessible benchmarking setup to validate the design. For example we need to be able to experimentally work out what special accommodations the DMA design needs to make to the CPU regarding alignment etc (see #9).
How to do it? Here are a few ideas from hardest / most realistic downwards:
The last one seems very convenient. Has any research been done (e.g. to the PMU level) about how well L3 cache (e.g. array too large to fit into L2) works as a proxy for freshly DMA'd data on x86? (Maybe you guys have looked at this @emmericp?)
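For reference, the "array too large to fit into L2" proxy could be sketched roughly like this in C: a pointer chase over a randomly linked array, one slot per cache line, so that once the working set exceeds L2 most loads should be served from L3. All names here are hypothetical, the 64-byte line size is an assumption, and this only measures load latency for a given working-set size. It says nothing by itself about whether that matches freshly DMA'd data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* One slot per (assumed) 64-byte cache line. */
struct slot { size_t next; char pad[64 - sizeof(size_t)]; };

/* Small xorshift PRNG; state must be non-zero. */
static uint64_t xorshift64(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x << 13; x ^= x >> 7; x ^= x << 17;
    return *state = x;
}

/* Link n slots into one random cycle (Sattolo's algorithm), so a chase
 * from slot 0 visits every slot once and the prefetcher cannot guess
 * the next address. Caller frees the result. */
static struct slot *make_chain(size_t n, uint64_t seed)
{
    assert(n >= 2 && seed != 0);
    struct slot *s = malloc(n * sizeof *s);
    if (s == NULL) return NULL;
    for (size_t i = 0; i < n; i++) s[i].next = i;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)(xorshift64(&seed) % i);  /* j in [0, i) */
        size_t tmp = s[i].next; s[i].next = s[j].next; s[j].next = tmp;
    }
    return s;
}

/* Follow the chain for `steps` dependent loads; return the final index. */
static size_t chase(const struct slot *s, size_t steps)
{
    size_t i = 0;
    while (steps--) i = s[i].next;
    return i;
}

/* Average CPU time per dependent load, in nanoseconds. */
static double ns_per_load(const struct slot *s, size_t steps, size_t *end)
{
    clock_t t0 = clock();
    *end = chase(s, steps);
    clock_t t1 = clock();
    return (double)(t1 - t0) / CLOCKS_PER_SEC * 1e9 / (double)steps;
}
```

To use it, pick `n` so that `n * 64` bytes is larger than L2 but comfortably inside L3 on the machine under test, run one warm-up pass of `chase(s, n)`, and then time many steps with `ns_per_load`. Comparing that number against a run that fits in L2, and against PMU counters where available, would give a first indication of what the proxy actually exercises.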