p4lang / behavioral-model

The reference P4 software switch
Apache License 2.0

Approximate conversion of bmv2 p4 performance to hardware performance #1234

Open RithvikChuppala opened 3 months ago

RithvikChuppala commented 3 months ago

I know that bmv2's performance is not production-grade and that many factors are hardware-dependent, but is there an approximate conversion factor or methodology for gauging how a P4 program's performance on the bmv2 software switch would translate to a production hardware switch? Something like clock cycles, CPU utilization, etc.?

jafingerhut commented 3 months ago

It really depends upon the production switch you have in mind.

For example, Tofino's hardware architecture is such that at a basic introductory level you can say it has the following performance model:

Now of course you can get more nuanced than that, by allowing P4 programs that explicitly recirculate packets, and have other operating points like this:

There are other hardware architectures where the performance will degrade more gradually than that, if you go "a little bit over" the budget of what can be done at X billion packets per second.

Some will have caches between the packet processing core and DRAM, and then cache hit rates play a huge part in the throughput and latency.
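As a rough illustration of the last two points, here is a back-of-envelope sketch in Python. It assumes a hypothetical run-to-completion packet processor in which each packet costs a fixed number of compute cycles plus some number of table lookups, and each lookup is served either from a cache or from DRAM. The function names, parameters, and all numbers are made up for illustration and do not describe any real device. The model also ignores latency hiding through multithreading or multiple outstanding memory requests, which real designs often rely on.

```python
def avg_lookup_cycles(hit_rate, cache_cycles, dram_cycles):
    """Expected cycles per table lookup for a given cache hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_cycles + (1.0 - hit_rate) * dram_cycles


def packets_per_second(cores, clock_hz, compute_cycles, lookups,
                       hit_rate, cache_cycles, dram_cycles):
    """Throughput of a hypothetical run-to-completion packet processor.

    Each packet is assumed to occupy one core for compute_cycles plus
    the expected cost of `lookups` table lookups, with no overlap.
    """
    cycles_per_packet = compute_cycles + lookups * avg_lookup_cycles(
        hit_rate, cache_cycles, dram_cycles)
    return cores * clock_hz / cycles_per_packet


if __name__ == "__main__":
    # Made-up operating point: 16 cores at 1 GHz, 100 compute cycles and
    # 4 lookups per packet, 10-cycle cache hit, 200-cycle DRAM access.
    for hit_rate in (1.0, 0.9, 0.5):
        pps = packets_per_second(16, 1e9, 100, 4, hit_rate, 10, 200)
        print(f"hit rate {hit_rate:.0%}: {pps / 1e6:.1f} Mpps")
```

In a model like this, throughput falls off smoothly as the per-packet cycle count grows or the hit rate drops, rather than hitting a hard limit, which is the kind of gradual degradation described above.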

Sorry I can't give you a more specific answer, but if you dive at least a bit into two different-enough hardware architectures, you will start to see more of the reasons that "it depends" is the correct answer.

RithvikChuppala commented 3 months ago

Thanks for the quick reply!

For my use case, I'm implementing packet-processing functionality to perform tunneling (stripping tunnel headers, adding new egress headers, etc.). I aim to show that executing this packet processing in a programmable switch improves throughput and latency compared to the usual software-based approach.

However, since bmv2 isn't an accurate representation of performance, what proxy metric for ideal hardware performance do you think makes the most sense?

jafingerhut commented 3 months ago

If you can, the best approach is to implement it on a real hardware device and measure the relevant performance metrics there.

If for some reason that is not possible, then the next best thing is to learn about some hardware device in enough detail that you can make a good educated guess at what the performance metrics would be.