p4lang / p4runtime

Specification documents for the P4Runtime control-plane API
Apache License 2.0
142 stars 88 forks source link

Add per color counter data for direct meters. #370

Closed kishanps closed 2 years ago

kishanps commented 2 years ago

@jafingerhut @antoninbas This is a first draft of the p4runtime proto changes we discussed here

smolkaj commented 2 years ago

Feedback from @antoninbas: Also add counters for indirect meters.

jfingerh commented 2 years ago

Just a quick note: If your goal is to enhance bmv2 and the P4Runtime API so that it more closely matches a particular model of switch ASIC, then I have no objections. I mean, you can completely rework bmv2 and P4Runtime API so it matches exactly what switch vendor X provides, and has anywhere from minor to big differences with any other switch ASIC in the world. I am not saying that these changes alone are going to push you into huge differences. I am saying that if your goal is cross-switch compatibility, then tweaks like this are potentially nudging you further from that goal, not closer to it. If you want to be sure, you should be checking these kinds of changes with other switch ASICs you may be interested in. If you ignore that, then you could reach a place where you have to significantly rework things for compatibility later (if that is part of the goal -- it may not be).

smolkaj commented 2 years ago

Thanks for raising these important points, @jfingerh.

With PINS, we explicitly want to build a switch stack that is not tied to a particular vendor. So we definitely want to avoid "overfitting" to a particular ASIC accidentally.

I cannot speak to the capabilities of various ASICs, since that is not my expertise. FWIW, the counters we're modeling here are supported by SAI. I take that as an indication that this capability is likely supported by many vendors. This may not be a sound inference? If someone knows more on this subject, please do share.

Also, note that our proposal (not part of this PR yet, to be added) would be that supporting these counters is optional. So from a P4Runtime perspective at least, there should hopefully be no concerns that this change will make it harder to build P4Runtime standard-compliant targets.

This raises a deeper, more general question though (that you may have been alluding to, and that I have been pondering about): how open should we be to adding features to P4Runtime for which support is optional? At one extreme, you could imagine to disallow optional features altogether, and make P4Runtime the intersection of features supported by all vendors.
At the other extreme, you could imagine to liberally allow optional features, and make P4Runtime the union of features supported by all vendors.

The latter extreme makes it really easy to extend the standard and to build standard compliant switches, but also make the standard almost useless because it is so liberal that it makes barely any guarantees. The former extreme is more desirable from a cross-vendor compatibility & testing perspective, but in its extreme form would be so limiting as to barely support any features. Navigating these two extremes is really tricky. Are there any guiding principles that can help us with this, or that maybe the WG has relied on in the past?

kishanps commented 2 years ago

Echo @smolkaj comments, this particular counter is supported by SAI and we would expect a SAI compliant vendor to have this in their ASIC.

kishanps commented 2 years ago

@jfingerh, @antoninbas Gentle reminder.

kishanps commented 2 years ago

Addressed review comments.

kishanps commented 2 years ago

Resolved open comments, PTAL.

antoninbas commented 2 years ago

@kishanps CI is currently failing:

The generated Go files are not up-to-date
You can regenerate them with './codegen/update_go.sh' and commit the changes
kishanps commented 2 years ago

Thanks @antoninbas for pointing out the script, builds passing now.

kishanps commented 2 years ago

Gentle reminder.

jfingerh commented 2 years ago

"The remaining big question is whether returning UNIMPLEMENTED is acceptable (technically a breaking change) or not."

So here is a question I love to ask to people who claim to advocate semantic versioning, to find out how much they actually believe it: Are you willing to bump up the major version number of the P4Runtime spec to 2.0.0 in the next release if this change is included?

If yes, then I have no objections to backwards-incompatible changes at all. Having a brief note in the change log for the revisions explaining what the known backwards-incompatible changes are would be a nice reference, for those deveopers who want a summary of what those backwards-incompatible changes are.

If no, then why not?

(And in case you are wondering why I am even asking about this, the top level README for the P4Runtime specification says this: "We will use semantic versioning to track changes to the P4Runtime specification: major version numbers track API-incompatible changes; minor version numbers track backward-compatible changes; and patch versions make backward-compatible bug fixes.")

kishanps commented 2 years ago

The latest builds are failing, comparing a good and failed run, looks like the environment has changed. The error seems to be happening on a timeout where an input is expected for configuring tzdata, whereas the previous success runs did not setup tzdata at all. @antoninbas Any recent changes on the build side ?

Setting up tzdata (2021e-0ubuntu0.20.04) ...
debconf: unable to initialize frontend: Dialog
debconf: (TERM is not set, so the dialog frontend is not usable.)
debconf: falling back to frontend: Readline
Configuring tzdata
------------------
Please select the geographic area in which you live. Subsequent configuration
questions will narrow this down by presenting a list of cities, representing
the time zones in which they are located.
  1. Africa      4. Australia  7. Atlantic  10. Pacific  13. Etc
  2. America     5. Arctic     8. Europe    11. SystemV
  3. Antarctica  6. Asia       9. Indian    12. US
Geographic area: 
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
antoninbas commented 2 years ago

@kishanps thanks for letting me know, I'll fix it right away. We indeed recently updated the base OS for CI from Ubuntu 16.04 to Ubuntu 20.04, which is causing this.

kishanps commented 2 years ago

@antoninbas I resolved the 2 proto generated files to my best knowledge but my merge commit build failed.

In my local repo, I did a git pull --rebase, bazel build under the proto dir and then codegen/update.sh. It generated a bunch of file changes (even in p4data which I didnt touch in my PR). And most of the changes seem like they were reversing the protoc changes you did in your commit last week (see below). Am I missing something ?

git diff go/p4/v1/p4runtime.pb.go
diff --git a/go/p4/v1/p4runtime.pb.go b/go/p4/v1/p4runtime.pb.go
index 52f850f..6b86f4b 100644
--- a/go/p4/v1/p4runtime.pb.go
+++ b/go/p4/v1/p4runtime.pb.go
@@ -15,7 +15,7 @@
 // Code generated by protoc-gen-go. DO NOT EDIT.
 // versions:
 //     protoc-gen-go v1.26.0
-//     protoc        v3.18.1
+//     protoc        v3.6.1
 // source: p4/v1/p4runtime.proto

git status
    modified:   go.mod
    modified:   go/p4/config/v1/p4info.pb.go
    modified:   go/p4/config/v1/p4types.pb.go
    modified:   go/p4/v1/p4data.pb.go
    modified:   go/p4/v1/p4runtime.pb.go
    modified:   py/p4/config/v1/p4info_pb2.py
    modified:   py/p4/config/v1/p4info_pb2_grpc.py
    modified:   py/p4/config/v1/p4types_pb2.py
    modified:   py/p4/config/v1/p4types_pb2_grpc.py
    modified:   py/p4/v1/p4data_pb2.py
    modified:   py/p4/v1/p4data_pb2_grpc.py
    modified:   py/p4/v1/p4runtime_pb2.py
    modified:   py/p4/v1/p4runtime_pb2_grpc.py
antoninbas commented 2 years ago

@kishanps maybe try doing docker pull third-party:latest first. I'm assuming you still have the old version of the image from before.

smolkaj commented 2 years ago

Looks good from my side. @antoninbas @jafingerhut any other changes needed here?

kishanps commented 2 years ago

Thanks @smolkaj @antoninbas @jfingerh for all the help!

antoninbas commented 2 years ago

@kishanps Thanks for contributing this!