Closed smolkaj closed 2 years ago
Please find a small program reproducing the issue attached. issue_2875.p4.txt
To elaborate on my initial report, I believe that the issue is that p4c introduces new temporaries, causing BMv2 to preserve only those temporaries (but not the original fields) when cloning with clone3
. This will brake code that accesses the original fields in the clone, since they will be 0.
This is actually a fundamental bug in v1model: https://github.com/p4lang/p4c/issues/1514 The semantics of P4-16 in fact does not allow such operations to be implemented at all. The fact that they worked was an accident of the implementation. The compiler should give warnings if you attempt to use them. We have attempted to fix them in various ways, by changing the architecture definition (e.g., https://github.com/p4lang/p4c/pull/1698), but in the end the agreement is that the solution is to migrate to PSA.
But, of course, PSA is not ready for consumption yet, even two years later.
Thanks for the explanation and pointers, Mihai.
Do you have any suggestions what we might do in the meantime? Is there some kind of (imperfect) workaround, either in the program or one we could easily implement in the compiler?
The fact that they worked was an accident of the implementation.
How hard would it be keep this accident around for the moment? Would that be feasible?
Then you just need to revert the patch you pointed to. I don't know how many patches after that depend on those changes.
The fact that they worked was an accident of the implementation.
How hard would it be keep this accident around for the moment? Would that be feasible?
This is actually very unpleasant. This is really a fundamental bug in the design of v1model which we didn't catch early enough. I re-read the solution proposed in https://github.com/p4lang/p4c/pull/1698, and I think it is actually a decent solution. But it certainly it is not backwards compatible. If you need the functionality of clone3/resubmit/recirculate and you are willing to rewrite some of your programs we could do the following:
In fact, I do not see many downsides to doing this. The design committee decided not to fix this bug on the assumption that PSA would take over v1model. But perhaps we should no longer wait for PSA and do this instead.
To summarize the solution proposed in #1698, you would annotate metadata fields that you may want to preserve with annotations like @field_list(1)
. Then in the recirculate/resubmit/clone calls the third argument is not the list of fields but just the number you used in the annotation (1). That's all you need to do.
For the fix we would create new functions recirculate_list/resubmit_list/clone3_list functions in v1model.
It would be great to discuss this at the September LDWG meeting if Steffen can attend that.
Thanks for all the background info and suggestions, I really appreciate it @mbudiu-vmw.
My only worry is that we would need a solution that works with mainline p4c, since forking/using an outdated version of p4c is not a viable long term strategy for us.
@jnfoster, yes, it would be great to discuss this at the LDWG metting. I'm planning to attend more regularly again going forward. Currently the next meeting is planned for Labor Day -- I am assuming it will be reschedule to not be on a holiday?
I don't see many downsides adding this extension to the standard p4c-bm2-ss and to v1model.
For future reference, the current state of affairs is also documented here: https://github.com/p4lang/behavioral-model/blob/main/docs/simple_switch.md#restrictions-on-recirculate-resubmit-and-clone-operations
Also for future reference, I think one can use clone
+ recirculate({})
(+ extra bookkeeping) to implement clone3
.
Maybe there is a nice theorem to be proven here.
https://github.com/p4lang/p4c/pull/2902 is an attempt to address this issue again. Please comment on the PR if you find it acceptable. Perhaps we should not change the signature of existing functions in v1model, but add new ones.
I believe https://github.com/p4lang/p4c/pull/2781 broke the clone3 primitive.
E.g., consider the following code:
BMv2 config before:
After:
I will try to make this more reproducible, but I was hoping that an expert like @mbudiu-vmw or @antoninbas might be able to see if there is an issue or not even with what I have shared so far.