Closed edgar-costa closed 3 years ago
Can you elaborate on what "does not work" mean? Can you supply a complete program and an STF file with input/output packets that exhibits this problem?
Yes, a complete program exhibiting the problem would be helpful, if you are able to share one.
Just a wild guess that the symptoms you describe sound similar to this issue: https://github.com/p4lang/p4c/issues/1479
which I closed because the root cause seems to be this one: https://github.com/p4lang/p4c/issues/1514
Checking the arguments to the resubmit/recirculate/clone/etc. operation in the BMv2 JSON file to see if they are the fields you expect, or perhaps optimized-to-constants values instead (which does not work in BMv2) is one way to identify that the problem is in p4c.
By saying "does not work" I am talking about the main issue. The metadata fields do not get copied. Something that did work 3-4 months ago.
I will update the program with the problem so to be fast (but i did not include any STF file..)
If you run it with a bmv2 with debugging enabled you can see that the counter I track will be always 0, whereas if you recirculate using: recirculate({meta.count_struct.counter});
(line 137 which is commented), the metadata field will be copied and thus the counter will increase.
At least with the version of p4c I tried compiling that program with (the latest source code as of 2019-Jan-09), using the command p4c --target bmv2 --arch v1model issue1669.p4
(after renaming it to issue1669.p4), I looked at the BMv2 JSON, and the parameter to the recirculate operation there had been constant-propagated by the compiler to be a hard-coded value of 0x1, not a reference to the variable meta.count_struct.counter
. BMv2 simple_switch
neither complains about such BMv2 JSON files, nor will it preserve a metadata field that it does not know the name of in the recirculate operation.
This looks like the same behavior as reported in #1479 to me.
Sorry for making you rename the file... but github did not let me upload .p4
extensions.
Not sure if useful, but I can tell you that with a bmv2 from 2018-11-Sep and p4c from 2018-06-Sep this was not happening.
By the way, note that in the example in which this does not work I am recirculating the entire struct: recirculate(meta.count_struct);
and not the counter. When using the counter it does get copied.
For the moment the work around I found is or either recirculate all the meta fields (that seems to work), or create a tuple with all the individual fields you want to keep.
Someone like @mbudiu-vmw knows much better than I exactly what changes to p4c might change the results here, from one commit to the next. As another wild guess, recent changes involving flattening structs in internal front/mid-end passes of the compiler could perhaps have caused the change in behavior you are seeing.
The root cause is #1514, which means that the current implementation of all of the operations resubmit, recirculate, clone, and clone3 in P4_16 + v1model architecture programs, when asked to preserve metadata, is very fragile to precisely what optimizations the compiler makes in your particular source program.
It one would like to disable the recently developed flatten midend passes, please edit your backend's midend.cpp to remove the .h files and the flatten passes. Rebuild p4c and see if you still see the error.
For example, remove the following lines (and relevant .h files) from p4c/backends/bmv2/simple_switch/simple_switch/midend.cpp
new P4::FlattenHeaders(&refMap, &typeMap),
new P4::FlattenInterfaceStructs(&refMap, &typeMap),
Thinking about this particular example program a little bit more, there could be a "chicken and egg" optimization problem with p4c here.
It is replacing an occurrence of meta.count_struct
with a constant, based on the assumption that the field is initially 0. It is always 0, when a new packet arrives, but it can be non-0 precisely if/when a packet is resubmitted, recirculated, or cloned while preserving this metadata field from an earlier processing pass.
So the compiler is optimizing assuming the initial value of 0, in a program that specifically is asking for the value to be preserved for some packets.
A more conservative approach by the compiler would be to never assume that the initial values of metadata are 0. That would hinder optimizations when no recirculate/resubmit/clone operations are done by a program, I know, so it isn't the ideal solution.
A more precise kind of optimization based on initial metadata fields being 0 would be something like:
You could get even more fine-grained by tracking the above condition on a per-metadata-field basis. I am not necessarily recommending that, or even the refinement described in this comment. Just brainstorming here.
The metadata is an in
parameter, so it can have any value the environment chooses to give it.
We cannot assume it to be 0 ever.
@jafingerhut : I think that 1 is the id of the field list describing the fields that are recirculated. So there is no constant folding.
The json generated looks fine to me.
Wow. Sorry if I completely messed up my analysis of this issue above. Let me take a closer look.
OK, the BMv2 JSON does not have the problems I indicated above. It does generate a field_list, and refer to that field_list in the recirculate primitive operation, as Mihai said. Thanks for the correction to my mistake there, Mihai.
Attaching the issue1669.json.txt file that I generated using the version of p4c
generated from the latest source code as of 2019-Jan-09.
However, the field_list in the BMv2 JSON file that is used by the recirculate primitive contains only a temporary variable generated somewhere in the compiler or its bmv2 back end, I don't know where, which is not the original metadata field that the recirculate operation in the P4_16 source code says should be preserved.
It is as if this code:
recirculate({meta.count_struct});
is being replaced by the compiler with this code:
bit<8> tmp = meta.count_struct.counter;
recirculate({tmp});
That preserves the value of tmp
across the recirculate operation, but does not preserve the value of meta.count_struct.counter
across the recirculate operation.
In the attached file issue1669.json.txt, search for "field_lists"
to see the definition of the one field_list there, and search for "recirculate"
to see the one recirculate operation in there, and the assign
statement just before it that does the copying to tmp
described above.
issue1669.json.txt
It still seems to me that the root cause issue is #1514 -- the mismatch between P4_14 field_lists being explicitly represented in BMv2 JSON, as "references" or "names" of things, vs. the p4c P4_16 compiler treating the last operand to recirculate and similar operations as a list of current values.
I guess recirculation was never really implemented properly. It was a coincidence that it was working.
This is not really the same as #1514, at least not as the problem is described in #1514. There are two different issues here:
The first issue is easy to solve in p4-16 by keeping a copy of the data. The second one is harder; it does not really match the semantics of P4-16.
Here is one way this could be done, although it's not as pretty as the p4-14 solution.
We add a function recirculate_receive<T>(out T data);
to v1model.p4. This function must be invoked first thing in the start state of the parser if the user wants to recirculate. It copies the data that is recirculated in the specified values. The user can call recirculate<T>(in T data)
anywhere. The compiler checks that the two types T must be the same. However, the fields sent and received can be different. The compiler will make sure to compile this to an implementation that bmv2 can support.
The compiler will generate this code for p4-14 programs, so these should be automatically taken care of.
The same treatment will have to be performed for resubmit. We could reuse the same function to receive, we could call it metadata_receive
instead, assuming that one cannot do both resubmit and recirculate.
Maybe change metadata_receive
to recycle_receive
. I am used to a router data plane that uses a recycle queue for more packet processing.
But are all these changes really needed? (saying it from the ignorance). But, this used to work few months ago.
Adding that metadata_receive
would require a change in all the P4 programs that use resubmit and recirculate, in the spec, and does not seem very intuitive to newcomers.
Not for p4-14 programs, only for p4-16 programs. It was working by accident.
As I think I mentioned in the comments on #1514, any such suggested changes need to take into account the auto-translation from P4_14 programs, where the values of the metadata fields to be preserved are not the current values at the time the call is made, but the values at the end of the current ingress or egress pipeline in which the operation is invoked.
Also, the P4_14 program can have more than 1 recirculate operation in the source code, and more than one of them can be executed during the same packet's egress processing. Only the last such call's field list should actually take effect. Similarly for multiple resubmit operations, or multiple clone3 operations.
Keeping P4_14 delayed-effect-with-name-references semantics, P4_16 copy-value-in-now semantics, BMv2 behavior, and auto-translation from P4_14 to P4_16+v1model architecture, all at the same time, is doable, but unless I'm missing how to do it as a small fix, isn't a small fix.
The main question is how to express this behavior in P4-16. For example, using the scheme I proposed we cannot handle multiple recirculate calls unless they all have the same fields. Perhaps for this case we should allow recirculate_receive to have the union of all fields. The ones which haven't been written out will have unspecified values.
There is no need for a recirculate_receive operation in order to make recirculate work for P4_16+v1model that I can see, because v1model makes metadata an inout
parameter to the ingress parser.
parser Parser<H, M>(packet_in b,
out H parsedHdr,
inout M meta,
inout standard_metadata_t standard_metadata);
BMv2 JSON can already assign values to fields of meta
when a packet is resubmitted or recirculated, before it begins parsing for the next time.
The reason why this bug was filed is because BMv2 JSON is being told the wrong names of fields to preserve (auto-generated temporary names, not user-defined names).
I guess I should say that I can see a way that potentially a new recirculate_receive operation could assist in making recirculate work in P4_16+v1model and preserve metadata. However, if you want it to handle an arbitrary subset of fields of the meta
Parser parameter, it seems like that is already implemented for you in BMv2 JSON, but only if the names of the fields given match the names of the fields in meta
. No fields outside of meta
and standard_metadata
need to be supported (because the P4_14 spec says that you cannot specify packet header fields in the list of fields to preserve for resubmit, recirculate, or clone, if I recall correctly).
This is an important problem, so we should solve it in a good way. I will post several messages. First, P4 is designed such that one can "substitute equals for equals" without changing the meaning of a program. That means that from the compiler point of view the following are always equivalent, irrespective of the context:
void f(in bit value);
f(x);
and
bit tmp = x;
f(tmp);
As such, you can see why the compiler does not implement recirculate
in the way you would expect.
The way recirculate is used in p4-14 is really "call-by-name", where the name that is passed as an argument is important. In P4-16 we only have copy-in/copy-out, which is a form of call-by-value really. The name does not matter, only the value.
There are two ways to implement the semantics of recirculate from p4-14. The first one is to do explicit reading of values from the outside world in the parser, and an explicit writing of values to the outside world in egress. So the P4-14 program
recirculate(f);
Would be translated to
void recirculate_input<T>(out T data);
parser p() {
state start {
recirculate_input(f);
...
}
}
control egress() {
apply {
recirculate(f);
}
}
The second alternative is the way PSA does it: you have an additional metadata struct for recirculated data.
// architecture file
parser p<H, M, R>(out Headers h, inout M user_metadata, in R recirculated_metadata, inout standard_metadata)
// you can access the recirculated metadata at anytime
control egress<H, M, R>(..., out R recirculated_metadata, ...)
// you recirculate just by writing something to the recirculated_metadata, and perhaps calling recirculate() with no arguments
We can actually implement this solution in v1model as well, by providing an alternate definition of the switch architecture which has one more argument in each of the blocks. If the user wants to recirculate they will have to instantiate the architecture with recirculated metadata. Otherwise they can continue to use the old V1Switch.
I am only interested in solutions which do not break any of the existing P4-14 and P4-16 programs. We can make both of these work with a little more work in the p4-14->p4-16 translator and the simple_switch back-end. Please chime in with proposals and comments.
I cannot yet think of a strong motivation for one of the two approaches you describe over the other.
The more PSA-like one would presumably also need additional arguments for resubmit metadata, and clone metadata, with extra parameters for each one? In that case, yes, it does start to look like PSA, and a potential advantage of that is that it gets the compiler and BMv2 that much closer to implementing PSA.
Perhaps one way to generate comments/discussion is to add this to the agenda for the next language design work group meeting, and be sure potential attendees are aware of the size of the change this could represent.
I have added it to the LDWG table of things to discuss.
Mihai, in earlier comments you say these things: 'P4 is designed such that one can "substitute equals for equals" without changing the meaning of the program', which I think is intended to mean "substitute expressions for other expressions as long as those expressions evaluate to equal values". I suspect this gets into technical areas of programming language semantics which I am not fully educated on to discuss, but doesn't that conflict with "call-by-name" semantics?
Perhaps a more concrete way of asking the question is: "Does p4c compile P4_14 programs that use field lists 'by accident' rather than 'by design', because p4c's implementation assumes throughout that expressions can be substituted for other expressions if they have equal values?"
Or: What stops p4c from replacing a name in a P4_14 field list with another expression that has an equal value?
And answering the last of my previous questions, at least for the latest p4c as of 2019-Jan-18, the answer appears to be "nothing".
With the attached P4_14 program and the BMv2 JSON file produced via the command:
p4c --std p4-14 resubmit-bad-field-list-p414.p4
it exhibits two bugs I have seen in similar P4_16 programs (or perhaps it should be considered one bug -- not sure):
standard_metadata.egress_spec
is replaced with the constant number 3 in the attached BMv2 JSON file)["scalars", "tmp"]
and ["scalars", "tmp_0"]
as fields in the field list of the BMv2 JSON file, instead of the two fields of the mymeta
user-defined metadata.resubmit-bad-field-list-p414.p4.txt resubmit-bad-field-list-p414.json.txt
If you get the now-latest version of repository https://github.com/p4pktgen/p4pktgen it includes an updated version of the Python program tools/bmv2-json-check.py that will warn about the following kinds of problems in a BMv2 JSON file that you give it as a command line argument:
Any recirculate, resubmit, or clone operation that references a field list that contains one of the following kinds of fields in the BMv2 JSON file (the P4 source program is not used to determine this at all):
BMv2 simple_switch itself does not seem to make any such checks, and silently accepts such BMv2 JSON files.
I do not know if this is error-prone nonsense of an idea, or not, but perhaps one way to correctly compile the P4_14 field lists, at least in the cases when they are used in recirculate, resubmit, and clone operations, is to create special "name reference" objects in the compiler, that are only equal to themselves, and thus no equal-for-equal substitutions are possible for them? Or perhaps the only equal-for-equal substitutions would be for whole vs. parts, e.g. replacing a struct with a list of its named members.
There is no need to do this for field lists used in other contexts that I know of, e.g. digest message generation, or checksum/hash calculation, because in those cases it really is the value of the fields that matters, not the name references.
There is no such construct in p4-16. If we want to write recirculate in p4-16 we have to do it within the language.
I agree that what I spoke about in my last several comments here are not needed for p4-16.
Is it worth a separate issue to consider it for using p4c as a p4-14 compiler?
The compiler converts all p4-14 programs to p4-16. We want to deprecate p4-14, so we must be able to do this for all p4-14 programs.
I am trying to prototype a fix. I am looking at the v1model-special-ops-bmv2.p4 program that @jafingerhut wrote. This program recirculates the standard metadata. It looks to me like this should not be legal in general - the standard metadata is set by the environment and it describes the packet that is being processed. You should not be able to alter it arbitrarily.
If this assumption is correct, and we can only recirculate user metadata, then we may have an even simpler way to implement recirculation in v1model: the user adds an annotation on the metadata fields which have to be recirculated. When calling the recirculate()
function - with no arguments - these fields are automatically recirculated. We don't need to change anything in the v1model.
struct user_metadata {
bit<32> a;
@recirculate
bit<32> b;
}
...
control egress(... inout user_metadata u, ...) {
apply {
recirculate(); // recirculates the labeled fields
}
}
BTW: we could do the same trick to the PSA, simplifying its definition.
Note that the semantics would be closer to the P4-14 implementation, since recirculation would be done with the values at the END of the egress pipeline. In fact, calling recirculate()
merely sets a flag to cause the recirculation to happen.
I like this alternative, it looks very simple.
I agree that perhaps a P4 program should be prohibited from attempting to preserve standard/intrinsic metadata on a recirculate/resubmit/clone operation. At the very least, enabling it would seem to require specifying which fields could be preserved, and which could not, because preserving a field like instance_type that is intended to indicate whether a received packet was recirculated, resubmitted, or is new, sounds very error prone to put into the hands of the programmer.
Mihai, it sounds like what you are proposing is that for a P4_14 program with multiple resubmit operations, the user must only be allowed to specify one field_list, not a different field_list for each of the operations?
Or that when translating a P4_14 program with multiple different field_list's used by different resubmit operations, that the compiler will auto-translate that to a P4_16 program that specifies preserving the union of all fields in all such field_lists?
Others should add their feedback to such an idea, besides me, but my first reaction on hearing the idea is that it does simplify things for the programmer, at the potential cost of making it more difficult for the compiler to minimize the amount of state that must be preserved for a packet. The P4_14 and PSA approaches allow the programmer to help the compiler minimize such state, if they are willing to go to the effort of specifying multiple field_lists (for P4_14) or multiple independent sets of state to preserve (in PSA).
Yes, the compiler would compute the union. I don't think this is necessarily more wasteful in practice - in hardware implementations you have to allocate "wires" (or packet header slots) to carry any field the user may want, even if you do not carry these fields in all recirculate operations. If you multiplex fields on the same wires (or packet header slots) then you have to generate circuitry to demultiplex as well.
Moreover, I expect most programs will have 1 recirculate/resubmit/clone call in the entire program.
@jafingerhut : maybe if you have some time you can help craft a test case like the one you already wrote but which recirculates only user metadata.
Mihai, another thought on your idea -- For P4_16 programs, recirculate goes from egress back to ingress, and the ingress and egress user-defined metadata can be defined independently, and in general need not have any field names in common. I agree it is likely good coding practice not to name the same value with multiple different names when you can easily make them the same, but in terms of your proposal, unless we require the names of preserved fields to match between ingress and egress user-defined metadata, the annotations you propose do not contain enough information to specify the desired preservation behavior.
That is an issue at least for recirculate (from egress, back to ingress), and ingress-to-egress clone (the other direction).
This comment is I guess really only an issue for PSA, not necessarily for P4_16+v1model arch, where the user-defined metadata is currently required to be identical between ingress and egress.
I actually think this is not a real problem. In general, if a field is not recirculated its value will be garbage. So you will just get more fields than expected in some cases, but you probably will ignore them. The same argument applies to using the union of the fields.
We could also have two different annotations: @recirculate
and @resubmit
, but I also think that in practice one will be sufficient. A field can have any number of annotations.
Finally, we could also define the semantics of recirculate()
to recirculate all of the user metadata if there are no annotations at all.
I have conversed with a few P4 customers and they tell me, do not use any annotation in the P4 program unless there is no alternative. The backend has to be intelligent. Thus, I would design explicit data structs, flags, and use the API's for clone, clone3, resubmit etc. to track input user metadata. For a general case, where ingress and egress user metadata are defined independently, the pipeline design (section 4 of P4-16 spec) ties the two together.
control pipeX (user_metadata ingress_m, user_metadata egress_m, ...) {
}
Mihai, suppose the ingress user-defined metadata and egress user-defined metadata in a P4_16 PSA arch program are completely different, and you put @recirculate annotations on 5 of the fields in each of them. How is the compiler supposed to match up the 5 egress fields to the 5 ingress fields? By field name?
This applies to
resubmit
and probably toclone
(Did not try).Metadata fields are not copied to the recirculated packet unless you copy the entire metadata struct or you create a tuple of individual fields. For example if you have:
The metadata will be copied in the following cases:
However, I remember (and just tested) that with older versions of bmv2/p4c copying only some structs from the main metadata worked. And the following does not work anymore:
Is this a bug or it's an intended behaviour ? I personally think that the previous way made it quite easier, for example when you want to copy an entire structure that has 10s of fields.