p4lang / p4-spec

Apache License 2.0
177 stars 80 forks source link

P4_16 Should the order of annotations be defined to be significant? #715

Closed jafingerhut closed 11 months ago

jafingerhut commented 5 years ago

This question occurred to me while reviewing the P4Runtime API specification, where annotations are explicitly represented in a P4Info file that is auto-generated from a P4_16 source program and used to represent the control plane API of a P4 program.

It preserves information about most (maybe all) of the annotations from the source program.

The most relevant part of the P4_16 v1.1.0 language spec I have found so far is this statement in Section 18.1 "Key-Value Pairs in Annotations"):

"If a given key appears more than once, then the latest value will replace all earlier ones."

I understand that this statement applies to multiple key-value pairs inside of one annotation, but the next section also shows an example of a single instantiation of main with multiple @pkginfo annotations on it. It does not explicitly say what should happen if the same key appears in multiple _different @pkginfo annotations, but one would perhaps guess that the "latest value will replace all earlier ones" might apply in that situation, too. Stating so explicitly might be helpful.

In general, is it perhaps worth defining whether the order of annotations on the same P4 object is allowed to be significant, or can never be significant? Between those two choices, the example above suggests that perhaps it can be significant in at least some cases, and perhaps a blanket statement that P4Runtime P4Info should always preserve the order of annotations from the source code would help people rely on that fact?

antoninbas commented 5 years ago

One quick remark. The statement you quote:

"If a given key appears more than once, then the latest value will replace all earlier ones."

does not hold for the reference p4c implementation. p4c uses an IR::IndexedVector to represent the KV annotations. This data structure doesn't seem to support duplicates and a program with duplicate keys will be rejected at parse time (IR construction time).

mihaibudiu commented 5 years ago

In general we allow many annotations with the same name on an object. Keys in a key-value annotation should be unique, though. Since annotations have no semantics we do not impose too many constraints in the spec; for each annotation we may have specific constraints. E.g., it does not make sense to have two @name annotations on an object.

jafingerhut commented 5 years ago

Would it be reasonable to change this sentence in the latest P4_16 spec:

If a given key appears more than once, then the latest value will replace all earlier ones.

to something more like the following?

It is an error if a given key appears more than once.

@chrispsommers Any thoughts on such a change in the spec? In this case, the only reason I am proposing it is to more closely match the current implementation, but you may have objections based upon hopes or plans that the implementation would behave as the spec is currently written? (which I believe you wrote this section?)

mihaibudiu commented 5 years ago

yes, I think we should do that.

jafingerhut commented 5 years ago

I created a PR for that suggested change.

However, the following questions I mentioned in the original issue still remain:

In general, is it perhaps worth defining whether the order of annotations on the same P4 object is allowed to be significant, or can never be significant? Between those two choices, the example above suggests that perhaps it can be significant in at least some cases, and perhaps a blanket statement that P4Runtime P4Info should always preserve the order of annotations from the source code would help people rely on that fact?

@liujed Pinging you because if the purpose of free-form annotation values is for things like assertions, would you expect that if there are multiple such assertions on a single P4 object, e.g. an assignment statement, that the compiler would preserve the relative order from the source code in any intermediate forms it produces that preserve those annotations? Or would you be content if the compiler was free to reorder those annotations in its output, relative to the order they appear in the source code?

mihaibudiu commented 5 years ago

The compiler is the consumer (and sometimes producer) of many annotations. So I don't think we need to make any promises with respect to the order of annotations to P4 programmers. This would be relevant only if we used the compilers are source-to-source translators (which we can), but that's not part of the spec.

jafingerhut commented 5 years ago

But annotations are also part of the output of the compiler, e.g. for control plane API generation, and perhaps other tools that would make use of the new free-form value annotations that @liujed is proposing. If he is thinking "of course the relative order of annotations in the P4 source code should be preserved in the compiler output, because my new application needs it", and you are thinking "of course no one should rely on that order", then we have a disconnect where a few sentences in the spec might help to avoid trouble in the future.

mihaibudiu commented 5 years ago

The API generation is technically part of the compiler, and it only cares about a small number of annotations. We should follow the example of other languages with annotations, such as Java and C#. How annotations are processed should be specified on a case by case basis. Currently we don't have any use case where multiple annotations on an object matter - each annotation is processed separately.

chrispsommers commented 5 years ago

@jafingerhut I'm OK with your proposal about uniqueness of kv-pairs and also approved @antoninbas similar proposal for the P4Runtime spec. Regarding ordering, I'll defer to @mbudiu-vmw 's opinion which seems sensible.

liujed commented 5 years ago

I think we should aim for maximum flexibility of annotation semantics. Why not specify that there may be multiple annotations of the same kind on an object, and that the order of annotations on an object is allowed to be significant, even across different kinds of annotations? (Individual annotations, of course, may refine this. For instance, the @name annotation might disallow multiple uses on the same object.)

My specific use case of assume and assert annotations is an example where order matters, and where multiple kinds of assumptions are processed together. For example, the assertion below might fail if the annotations were reordered.

@assume(x == 2)
@assert(x + 1 == 3)
...

If the spec were to dictate that order must never matter, one workaround might be to attach each of my annotations to a separate empty statement, thereby imposing an ordering. This raises the issue that AST transformations in the front- and mid-ends might alter annotation semantics (e.g., statement reordering), but that's a whole different can of worms.

mihaibudiu commented 5 years ago

I agree that we want maximum flexibility. Note that empty statements do not have annotations. Only BlockStatements do, and even this is very annoying for the optimizer, since it cannot move code across these blocks. This is the main problem with annotations: we don't know what they mean, so we cannot write code transformations that will be correct for all annotations.

jafingerhut commented 5 years ago

At least for the specific examples of @assume and @assert annotations, the fact that they might hinder the highest levels of optimization possible by a P4 compiler are the reason why in this issue https://github.com/p4lang/p4c/issues/1548 I suggest that such things, by default, be treated by the compiler as if they were not present in the program, which enables the most choices for compiler optimization possible. Only if some new option were provided to the compiler should such assertions be paid attention to at all, because correctly implementing them likely requires treating them as if they were "reads" or uses of any variables mentioned in their expressions, which prohibits some forms of reordering possible in some cases if those expressions were absent.

The general point that some kinds of assertions could benefit from preserving their source code order, at least for some compiler runs (i.e., perhaps only compilation runs with additional options enabled), seems worth at least noting.

liujed commented 5 years ago

Sure, by "empty statement", I meant an empty block statement: @anno { }. As I recall, block statements are currently the only kinds of statements that can have annotations attached. Is it worth changing the spec to allow annotations on all statements?

Designing a reference compiler that supports annotations with unknown semantics is indeed a challenge; I'm not sure how to best address this. One strategy might be to just do those transformations necessary in the front and mid ends to obtain the back end, and let back-end implementers do the heavy optimizations. To make their lives easier, we could supply a suite of standard optimization passes that they could invoke.

Another strategy, inspired by the Polyglot compiler, might be to make the front and mid ends more easily customizable. For instance, we could use the factory pattern to provide hooks for back-end implementers to easily override problematic front/mid-end passes with custom implementations that preserve annotation semantics. These hooks would be useful in general—they would allow someone to swap in a "better" constant-folding pass, for example.

These strategies aren't mutually exclusive, of course, but I think both might involve non-trivial changes to what we have now, and would certainly affect existing back-end implementations.

jafingerhut commented 5 years ago

@liujed Just an off-the-cuff suggestion here regarding assert and assume implementation, given the potential difficulties with implementing them using annotations, and preserving those annotations throughout the steps you hope they will. If they were new statements in the source language, similar the P4_16 verify statement but perhaps with simply one boolean expression argument, even though it would be extra work of one kind to implement them (i.e. adding primitive operations or whatever, similar to verify, in both p4c and perhaps also behavioral-model, for software simulator support), it could potentially be less work and trouble than keeping the annotations as you wish.

mihaibudiu commented 5 years ago

I agree. I think they should be just extern functions.

cc10512 commented 5 years ago

I think we can clearly define the semantics for these as statements. Why make them externs? (and therefore opaque to the compiler).

mihaibudiu commented 5 years ago

The compiler already understands the semantics of some functions, like the methods of packet_in. Everything that is in the "core" library can be hardwired within the compiler. Why introduce a new statement if these behave like procedure calls without side-effects? We would need to revisit all our passes to make sure they support the new statements; instead we can treat them like procedure calls and change almost nothing. It pays off to have the IR be as simple as possible.

mihaibudiu commented 5 years ago

So what is the conclusion of this discussion? My answer is "it depends on the annotations".

mihaibudiu commented 4 years ago

Do we close this issue or do we need to discuss this further in the LDWG?

liujed commented 4 years ago

It sounds like the consensus is that ordering of annotations is allowed to be significant. To me, this means that the compiler should preserve the ordering of annotations. I think it already does for unstructured annotations. With #796, I think the compiler should preserve the ordering of all annotations: structured, unstructured, and their interleaving.

mihaibudiu commented 4 years ago

I will then put this on the LDWG list for a short discussion. The problem is that the compiler can in principle modify, insert or delete annotations too; I don't know what is the exact promise it should keep, and it probably can only apply to user-defined annotations.

stefanheule commented 4 years ago

Has this every been resolved? Can we write down if the order is meaningful or not in the spec?

mihaibudiu commented 4 years ago

I don't think we have reached a conclusion. The compiler never reorders annotations at this point.

jnfoster commented 11 months ago

In the interest of tidying up the set of active issues on the P4 specification repository, I'm marking this as "stalled" and closing it. Of course, we can always re-open it in the future if there is interest in resurrecting it.