Open jafingerhut opened 4 months ago
I think compile-time-known-value-ness should be a property of P4's type system. Although it's true that we could, in principle, determine the value of many expressions at compile time, using various complex static analyses, it would be unusual to see this done in a language design. So I fear it would be confusing for programmers to go down this route.
In particular, if compile-time-known-value-ness is computed from type information alone, then it will be robust to various type-preserving refactorings the programmer might perform on their code. For example, in this program,
bit<8> i = 0;
i = i + 1;
the value of i
is not compile-time known because local variables are not considered compile-time known.
However, the same might not be true under proposals hinted at here, using complex static analyses. For example, using a static analysis we can certainly determine the value of i
at the end of the code block.
But now suppose we add a statement between the declaration of i
and the increment:
bit<8> i = 0;
i = i | hdrs.ethernet.srcMac[0:7];
i = i + 1;
Now the value certainly cannot be determined at compile-time, as it depends on packet data. Or, as another example, suppose we invoke an extern with an out
or inout
parameter, so that it overwrites the value of i
.
bit<8> i = 0;
some_hash(i, ...);
i = i + 1;
Even if the backend understood the semantics of the extern, the front-end would not. So again the value of i
could not be determined at compile time, which is what we need to do to type check the program.
The distinction between a static analysis and a type system might be subtle. In general, type systems are compositional whereas a static analysis need not be. In this case, the only reasonable static analyses would need to be non-compositional to determine the values of expressions at compile time. This illustrates another argument for not going down this path is that the static analysis itself would take time to implement and -- assuming it were not perfectly precise -- would be confusing for programmers to understand and reason about.
To put it another way: compile-time known value != value determined at compile-time.
I think compile-time-known-value-ness should be a property of P4's type system. Although it's true that we could, in principle, determine the value of many expressions at compile time, using various complex static analyses, it would be unusual to see this done in a language design. So I fear it would be confusing for programmers to go down this route.
I agree. Also, with sufficiently advanced static analysis it would quickly become quite hard to define in the specification where it should be able to derive that something is compile-time known. Especially since any such static analysis would almost surely have to be heuristic, just imagine a program like this:
bit<1> i = (bit<1>)((hdr.x.a || !hdr.x.b || hdr.x.c) && (hdr.y.a || !hdr.x.a || ...) ...)
x = hdr.v.v1[i:i];
Here, i is compile time know if and only if the expression that defines it is a tautology (i = 1
) or is not satisfiable (i = 0
), otherwise it depends on the packet. Sure, a sufficiently complex static analysis (capable of sat-solving) would be able to detect that, but one would not want to implement and to wait for SAT solver when compiling :-).
Perhaps it would be good to include a statement along the lines of
Expressions that are not compile-time-known values that are determined to be constant after some optimization passes may be accepted as compile-time-known values without a diagnostic or error.
During the 2024-Jul-01 language design work group meeting, Jonathan DiLorenzo was asking some questions that I believe might be effectively the same as those described below (@jonathan-dilorenzo feel free to reply if I have misrepresented your questions).
There are P4 programs where it is possible to prove that at a particular point in the execution flow, the value of a run-time variable must be equal to a particular compile-time known value. For example:
I am not attempting here to enumerate all of the conditions for a run-time variable to be provably equal to a single compile-time known value at an arbitrary execution point within a P4 program, only to give some simple examples where it looks easy to do so, and some where it seems impossible to do so.
The discussion during the 2024-Jul-01 LDWG meeting seemed to be related to the questions raised in this issue. Even though it was during a discussion on loops in P4, I believe the discussion was mostly around what a sufficiently advanced compiler might be able to infer about the P4 program after loops had been unrolled.