Open jafingerhut opened 2 years ago
I'm not wedded to defining a new direction for action parameters, but it seemed to me like a semi-natural way to express this. This explanation will explain the syntax in that way, but if you have another syntax that achieves the same goal that you believe has advantages, please propose it.
As a very short example, suppose there was a new direction other than the existing in
, out
, inout
directions defined in the language spec today, and it was called rmw
, which I chose for the sake of example because it is the initials of "read, modify, write". The intent is that this direction would only be defined and supported as the direction of an action parameter. I cannot imagine any use yet for using this new direction for other things like functions, extern functions, extern methods, sub-parsers, or sub-controls, because the new behavior is intended specifically for actions of tables.
The new rmw
direction would also not be defined for actions that are not associated with tables, as again, I cannot imagine a useful way to define the new feature in that situation.
When an action a1
is an action of a table t1
, and one or more of a1
's parameters has direction rmw
, it means:
rmw
it is legal within the action body to assign a value to that parameter.When a1
is executed because an entry of the table was matched, and the entry had a1
configured as its action, the body of the action would be executed. When the body of the action was finished being executed, the final values of all rmw
parameters would be "written back" to the entry, modifying their values. Any later packet matching that same entry would see the modified values as written by the last packet that caused the action to be executed for that same table entry.
This would require some definition of what happens in concurrent execution scenarios, but the intent is that effectively all or as much of the action body as required would effectively "lock the table entry E1" during its execution, and any changes made to rmw
parameters by one execution of the action on entry E1 would be visible to any later execution of the action on that same entry E1. That or something similar could I think be used to define the correct behavior, but leave open the precise implementation mechanism as any mechanism that achieves the same behavior.
Example of syntax with new direction rmw
:
action a1 (bit<8> foo, bit<16> bar, rmw bit<32> max_seq_seen) {
hdr.ethernet.etherType = bar;
usermeta.x = foo;
if (hdr.tcp.seqNo > max_seq_seen) {
max_seq_seen = hdr.tcp.seqNo;
}
}
Example of a different syntax mentioned by Nate Foster and/or Mihai Budiu, with NO new direction rmw
, but instead define a new modifier for local variables inside of action bodies state
, which is similar to C/C++'s static
, except that the state thus defined is created independently for each table entry that has been added with the enclosing action name:
action a1 (bit<8> foo, bit<16> bar, bit<32> initial_max_seq_seen) {
// logically, the following initialization assignment only occurs on the first time that this
// action is invoked, per table entry that uses this action.
state bit<32> max_seq_seen = initial_max_seq_seen;
hdr.ethernet.etherType = bar;
usermeta.x = foo;
if (hdr.tcp.seqNo > max_seq_seen) {
max_seq_seen = hdr.tcp.seqNo;
}
}
@mariobaldi Created this issue for the P4 language work group, to see if I can get the ball rolling. For the background of others, this feature request originates in the architecture work group, and seems to me to be a natural way to express data plane modification of per-table-entry state.
Note: TNA defines a way to do this using DirectRegister and RegisterAction externs, plus abstract methods.
I will check with Tofino compiler developers whether this proposal is "just another syntax" for this behavior, easy to support, or if there are any differences in expressive power that I am not yet aware of. The syntax of this proposal definitely seems noticeably easier to read and understand, to me personally.
I believe that one noticeable difference between this proposal and using a DirectRegister
extern is that with this proposal, if you want three fields A, B, C to be rmw
for action a1
of a table, but you want 1 field D to be rmw
for action a2
of the same table, it is quite natural to express that using the proposed syntax. My understanding from talking with Vladimir Gurevich is that at least today's TNA definition of DirectRegister
only lets you express that all of the actions for the same table must have the same set of rmw
fields as each other.
Directions were originally intended to indicate calling conventions -- a defensible language design choice. (Ab)using directionless parameters to indicate the control-plane parameters was always controversial. Experience has shown that it is confusing to new P4 developers -- not ridiculously so, but it's not natural. In my opinion, it was a mistake. (At the time P4_16 was being developed, we considered having two sets of braces, which would have been arguably more natural, but some on the LDWG had parenthephobia.)
Anyway, I really like the idea of having syntax for initializing per-table-entry state for direct mapped objects. But I wonder if further abusing direction parameters is the best way to go, or if we are continuing down a bad syntactic path.
I am certainly open to other syntactic ideas, certainly. I know there is a preference/guidance against using annotations for things that significantly change the meaning of the program if they are removed, so this seems like a poor use case for them. If you or others have thoughts on syntax besides a new direction besides in/out/inout, please do describe them.
@jnfoster I recall you mentioning something like this about directionless parameters for actions being a choice you wish had been made differently. Perhaps I misunderstood, but I thought you meant something like "I wish instead of action parameters having no direction keyword meant 'comes from the control plane', instead we had an explicit direction keyword that meant 'comes from the control plane'". But perhaps you had something else in mind?
Here are my thoughts and even how I would have done the original control plane based arguments.
Having directionless meaning two different things is definitely confusing.
I would have used roentry
and rwentry
as the two direction keywords for actions (I am not 100% set on these two names).
I can think of how we can still introduce roentry
and still be backwards compatible (and even if they use that as a type name) and have an option to warn for directionless arguments for actions (maybe even by default).
The grammar becomes direction typename argname ROENTRY argname
Note I am ok with not adding a specific direction and leaving directionless because it is almost water on the bridge now.
What would be a difference between such an action and a control or a function?
@vgurevich asks: "What would be a difference between such an action and a control or a function?" It would be in an action, and thus could be an action of a table. Controls and functions cannot be actions of a table. Controls cannot be invoked from within actions in the current language definition.
@apinski-cavium I agree that if we want to preserve backwards compatibility with very very commonly written P4 code, we should not make a change where we mandate that directionless parameters to actions must have a new direction keyword in front of them. I do think something like your roentry
could be introduced in the language spec, and if someone uses it in their program, its meaning would be identical to what a directionless action parameter is today (or perhaps a subset of that meaning, e.g. it MUST come from the control plane, never the data plane, could be considered for such a new direction keyword).
The reason directionless has two meanings is historical, to support p4-14, where there were no functions, and an action could be used both as a function and as an action. The same action could be called from two different contexts: other actions or tables.
I would actually use a declaration and not a parameter:
action a(parameters) {
state bit<32> value = initializer;
}
I have generalized the name of this issue to be closer to the problem statement, and the title no longer mentions the specific implementation approach of adding a new kind of action parameter direction.
I have created a more explicit example in the style of this comment: https://github.com/p4lang/p4-spec/issues/1177#issuecomment-1306301017 and added it this comment above: https://github.com/p4lang/p4-spec/issues/1177#issuecomment-1298144022
@mariobaldi Please take a look at the two syntax examples at the end of this comment, particularly the last one, to see what you think: https://github.com/p4lang/p4-spec/issues/1177#issuecomment-1298144022
Upon thinking about it a bit more, perhaps it's better to make it look like a new kind of parameter that looks behaves like an inout parameter, but which is provided by the dataplane and not by the program.
action a(bit<32> field, stateful bit<64> token = initial_value) {
....
}
The semantics is special in that the initial_value is a default_value, but the value that the state starts with when the action is installed. The initial_value is optional, if missing the control-plane has to specify it.
Maybe I do not understand well what @jnfoster and @mbudiu-vmw have proposed, but I have the impression that the parallel with inout has brought us away from the initial intent. token in Mihai's example and max_seq_seen in Nate's example should be initialized with a value that is in the table entry and then once the action execution returns the current value should be written back into the table entry. It is not clear to me how these two proposed syntaxes are not allowing for this.
In the case of max_seq_seen how does the compiler know in which position in the table entry the value should be stored? Does it infer it from the fact that max_seq_seen was initialized with initial_seq_seen, which was the third action parameter? If so, I find it not very intuitive, which will lead to code not easily legible.
In the case of token (Mihai's proposal), when initial_value is not present, what do we mean that the control plane must specify the value? If we mean that the control plane will have written the value in the second action parameter in the table entry associated to the action execution, then isn't this the same as the initial proposal with rmw (now called "state")?
@mbudiu-vmw In this comment https://github.com/p4lang/p4-spec/issues/1177#issuecomment-1310741880 you say: "but which is provided by the dataplane and not by the program". I don't think there is a desire for the initial value to be supplied by the data plane here.
And if we remove that part of your proposal, it appears that the only difference that remains is whether the keyword is my initial placeholder rmw
or your stateful
.
Maybe I am missing something, though.
The initial value is actually supplied through add_entry extern when an entry is created by the data plane, and through the API (TDI/P4Runtime) when an entry is created by the control plane.
This topic has been discussed again recently in the 2023-Jan-30 Architecture work group meeting. If anyone has ideas for syntax that do not involve using a new direction name (perhaps abusing a new direction name, agreed), those ideas are very welcome.
One thought: What if 'rmw' was not a direction like in/out/inout, but instead a separate category of qualifier that can be used in defining the parameters of an action, e.g. like const
can be used in C++ separate from the type of the parameter. This is perhaps a distinction that makes no difference syntactically, though, since it would "look like" a direction, and I cannot think of a reasonable definition for a parameter declared like a1(rmw in bit<8> a)
or a1(rmw inout bit<8> a)
, i.e. the only combination of direction and rmw
that I know how to define reasonably is rmw
with no direction, which looks like a1(rmw bit<8> a)
.
Perhaps that is what Mihai was intending with his stateful
keyword proposal here: https://github.com/p4lang/p4-spec/issues/1177#issuecomment-1310741880 Again, my comment on that proposal is that it seems like it is the same as the original proposal but using the word stateful
in place of rmw
.
Here are some notes I took on this topic during the 2023-Feb-27 P4.org architecture work group meeting, discussing PNA, where this issue originated from.
We talked about the idea of having a per-table-entry lock on action data that was writable, that would be held for the duration of the execution of the action.
Andy: More precisely, I would say that the P4 language spec should not mandate that an implementation must have a per-table-entry exclusive lock held for the entire duration of the action's execution. Instead, perhaps it could be stated as "an implementation should behave as if this were the implementation", leaving open to the target how it actually achieves this behavior.
Mario: What about an annotation that disables the per-entry-lock for that table?
Example action, written assuming that the keyword rmw
indicates an action parameter that is writable from data plane, and written values are remembered in the table entry, readable by future packets matching the entry:
action a1 (rmw bit<8> p1, rmw bit<8> p2) {
/// 17 statements here that don’t write to action parameters
p1 = hdr.ipv4.src_addr; // line 1
p2 = hdr.ipv4.dst_addr; // line 2
}
With such an annotation that disables the per-entry-lock that lasts for the duration of the entire action body, the following order of events would be possible:
Perhaps add extern functions like these?
action a1 (rmw bit<8> p1, rmw bit<8> p2) {
// 17 statements here that don’t write to action parameters
acquire_entry_lock();
p1 = hdr.ipv4.src_addr; // line 1
p2 = hdr.ipv4.dst_addr;
release_entry_lock();
// 20 more statements that don’t write to action parameters
}
Note: It doesn't seem like a P4 compiler would need to be very sophisticated to notice that the first 17 statements of an ation like the example above do not write to action parameters, and thus need not have lock-like protection around them, i.e. those statements can be done outside the critical section.
Note: It doesn't seem like a P4 compiler would need to be very sophisticated to notice that the first 17 statements of an action like the example above do not write to action parameters, and thus need not have lock-like protection around them, i.e. those statements can be done outside the critical section.
Agreed, furthermore it's important to note that reads should be considered too (only writes are mentioned).
Final state is NOT any state that can result from a per-entry-lock for the duration of the action body.
I can't see the user's motivation for this - performance optimisation?
@thomascalvert-xlnx The only motivation that makes sense to me is giving the developer knobs that can help increase performance, without sacrificing correct behavior.
As a perhaps weird example, that I don't know if anyone would want to implement or use, one could imagine an action that looked like this:
action a7 (rmw bit<8> p1, rmw bit<8> p2) {
// statements here that don’t access rmw action parameters in any way
acquire_entry_lock();
// code block 1 that accesses rmw action parameters
release_entry_lock();
// more statements that don't access rmw action parameters
acquire_entry_lock();
// code block 2 that accesses rmw action parameters
release_entry_lock();
// more statements that don't access rmw action parameters
}
Of course such an action definition would only be correct if the P4 developer had done their own independent reasoning about the possible behaviors that could result, and judged that it was suitable for their use case.
If action definition like a7 above are of any interest, and motivate the suggestion to provide explicit acquire/release lock primitives, then it seems to me unlikely that a P4 compiler would ever be "smart enough" to figure out that a7 should be the result of an action written without those acquire/release primitives written as shown.
Notes from a P4.org meeting of 2023-Mar-17 on this topic alone available here: https://docs.google.com/presentation/d/15vZJETWr0Cyht-dcQFnat28A3c9FoY7Z/edit?usp=share_link&ouid=113165450274035133953&rtpof=true&sd=true
p4c
.Action: we will wait until there is an implementation of this feature, and some experience writing programs with it. But the LDWG endorses the direction and looks forward to seeing an implementation and full proposal.
One other thought... We can also add a backwards compatible ro
direction, which will be equivalent to "no direction". This way people who plan to use rmw
will be able to freely mix read-modify-write and read-only action data. Without it, read-only fields will always have to precede read-modify-write fields.
@vgurevich I am not sure why you say "Without it, read-only fields will always have to precede read-modify-write fields."
If you look at the proposed PR with changes to the language spec for this new feature here: https://github.com/p4lang/p4-spec/pull/1239
you can see that it says:
"All parameters that are either directionless or declared rmw
must appear after any parameters with a direction.
Directionless parameters and those declared rmw
can appear in any order relative to each other."
Do you see any issues with that proposed text?
Oh, I see! In that case we are good!
Does ir make sense to use the existing inout
direction rather than introducing a new rmw
keyword? This sort-of fits with the idea that "directionless" parameters are implicitly in
in some sense -- just losing the control-plane vs data plane nature of directionless parameters. Since actions (entries) come from the control plane (excepting static const entries), all action params implicitly come initially from the control plane.
My main concern with using inout
for this new purpose is that I believe it would be a backwards-incompatible change. Introducing slightly new syntax seems to make it straightforward to preserve the meaning of existing programs, and define new meanings for programs that were not legal before.
@ChrisDodd -- the main problem with reusing inout
is that it will become impossible to figure out whether to include those fields into the control API generation or not. For now the rule is: "no direction -- add the field to the control plane API, otherwise do not". WIth the introduction of rmw
it will become more convoluted, i.e. "no direction or rmw
-- add the field to the control plane API, otherwise not", but at least still manageable.
Personnel
Design
Implementation
p4-spec
: A draft of a PR to the language spec that is currently only intended to help define this feature for a proposed experimental implementation in p4c: https://github.com/p4lang/p4-spec/pull/1239p4c
:Process
p4-spec
:p4c
:=======================================