p4lang / p4c

P4_16 reference compiler
https://p4.org/
Apache License 2.0
665 stars 442 forks source link

Allow p4c-bm2-ss and bmv2 to apply same table multiple times per packet #457

Open jafingerhut opened 7 years ago

jafingerhut commented 7 years ago

It probably is, but I wanted to confirm.

In a few test programs, every attempt I have made to make a program apply a table twice, either apply'ing it twice in same control apply block, or invoking a control that apply's a table twice from other control blocks, causes p4c-bm2-ss to give an error about the table being involved in a cycle.

Note: p4test gives no such error, which is what I would expect.

I understand exactly why some high performance hardware architectures have restrictions against this, but it seems a little more strange that a pure software model like bmv2 would, unless it is to give P4 program developers that restriction 'early' before they move to an architecture where it is necessary.

antoninbas commented 7 years ago

bmv2 was mostly designed and written before P4_16 existed. Internally, control flows are represented as a table graph, mostly for legacy reasons. Edges between table nodes are labelled with actions. Let's look at some examples and assume I have a table t1, with 2 actions a1 and a2, and some other tables t2 and t3.

t1.apply()
t2.apply()

is represented by graph

t1 --- a1, a2 ---> t2

Similarly,

switch(t1.apply().action_run) {
  a1 : t2.apply();
  a2 : t3.apply();
}

is represented by graph

t1 --- a1 ---> t2
   --- a2 ---> t3

Now let's consider the following

if (x) {
  t2.apply();
  t1.apply();
} else {
  t2.apply();
}
t3.apply();

There is no way I can represent this control flow with my graph format, except if I duplicate the t2 vertex. If I duplicate t2 then from a control plane perspective I have to make sure that the duplicates are programmed symmetrically.

All of this to say that because of the bmv2 design, supporting multiple lookups in the "same" table, would require one of the following:

  1. change the control flow representation in bmv2
  2. duplicate the table in the bmv2 backend and make significant changes to existing control plane code. Both of these are non-trivial (if I had to choose one, I would go with 1. though).

But the point you make at the end of your question is also very important in my mind. We want P4 programmers to keep this restriction which I expect to be common for hardware targets in mind when experimenting with P4 on bmv2. Note that even on hardware targets we should be able to duplicate tables, at the cost of memory resources.

In many cases, you can rewrite your program to avoid the duplicate apply; but I am aware that it is not true for every program.

jafingerhut commented 7 years ago

Thanks for the detailed answer, Antonin. Very helpful. I am closing this issue, as at least I am satisfied with not only the reason why it is a restriction in software historically, but also getting people accustomed to at most one lookup in a particular table per packet is a good thing for high performance targets.

It might be nice if the compiler were sufficiently smart to detect accesses to the same table in mutually exclusive code branches and allow that, like for this example of yours with table t2:

if (x) {
  t2.apply();
  t1.apply();
} else {
  t2.apply();
}
t3.apply();

But it seems perfectly reasonable to me that allowing such a thing is target-dependent.

jafingerhut commented 7 years ago

I had a side communication with someone suggesting it might be nice to reopen this. I can definitely see some scenarios where even if some target architectures do not allow this, having the option for bmv2 to be more general than they are could be useful. I don't have any plans to submit a change soon to p4c-bm2-ss and behavioral-model to enable this generalization, but wanted the issue left as a placeholder if/when I do.

Also agreed, Antonin, that your approach #1 would be my preferred way of achieving this goal.

jafingerhut commented 7 years ago

I would like to add 'enhancement' label to this issue, but I don't think I have permission to do so.

mihaibudiu commented 7 years ago

p4c-bm2-ss allows multiple calls of the same table if it can fit the control-flow graph within the BMv2 constraints.

jafingerhut commented 7 years ago

@mbudiu-vmw: Understood that there are special cases where it is allowed. The purpose of this enhancement is the hope of generalizing BMv2 so the current control-flow graph restrictions are no longer a limiting factor.

mihaibudiu commented 7 years ago

The issue should be filed with BMv2.

cc10512 commented 7 years ago

BMv2 supports this. It is the compiler that needs to do the analysis and accept the program.

mihaibudiu commented 7 years ago

@antoninbas wrote:

"There is no way I can represent this control flow with my graph format, except if I duplicate the t2 vertex"

The program representation in BMv2 JSON uses table names for the control-flow graph nodes. There cannot be two nodes with the same name.

jafingerhut commented 7 years ago

It isn't a small change, I am sure, but @antoninbas also wrote that this enhancement would be possible if you "change the control flow representation in bmv2". There is no time-urgency for this suggested enhancement, certainly not before the May 2017 P4_16 release. I'm in this for the long haul :-)

jafingerhut commented 7 years ago

Sorry, I missed some of the comment thread here. It seems to me that enhancing this requires changes in both bmv2 and in the compiler. I can file an issue in the behavioral-model repo as well, but I'm sure the enhancement wouldn't be complete until both had been enhanced.

antoninbas commented 7 years ago

I can confirm that for supporting multiple executions of the same table in the general case, bmv2 needs to be updated, as well as the bmv2 JSON format (and therefore the compiler). I would indeed suggest waiting before implementing this, at least until the legacy bmv2 compiler (https://github.com/p4lang/p4c-bm) is deprecated; otherwise it would need to be updated as well to support the new JSON format.

fruffy commented 5 months ago

@antoninbas This is an old issue but is becoming interesting for using P4 as a modelling language. Regarding

change the control flow representation in bmv2

How much work is this? Do you still recall or have an estimate?

antoninbas commented 5 months ago

@fruffy the table definitions need to be extracted from the pipepeline definitions in the bmv2 JSON so that they are independent. The "control flow", which is defined by the next_tables attribute in the JSON can no longer be a part of a table definition.

Today a "pipeline" in bmv2 is essentially just a pointer to the first table in a control: https://github.com/p4lang/behavioral-model/blob/5f1c590c7bdb32ababb6d6fe18977cf13ae3b043/include/bm/bm_sim/pipeline.h#L59. After applying the table, we can jump to the next table based on which action was used (or alternatively based on whether the lookup was a hit / miss). To support this change, a table will no longer be responsible for jumping to the next table node. Instead, the pipeline will own the control flow. It should still be represented as a DAG IMO, but a "control flow node" / vertex is no longer a "table". It will consist of 1) a table pointer (reference to a MatchTable object), 2) a small map to determine the next node (indexed by action name or hit / miss). When a MatchTable is applied, it will need to return the action name (if applicable). The action name will be looked up in the map to determine the next table.

It is not a very difficult change, but it's a pretty large one, which requires some good bmv2 knowledge. I assume a bunch of unit tests will need to be updated because of this change. I would highly recommend preserving JSON backward compatibility, so that old JSON files are still supported. Luckily, the new C++ implementation is just more flexible that the old one, it's a superset of the old one. So it's just about supporting 2 different JSON specs at the same time.

fruffy commented 5 months ago

Thanks, this is good to know.

smolkaj commented 5 months ago

This is now also tracked on the BMv2 repo here: https://github.com/p4lang/behavioral-model/issues/1237