p4lang / p4c

P4_16 reference compiler
https://p4.org/
Apache License 2.0
683 stars 446 forks source link

Unique fully qualified names for shared controls/packages #2013

Open amreshk opened 5 years ago

amreshk commented 5 years ago

Consider the below example p4 which shares control across ingress and egress,

#include <core.p4>

parser Parser<P>(
    packet_in buffer,
    out P hdr);

control Control<C>(
    inout C hdr);

control Deparser<D>(
    packet_out buffer,
    out D hdr);

package EgrPipe<P, C, D>(
    Parser<P> ep,
    Control<C> ec,
    Deparser<D> ed);

package IngPipe<P, C, D>(
    Parser<P> ip,
    Control<C> ic,
    Deparser<D> id);

package Top<P, C, D>(
    IngPipe<P, C, D> ingress,
    EgrPipe<P, C, D> egress);

typedef bit<48>  EthernetAddress;
header ethernet_t {
    EthernetAddress dstAddr;
    EthernetAddress srcAddr;
    bit<16>         etherType;
}

parser myP(
    packet_in buffer,
    out ethernet_t eth) {
    state start {
        buffer.extract(eth);
        transition accept;
    }
}

control myD(
    packet_out buffer,
    out ethernet_t eth) {
    apply {}
}

control myC(
    inout ethernet_t eth) {

    action a1() { }
    action a2() { }
    table tbl {
        key = {
            eth.srcAddr : exact;
        }
        actions = { NoAction; a1; a2; }
    }

    apply {
        tbl.apply();
    }
}

IngPipe(myP(), myC(), myD()) ip;
EgrPipe(myP(), myC(), myD()) ep;

Top(ip, ep) main;

Control myC is shared across IngPipe and EgrPipe. The frontend generates a single control node which is attached to both gress causing the same external name in each case on the table tbl -> myC.tbl

Above example can be extrapolated for a multi package scenario sharing the same control across 2 packages.

The issue in both these cases is the names are not unique across the multiple usages for these controls. This makes it impossible to distinguish between these usages for e.g. when control plane needs to address a particular table.

Should the frontend IR duplicate nodes in these cases?

That way it can assign a unique name based on their invocation to create a unique hierarchical name (e.g. ip.myC.tbl & ep.myC.tbl). For instances with no declarations it should pick the defined architecture name (e.g. IngPipe.myC.tbl & EgrPipe.myC.tbl). Similarly the package names would be prefixes for multipackage cases always ensuring a unique name for the resources.

jafingerhut commented 5 years ago

It seems that in a program like this it would be nice to have something like a @name annotation on the two myC() instantiations, each with different names, yes? That is possible when instantiating a control twice within another control already, but I am not aware whether that works within a package instantiation like in your example.

If I could toss out a potential syntax for how that might look, perhaps something like the following?

IngPipe(myP(), @name("myCIngress") myC(), myD()) ip;
EgrPipe(myP(), @name("myCEgress") myC(), myD()) ep;
jafingerhut commented 5 years ago

Addendum: If that idea of a @name annotation within a package instantiation is reasonable, I would propose that the original program should give an error during compilation, at least during the control plane generation portion of compiling, because of the control plane name conflict.

hesingh commented 5 years ago

I like Andy's idea. Note, @name is also used when a table key expression is not simple and the @name annotation with key expression is mandatory.

amreshk commented 5 years ago

@jafingerhut , having an @name annotation does make the names unique. We will have to add the necessary support for the syntax you are suggesting.

If we are going with this approach, this will apply to all such naming collisions including multi package scenarios, e.g. a Package instantiating 2 or more packages,

IngPipe(myP(), @name("myCIngress") myC(), myD()) ip;
EgrPipe(myP(), @name("myCEgress") myC(), myD()) ep;
pkg(ip, ep) p;
Top(@name("myPkg0") p(), @name("myPkg1") p) main;

From my understanding, we have a hierarchicalNames pass https://github.com/p4lang/p4c/blob/master/frontends/p4/hierarchicalNames.h

which generates the unique name annotations based on hierarchy within the program but does not visit package nodes(?). That would need a fix so we can generate unique names across packages, e.g. myPkg0.myCingress.tbl and myPkg1.myCingress.tbl

jnfoster commented 5 years ago

I think there are two issues here:

  1. How these controls should be named by the control plane
  2. Whether the compiler should duplicate the instances in the internal IR (e.g., so that distinct tables, mutable state, etc. are created)

For 1, I don't see why we need a @name annotation. The spec is clear on how control-plane names are calculated (https://p4.org/p4-spec/docs/P4-16-v1.1.0-spec.html#sec-cp-names). In particular,

...if the instance is created as an actual argument, then its local name is the name of the formal parameter to which it will be bound.

So in this program, I believe the first nameless instantiation of myC is named main.ingress.ic and the second is named main.egress.ec and there is no confusion or need for a @name annotation. (Disclaimer: I haven't run this through the compiler to verify, this is only based on reading the spec.)

For 2, I think the compiler should probably create two distinct IR nodes. Although the instances were created from the same control declaration, they really have nothing to do with each other. And so sharing them would not be correct in general.

hesingh commented 5 years ago

@amreshk You are using nested packages which p4c does not support.

jnfoster commented 5 years ago

@hesingh this statement is not correct. For example, PSA uses nested packages and is supported fine by the p4c front-end. In this case, it would taken only a few seconds to run the complete program that @amreshk provided (thanks!!) through p4test and seen that it is accepted just fine.

Please be sure you are right before making definitive statements about the language or the open-source compiler. It can cause confusion for the community to have incorrect statements on GitHub issues and P4 forums.

jafingerhut commented 5 years ago

Replying to this comment of @jnfoster https://github.com/p4lang/p4c/issues/2013#issuecomment-516620782

Hmmm. Yes, I suppose it may be true what you say about what the hierarchical names should be in this code example according to the spec.

I am pretty sure it is not what p4c is doing now for the names. I have never seen 'main' as the first component of a name, for example, in p4c output.

antoninbas commented 5 years ago

@jnfoster shouldn't the names be main.ip.ic and main.ep.ec in this case (and not main.ingress.ic and main.egress.ec) since we do have instance names for the ingress and egress pipes (no need to fallback on the formal parameter names)?

jnfoster commented 5 years ago

@antoninbas correct. My mistake!

jafingerhut commented 5 years ago

If p4c were to be moved closer to the spec in this regard, is there any reason to actually start every hierarchical name with "main."? They would all begin that way, unless I am missing something there.

cc10512 commented 5 years ago

As far as we can tell, the compiler implements a pass, HierachicalNames (https://github.com/p4lang/p4c/blob/5d83957346aef6d0f1abb3f65691744a4738cc3f/frontends/p4/hierarchicalNames.h#L57) that builds the "fully qualified" name of an object taking into account all the nested controls. Unfortunately, that is not enough, as it does not also consider the hierarchy of packages. And while controls do get inlined into other controls and thus guarantee that the instances are replicated, we do not believe that the replication is done for the package hierarchy. I have not seen a test in the p4c test suite that actually has such a package hierarchy, maybe for the obvious reason that none of the architectures defined in p4c needed such a feature. But certainly the language allows it and the compiler "supports" it.

I agree with @jnfoster that there should be no need for additional name annotations: there is a clear definition of what the names should be.

hesingh commented 5 years ago

@jnfoster Sorry, wrong use of terminology ("nested") on my part. It's instantiation. To confirm language parsing, I always refer to the language parser in p4c/frontends/parser/p4/p4parser.ypp.

Obviously, EgrPipe, IngrPipe, and Top are packages.

The instantiations are: ip IngPipe ep EgrPipe main Top

This is why the names will be main.ip.x and main.ep.y.

Also, sometimes p4test may just run but not be functionally correct. Additionally, if one has such a P4 program one should also use p4test with p4runtime generation to catch any obvious problem. See my test below which fails p4runtime generation.

$./p4test --std p4-16 foo.p4 --p4runtime-files tmp.txt

[--Werror=duplicate] error: Name 'myC.tbl' is used for multiple table objects in the P4Info message
[--Werror=duplicate] error: Found 1 duplicate name(s) in the P4Info
$
jafingerhut commented 5 years ago

@cc10512 Just a minor qualification of something you said, in case it is not widely known (or perhaps I've got this wrong, and could use correcting myself here).

It is possible to write P4_16 code that instantiates a control instance c_inner inside of a control c_outer, and then pass c_inner to multiple other control instantiations within c_outer as a constructor parameter (i.e. at compile time), and in that case c_inner should not be replicated, but the same instance "used" multiple times.

That doesn't mean that any given implementation must support arbitrary programs like that, especially if c_inner contains a table and it is apply'd from every one of multiple other controls it is passed as a constructor parameter to.

If c_inner doesn't have any tables, and it contains only externs such that the implementation supports "using" those externs more than once per single time of calling a top level control, an implementation could support that situation without any troubles I can see.

It does raise the question of: if such externs inside of c_inner have a control plane interface and thus control plane name, should they have multiple names by which they can be called in the control plane API, one for each "path" from the top level package to where it is used? Or should there be a simple rule to determine one unique name, e.g. perhaps the shortest one, by which c_inner was first instantiated?

hesingh commented 5 years ago

@amreshk You are using nested packages which p4c does not support.

Please see the reply from @cc10512. Clearly, the code URL he sent out does not have a preorder(IR::PackageBlock* b) override. Even p4runtime is not working for your P4 program. We should collect list of all issues to take stock. Thanks for submitting the P4 program.

cc10512 commented 5 years ago

@hesingh that's the whole point of opening this issue -- the compiler does not replicate the controls properly and uses the same instance.

@jafingerhut, after re-reading the spec (thank you for the table in Appendix F), I think that indeed we need to dig deeper into the instantiation rules. And obviously, the answer includes the obligatory "it depends" -- in this case on the architecture and what objects are allowed to be called multiple times, whether the paths through the sequence of applies are mutually exclusive, etc.

I'm afraid I'm a bit tainted here, and I would say that with the current implementations, including bmv2, which allows tables be applied only once, the scenario that you're presenting might be legal or not. If an object in the control instance can not be applied more than once, and the compiler can not guarantee mutual exclusive paths, the compiler should reject the program. If the objects can be applied multiple times, I would agree with you that the control plane name should be the same. However, I'm not sure that this holds in the compiler right now. What I am sure about though, is that you will certainly try it ASAP :)

jafingerhut commented 5 years ago

@cc10512 As maybe a more concrete example of something you might think worth supporting, consider a control c_outer that instantiated an action selector, calling the instantiation by the name as1.

Then c_outer instantiated 3 sub-controls with local instance names c_inner1, c_inner2, and c_inner3, passing as1 as a constructor parameter to each of them. Each of them has its own unique table, but all of those tables have implementation = as1;.

I believe if we 'flattened' that source code by hand so that all 3 tables were directly defined within c_outer, you would want to allow that.

It might not be worth the effort to support that same scenario with the 3 inner controls, but in some senses it is an equivalent program.

hesingh commented 5 years ago

As far as we can tell, the compiler implements a pass, HierachicalNames (

https://github.com/p4lang/p4c/blob/5d83957346aef6d0f1abb3f65691744a4738cc3f/frontends/p4/hierarchicalNames.h#L57

) that builds the "fully qualified" name of an object taking into account all the nested controls. Unfortunately, that is not enough, as it does not also consider the hierarchy of packages. And while controls do get inlined into other controls and thus guarantee that the instances are replicated, we do not believe that the replication is done for the package hierarchy. I have not seen a test in the p4c test suite that actually has such a package hierarchy, maybe for the obvious reason that none of the architectures defined in p4c needed such a feature. But certainly the language allows it and the compiler "supports" it. I agree with @jnfoster that there should be no need for additional name annotations: there is a clear definition of what the names should be.

As far as I can tell, the HierarchicalNames pass is only tweaking the @name annotation and replication of control to a new IR object is not done in the pass. Adding package support to this pass will fix the @name annotation tweaking, but not create a new IR control object.

hesingh commented 5 years ago

I changed p4c for its HierarchicalNames frontend pass to support package. Then, I changed P4 code to add the following @name annotations to package instantiations:

@name("Ing") IngPipe(myP(), myC(), myD()) ip;
@name("Egr") EgrPipe(myP(), myC(), myD()) ep;

@name("Top") Top(ip, ep) main;

Then the IR-generated and converted to a P4 program shows the transformation by the pass. As one can see ip.Ing and ep.Egr are generated. To tack ".main" before each will need more tweaks to p4c.

@name("ip.Ing") IngPipe<ethernet_t, ethernet_t, ethernet_t>(myP(), myC(), myD()) ip;
@name("ep.Egr") EgrPipe<ethernet_t, ethernet_t, ethernet_t>(myP(), myC(), myD()) ep;
@name("main.Top") Top<ethernet_t, ethernet_t, ethernet_t>(ip, ep) main;

Now, the operative question is where does one generate the new control object in IR. Right now, the HierarchicalNames pass runs towards the end of frontend passes. So, one place I see generation is in midend.

hesingh commented 5 years ago

The duplication of control needs to happen before the HierarchicalNames frontend pass is invoked. The reason is because HierarchicalNames pass is run towards the end of frontend processing. The p4c midend should not deal with packages and names.

I changed p4c to create a new control in IR with a name suffixed with "_2". The control's annotation is also edited to help the HierarchicalNames pass. I used toP4 to dump the IR as P4 early in frontend processing. See the attached output.txt file. I also generated p4runtime's p4info successfully. The p4info.txt file is also attached.

output.txt p4info.txt