p4lang / p4c

P4_16 reference compiler
https://p4.org/
Apache License 2.0
679 stars 443 forks source link

Expanding structs into lists for BMv2 seems to have broken clone3 #2875

Closed smolkaj closed 2 years ago

smolkaj commented 3 years ago

I believe https://github.com/p4lang/p4c/pull/2781 broke the clone3 primitive.

E.g., consider the following code:

  clone3(CloneType.I2E, pre_session, {
    local_metadata.mirroring_src_ip,
    local_metadata.mirroring_dst_ip,
    local_metadata.mirroring_src_mac,
    local_metadata.mirroring_dst_mac,
    local_metadata.mirroring_ttl,
    local_metadata.mirroring_tos
  });

BMv2 config before:

"field_lists" : [
    ...
    {
      "id" : 3,
      "name" : "fl_1",
      "source_info" : {
        "filename" : "<redacted>",
        "line" : 86,
        "column" : 45,
        "source_fragment" : "{ ..."
      },
      "elements" : [
        {
          "type" : "field",
          "value" : ["scalars", "userMetadata._mirroring_src_ip9"]
        },
        {
          "type" : "field",
          "value" : ["scalars", "userMetadata._mirroring_dst_ip10"]
        },
        {
          "type" : "field",
          "value" : ["scalars", "userMetadata._mirroring_src_mac11"]
        },
        {
          "type" : "field",
          "value" : ["scalars", "userMetadata._mirroring_dst_mac12"]
        },
        {
          "type" : "field",
          "value" : ["scalars", "userMetadata._mirroring_ttl13"]
        },
        {
          "type" : "field",
          "value" : ["scalars", "userMetadata._mirroring_tos14"]
        }
      ]
    }
  ],

After:

"field_lists" : [
    ...
    {
      "id" : 3,
      "name" : "fl_1",
      "source_info" : {
        "filename" : "<redacted>",
        "line" : 86,
        "column" : 45,
        "source_fragment" : "{ ..."
      },
      "elements" : [
        {
          "type" : "field",
          "value" : ["scalars", "tmp_10"]
        },
        {
          "type" : "field",
          "value" : ["scalars", "tmp_11"]
        },
        {
          "type" : "field",
          "value" : ["scalars", "tmp_12"]
        },
        {
          "type" : "field",
          "value" : ["scalars", "tmp_13"]
        },
        {
          "type" : "field",
          "value" : ["scalars", "tmp_14"]
        },
        {
          "type" : "field",
          "value" : ["scalars", "tmp_15"]
        }
      ]
    }
  ],

I will try to make this more reproducible, but I was hoping that an expert like @mbudiu-vmw or @antoninbas might be able to see if there is an issue or not even with what I have shared so far.

smolkaj commented 3 years ago

Please find a small program reproducing the issue attached. issue_2875.p4.txt

To elaborate on my initial report, I believe that the issue is that p4c introduces new temporaries, causing BMv2 to preserve only those temporaries (but not the original fields) when cloning with clone3. This will brake code that accesses the original fields in the clone, since they will be 0.

mihaibudiu commented 3 years ago

This is actually a fundamental bug in v1model: https://github.com/p4lang/p4c/issues/1514 The semantics of P4-16 in fact does not allow such operations to be implemented at all. The fact that they worked was an accident of the implementation. The compiler should give warnings if you attempt to use them. We have attempted to fix them in various ways, by changing the architecture definition (e.g., https://github.com/p4lang/p4c/pull/1698), but in the end the agreement is that the solution is to migrate to PSA.

mihaibudiu commented 3 years ago

But, of course, PSA is not ready for consumption yet, even two years later.

smolkaj commented 3 years ago

Thanks for the explanation and pointers, Mihai.

Do you have any suggestions what we might do in the meantime? Is there some kind of (imperfect) workaround, either in the program or one we could easily implement in the compiler?

smolkaj commented 3 years ago

The fact that they worked was an accident of the implementation.

How hard would it be keep this accident around for the moment? Would that be feasible?

mihaibudiu commented 3 years ago

Then you just need to revert the patch you pointed to. I don't know how many patches after that depend on those changes.

The fact that they worked was an accident of the implementation.

How hard would it be keep this accident around for the moment? Would that be feasible?

mihaibudiu commented 3 years ago

This is actually very unpleasant. This is really a fundamental bug in the design of v1model which we didn't catch early enough. I re-read the solution proposed in https://github.com/p4lang/p4c/pull/1698, and I think it is actually a decent solution. But it certainly it is not backwards compatible. If you need the functionality of clone3/resubmit/recirculate and you are willing to rewrite some of your programs we could do the following:

In fact, I do not see many downsides to doing this. The design committee decided not to fix this bug on the assumption that PSA would take over v1model. But perhaps we should no longer wait for PSA and do this instead.

mihaibudiu commented 3 years ago

To summarize the solution proposed in #1698, you would annotate metadata fields that you may want to preserve with annotations like @field_list(1). Then in the recirculate/resubmit/clone calls the third argument is not the list of fields but just the number you used in the annotation (1). That's all you need to do. For the fix we would create new functions recirculate_list/resubmit_list/clone3_list functions in v1model.

jnfoster commented 3 years ago

It would be great to discuss this at the September LDWG meeting if Steffen can attend that.

smolkaj commented 3 years ago

Thanks for all the background info and suggestions, I really appreciate it @mbudiu-vmw.

My only worry is that we would need a solution that works with mainline p4c, since forking/using an outdated version of p4c is not a viable long term strategy for us.

@jnfoster, yes, it would be great to discuss this at the LDWG metting. I'm planning to attend more regularly again going forward. Currently the next meeting is planned for Labor Day -- I am assuming it will be reschedule to not be on a holiday?

mihaibudiu commented 3 years ago

I don't see many downsides adding this extension to the standard p4c-bm2-ss and to v1model.

smolkaj commented 3 years ago

For future reference, the current state of affairs is also documented here: https://github.com/p4lang/behavioral-model/blob/main/docs/simple_switch.md#restrictions-on-recirculate-resubmit-and-clone-operations

smolkaj commented 3 years ago

Also for future reference, I think one can use clone + recirculate({}) (+ extra bookkeeping) to implement clone3.

Maybe there is a nice theorem to be proven here.

mihaibudiu commented 3 years ago

https://github.com/p4lang/p4c/pull/2902 is an attempt to address this issue again. Please comment on the PR if you find it acceptable. Perhaps we should not change the signature of existing functions in v1model, but add new ones.

jafingerhut commented 2 years ago

2902 was merged in on 2021-Dec-06, after being reviewed by Steffan Smolka and others interested in this issue. I am closing this now, but feel free to reopen it, or open a new issue, if you believe there are any aspects of it that have not been addressed.