p4lang / p4runtime

Specification documents for the P4Runtime control-plane API
Apache License 2.0
144 stars 88 forks source link

Reject multiple updates for the same key, if they are in the same batch #333

Open konne88 opened 3 years ago

konne88 commented 3 years ago

When the switch receives a batch of updates, the P4RT spec says that the switch must behave as if it executed all the updates in a particular order (which order is up to the switch). Therefore, if we have the following batch:

INSERT l3 table entry with key 192.168.1.1 MODIFY l3 table entry with key 192.168.1.1 DELETE l3 table entry with key 192.168.1.1

The switch has to return one of the following status messages: 1,2,3: OK, OK, OK 1,3,2: OK, OK, NOT_FOUND 2,1,3: NOT_FOUND, OK, OK 2,3,1: NOT_FOUND, NOT_FOUND, OK 3,1,2: NOT_FOUND, OK, OK 3,2,1: NOT_FOUND,NOT_FOUND,OK

Note that while some ordering above succeed completely or partially, sending multiple updates for the same key in one batch is always a bug in the controller, because the execution order of the updates always matters for the end result, but is also completely out of control of the controller.

If the controller is "lucky", the switch may execute the operations in the order required by the controller, but this is dangerous, because it is just hiding the bug, and any change to the switch's implementation details can resurface the bug.

To get around this problem, we propose that the switch should always reject all updates which operate on the same key. So the status for the example above would be: INVALID_ARGUMENT, INVALID_ARGUMENT, INVALID_ARGUMENT

We propose to only reject the updates that touch the same keys, so the following batch:

INSERT l3 table entry with key 192.168.1.1 INSERT l3 table entry with key 10.0.0.0 MODIFY l3 table entry with key 192.168.1.1 INSERT l3 table entry with key 10.0.0.1 DELETE l3 table entry with key 192.168.1.1

would return the result: INVALID_ARGUMENT, OK, INVALID_ARGUMENT, OK, INVALID_ARGUMENT

antoninbas commented 3 years ago

I'm fine with this change, but it does add some extra burden to the P4Runtime server, so would like to hear from others such as @bocon13, @ccascone

I assume that this doesn't just apply to table entries, but to all entities for which we have a notion of key (e.g. action profile members)?

stefanheule commented 3 years ago

Actually, this makes it easier to write a P4Runtime server in many cases. I have observed the implementation of two completely independent P4RT servers (in a real switch, not a toy example), and in both cases the current spec was increadibly hard to implement correctly for the developers, but this newly proposed scheme is easy.

Here's why: Most switches have a bunch of layers that the table entries pass through, and they are identified by their key on that path. That means it's hard/impossible to send multiple updates for the same key, so now you have to implement custom logic to not send these in parallel (because normally you want to send your updates in parallel for efficiency reasons). If you can just reject them outright at the top layer (as proposed here), then you can do that filtering quickly and easily, and then send everything in parallel.

ccascone commented 3 years ago

The proposed change makes sense to me, and it shouldn't break any control plane use case. In ONOS, we already avoid sending updates for the same key in the same batch. I would agree with @stefanheule that this change simplifies the server implementation, but I defer to @pudelkoM and @bocon13 for a final comment.

pudelkoM commented 3 years ago

The proposal and reasoning seem sensible.

I'm not sure to what degree Stratum currently is affected by this issue, as we process request updates in sequence and leave parallelism up to the underlying SDE/SDK.

I'm fine with this change, but it does add some extra burden to the P4Runtime server

Pre-rejecting updates would require some extra code and state keeping for us, but probably nothing major.

I guess there could be some pitfalls around comparing non-normalized bytestrings in keys, e.g., if the controller is insane and sends something like "\x00\x01" vs "\x01" for the same table key field.

bocon13 commented 3 years ago

I don't have an issues with this, but agree with @pudelkoM that it would require some additional development in Stratum in the common P4RT server implementation.

Regarding the bytestring issue that you describe, the spec mandates that bytestrings use the shortest string that encodes the value, which means leading 0's are invalid. We don't catch these today in Strartum, but that would be another form of validation that we should probably do (and return the appropriate errors to the client).

https://p4.org/p4runtime/spec/master/P4Runtime-Spec.html#sec-bytestrings

pudelkoM commented 3 years ago

which means leading 0's are invalid

In my understanding RW symmetry is strongly encouraged, but not required. See Table 4. Examples of Valid Bytestring Encoding:

P4 type | Integer value | P4Runtime binary string | Read-write symmetry bit<12> | 99 (0x63) | \x00\x00\x63 | no

As the preceding examples illustrate, a P4Runtime server must accept a wide assortment of possible binary string encodings for the same integer value.

bocon13 commented 3 years ago

@pudelkoM good point. I guess what this means is that the P4RT server implementation would be required to:

  1. Pre-process entries -- transforming them into canonical bytestrings and then comparing keys to filter for duplicates
  2. Send a shorter list to the underlying SDK
  3. Integrate results from pre-processing failures and the underlying SDK responses

We don't do 1 & 3 in any form today, but I don't think it would be a significant burden to implement. As Max said, Stratum applies writes synchronously and sequentially, so we could keep track of this as local variables in the Write() implementation.

I agree that this behavior is undefined today and explicitly disallowing this behavior seems reasonable (and doesn't really impose much of a burden).

konne88 commented 3 years ago

In today's meeting, we had the following realization:

Instead of allowing the switch to execute batches in any order, we can also disallow the controller from sending batches where order matters.

The second is appealing because:

From P4RT's perspective there are only two cases where order matters:

There are also ordering cases which P4RT doesn't know about: