p4lang / p4-spec


P4_16 Table annotation for 'guaranteed size' #667

Closed jafingerhut closed 1 year ago

jafingerhut commented 6 years ago

Creating this issue after thinking about the problem this morning, and wondering whether others would be interested in this, too.

There are multiple reasons why high speed P4 table implementations in hardware might not have easily predictable capacities, e.g. because they are hash tables and you can get data-dependent collisions, or because they are TCAM hardware and you have 'range' fields that cause one added control plane entry to turn into multiple entries in the hardware, as described a little bit more here: https://github.com/p4lang/p4-spec/blob/master/p4-16/spec/docs/p4-table-and-parser-value-set-sizes.md
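
For concreteness, here is the kind of table I mean -- a minimal P4_16 fragment, with invented header and action names, not a complete program:

```p4
// Hypothetical ACL table; 'range' is an architecture-supplied match_kind
// (e.g. in v1model). A single control-plane entry whose dst_port range is
// not nicely aligned may be expanded into multiple TCAM entries, so the
// usable capacity depends on the specific entries actually installed.
table acl {
    key = {
        hdr.ipv4.dst_addr : ternary;
        hdr.tcp.dst_port  : range;
    }
    actions = { permit; deny; NoAction; }
    size = 1024;
    default_action = NoAction();
}
```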

There are important practical cases where we can guarantee the capacity of a table, even in the highest performance hardware:

Why might calling out such "guaranteed size" tables in P4 source code be useful? Because in some cases it may simplify the development of controller software if the developer knows that for table X, adds to table X are guaranteed to succeed up to the point of adding N arbitrary entries.

("guaranteed to succeed" here means "will never fail because of reasons of insufficient table capacity, hash collisions, etc." Yes, there can be failures to add due to very rare hardware failure events, but then the device would typically be taken off-line as failed. yes, there can be failures due to failure to communicate between the controller and the device across a network, but that is a fundamental distributed systems issue. This issue is independent of such failure reasons.)

Having such an annotation in the source code would be needed by a P4 compiler, in order to ensure that the correct type of hardware resources were allocated for the table, or to fail the compilation if such resources were not available.

mihaibudiu commented 6 years ago

If tables have an "implementation" property it should probably be used to specify all desired features of the implementation. No need for new annotations.

ghost commented 6 years ago

What are the current semantics of the size keyword? In our programs, we have so far used it to mean:

"From the controller's perspective, size specifies the maximum number of entries that may be safely installed in the table (i.e. may be installed without any errors due to resource exhaustion [1]). From the P4 compiler's perspective, it is the minimum number of entries that the target hardware must provide."

Note that this is a one-way implication, i.e. being above the limit does not imply that installation of an entry must fail.

[1] it's ok for there to be some theoretical possibility of failure, as long as it is extremely unlikely. This is similar to how, in many applications, it's ok to assume that two documents with the same sha256 hash are actually equal.

mihaibudiu commented 6 years ago

See this: https://github.com/p4lang/p4-spec/blob/master/p4-16/spec/docs/p4-table-and-parser-value-set-sizes.md

ghost commented 6 years ago

The P4 spec mentions the size keyword, but does not define it: "programmers may be required to define a size table property, which can be used by the compiler back-end to allocate storage resources."

What does p4c do when compiling to Tofino? Does it understand the size keyword as the implementation-dependent size of a TCAM, SRAM, etc.?

@mbudiu-vmw can you explain how the doc answers my question? I read over it, but didn't find an answer.

jnfoster commented 6 years ago

Konne,

The latest (draft) version of the spec contains more detail

If a table has a size value specified for it with value N, it is recommended that a compiler should choose a data plane implementation that is capable of storing N table entries. This does not guarantee that an arbitrary set of N entries can always be inserted in such a table, only that there is some set of N entries that can be inserted. For example, attempts to add some combinations of N entries may fail because the compiler selected a hash table with O(1) guaranteed search time.

as well as a link to Andy's note. See here: https://p4.org/p4-spec/docs/P4-16-v1.1.0-draft.html#sec-size-table-property

To answer your question, the semantics of size doesn't give the crisp universal guarantees in either direction that you want. That seems inevitable given the characteristics of the underlying hardware, unless the compiler were extremely conservative in reserving extra space for the pessimistic case.

-N


jafingerhut commented 6 years ago

Konstantin, I think that the assumption you are making may not be correct for some P4 implementations. I suspect the truth may be that for some tables, specifying size=4096 means you are fairly unlikely to be able to add 4096 arbitrary entries to the table, even for implementations that follow the guidelines of the P4_16 spec to the letter.

If you want a table with all exact match fields, and those fields total say 50 bits, and you specify size=4096 entries, there are two ways I know of to guarantee a capacity of 4096 arbitrary entries:

BCAMs that are not also TCAMs in hardware are fairly unusual in networking ASICs, so the first choice would be the most common one.

Another common way to implement such a table is as a hash table with O(1) guaranteed search time, i.e. no software linked list of arbitrary length to resolve collisions. If you have a lot of exact match tables with significant size, these are lower power and area than TCAM hardware, and preferred for those reasons. They do not have guaranteed capacity, however.

So what can a P4 compiler do? When it sees a P4 table, it must choose some hardware implementation for each table. If both kinds of memories are available, it can make a semi-arbitrary choice, or try to optimize for lower power, or whatever options the compiler writer gives you to control its decisions. However, if it picks a hash table implementation, you will not have deterministic capacity, and the hash collision probabilities will be much, much higher than for cryptographic hashes like the SHA file hashes in git.

There are many detailed choices there, e.g. allocate 50% more raw capacity in the hardware hash table entries than the size value requested in the P4 source code. Or have that, plus a small overflow TCAM that handles a few dozen collisions. Those can make significant reductions to the cases where you will fail to add table entries. But I think they will be unlikely to make it go anywhere near as low as 1 in 10^50, unless you pick a size parameter in your source code that is 2 or 3 times larger than what you actually plan to add to the table.
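
(A rough, back-of-the-envelope illustration of that last point, under the simplifying assumption of a single-hash-function table with B buckets holding at most c entries each -- simpler than what real targets do. A given set of N entries fails to fit exactly when some bucket receives more than c of them, and looking at just one fixed bucket already gives a lower bound:

$$\Pr[\text{some add fails}] \;\ge\; \Pr[\mathrm{Bin}(N, 1/B) > c] \;=\; \sum_{k=c+1}^{N} \binom{N}{k} \left(\tfrac{1}{B}\right)^{k} \left(1 - \tfrac{1}{B}\right)^{N-k}.$$

For example, with N = 4096 entries hashed into B = 1024 buckets of c = 6 slots each, i.e. about 50% raw over-provisioning, Bin(N, 1/B) is roughly Poisson(4) and the bound is on the order of 0.1, nowhere near the 1-in-10^50 regime of cryptographic hash collisions. Multi-way hashing and overflow TCAMs, as mentioned above, push this down a lot, but that is the flavor of the argument.)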

Barefoot has perfectly valid business reasons not to discuss details of their products in public forums like this. You should speak to a representative of theirs if you want to know what the Tofino compiler does.

The reason I created this issue is that sometimes an author of control plane software might really, really want a deterministic capacity table, regardless of the power or area cost in the hardware, especially for reasonably small tables, e.g. ones that map one of 4096 12-bit VLAN ids to another 12-bit VLAN id. You don't want adding an entry to a table like that to ever fail, and it is pretty cheap in hardware to guarantee that. It would be good to have a way to say in your P4 source code: hey, for this table, I really don't want the control plane to have to deal with capacity failures when adding table entries.
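
A sketch of such a VLAN translation table (invented names, fragment only): because the exact-match key is just 12 bits wide and size covers the entire key space, a target could implement it as a directly indexed SRAM and never fail an add for capacity reasons.

```p4
// Hypothetical VLAN translation table: 12-bit exact key, 4096 entries,
// i.e. one slot for every possible VID, so capacity can be deterministic.
table vlan_xlate {
    key = {
        hdr.vlan.vid : exact;   // bit<12>
    }
    actions = { set_translated_vid; NoAction; }
    size = 4096;
    default_action = NoAction();
}
```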

If you are thinking every table is like that, always, then you may need to re-think your control plane software plans.

ghost commented 6 years ago

Thanks Nate for the answer, and pointing me to the draft version.

I agree that there are scenarios where it's ok not to have any guarantees on a table's size, but there are also many scenarios where it's very important to have some guarantees. For example, one may have some access control lists which always have a fixed number of entries, and one needs a guarantee that they will always fit into a table. If one can't install all the entries, that would have severe security implications.

So as Andy is suggesting here, I think it's very important that we have at least some way of specifying that a limit should be strict, e.g. via an annotation or an implementation keyword.

I actually think it would make sense for the default behavior to be that size gives you a guarantee on the entries that you can install. As Andy has pointed out, TCAM and SRAM already provide such guarantees, and the conservative allocation for hashes can be acceptable in many cases.

Andy, I didn't mean to ask Barefoot to disclose how they compile to hardware, I was just curious whether the size was implementation dependent or not, and it seems that it actually is.

jafingerhut commented 5 years ago

@konne-google One thought on the specific syntax that could be proposed here: Add a new optional table property in PSA named something like psa_guaranteed_size that has a value of type boolean.

The default would be false, which is the current behavior: the number assigned to size means "maybe you can reach this number of entries, if you pick the right combination of entries".

The new behavior would occur if you assigned it the value true.

By making the value boolean like this, instead of a number, one need not give the same size number twice, or define what it means if the two numbers are different.

jafingerhut commented 5 years ago

I brought up the question at the 2018-Nov-05 P4 language design work group meeting of whether it would be reasonable to add a guaranteed_size table property to the base P4_16 language, and there were no immediate objections to adding it to the P4_16 language spec, rather than the PSA spec. This seems useful to me, since then it could have a name without 'psa' in it, and could be useful for any P4_16 architecture, not only PSA.

vgurevich commented 5 years ago

Just to make things a little more difficult: does the guaranteed size include the default entry or not?

jafingerhut commented 5 years ago

I would propose that a table with size = 1024; and guaranteed_size = true; would be able to hold 1024 'normal' entries, in addition to a default entry. Yes, that might mean that implementations that use a hardware TCAM entry to implement the default entry for ternary tables will need capacity for 1025 TCAM entries. A savvy P4 programmer who knows their targets well can just change it to size = 1023; if they don't need that one more normal entry.
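
For illustration, a sketch of what that could look like in the P4 source (guaranteed_size is only the property being proposed in this issue, not something in the current spec, and the other names are invented):

```p4
// Hypothetical table using the proposed property: 1024 'normal' entries are
// guaranteed to be insertable, in addition to the default entry.
table nexthop {
    key = { meta.nexthop_index : exact; }
    actions = { set_nexthop; NoAction; }
    size = 1024;
    guaranteed_size = true;   // proposed in this issue, not yet standard
    default_action = NoAction();
}
```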

vgurevich commented 5 years ago

@jafingerhut. Depending on the table it might actually result in allocating 1024 entries, but in other cases it might require twice as much, because depending on the HW and on the specific table 1023 might be a more "natural" size. And then if you decree that 1023 is "natural" then indexed tables go out of the window.

jafingerhut commented 5 years ago

If you have a proposal, please share it. I would be very surprised if we can't come up with something useful and reasonable.

vgurevich commented 5 years ago

My proposal is not to do it, at least not in a mandated way

jafingerhut commented 5 years ago

Do you know how difficult it can be to write control plane software for a VLAN translation table if you cannot guarantee it can handle all 4096 VLANs? And there are a fair number of other cases I could come up with if I tried. I think something like this has real world utility. If you have never had someone ask you for it before, I can only guess it is because they haven't thought of it, and/or they have gotten lucky that the tables they wanted this behavior for just happened to get it from the compiler by accident.

jnfoster commented 5 years ago

Another observation: some people are using P4 as a hardware description language. In that context, I can imagine guaranteed_size would not be fraught with as many issues as @vgurevich is imagining.

vgurevich commented 5 years ago

@jafingerhut , I am totally sympathetic to your viewpoint, although VLAN translation is probably not a very good example, since I do know for a fact that quite a number of very respectable data plane programs do use regular hash tables specifically for VLAN translation :)

And it is not difficult to write the control plane software for this case either, as long as the key is 12 bits -- whatever the table size/implementation, a good control plane should verify the return code after each entry insertion and must be prepared to deal with any errors that might occur. What is difficult is having discussions with certain types of customers who would actually attempt to populate all 4096 entries despite the fact that VID 0 and 0xFFF are not supposed to even be used there :) Smarter vendors usually declare their table capacity to be "up to N" entries to avoid this kind of litigation, and that's the semantics of "size".

Similarly, I'd prefer not to have conversations explaining why a certain table was allocated to use 2x blocks or why another table has n-1 capacity, etc.

In general, I am worried about the desire to mandate too much -- while the committee can certainly do that, the truth is: if real targets can't do it very well, they will simply either not support the feature or support it with caveats making it unpopular/unusable/very expensive and thus the whole effort will be wasted anyway.

We can also leave the discussion of whether the default entry must or must not count against the total number of entries (it might), but again, that will basically render the strict language kind of useless.

jafingerhut commented 5 years ago

I should be a little more careful in my examples here. You said: "And it is not difficult to write the control plane software for this case either as long as the key is 12 bit -- whatever the table size/implementation a good control plane should verify the return code after each entry insertion and must be prepared to deal with any errors that might occur."

It is easy to write control plane software to handle such add-with-possibility-of-failure for a single table, I agree. If your control plane software wants to make a coordinated change across 5 tables, and succeed on all, or fail on all, then it can be very very convenient if you can predict that tables X and Y of those 5 will never have capacity failures. Yes, I know there are situations where multiple coordinated table updates can be made easy to roll back on capacity failures, too, but some are not so easy to do that for. We may end up disagreeing on whether all control plane software can always be written to avoid that issue (I think sometimes it might be unavoidable).

If a device fails to add entries because of "device has hardware faults" errors, that is a completely different category of problem, and much much rarer, and one you typically handle via taking the entire device off line, not trying to limp along with whatever capacity the device happens to give you.

Anyway, I will tinker around with a proposal for this, and I am still naively optimistic I might be able to explain it in a way that might address your concerns :-)

vgurevich commented 5 years ago

@jafingerhut, the proposed attribute is not a replacement for a proper transaction mechanism in the driver/APIs either. Well designed, professional quality APIs/drivers do provide a transaction mechanism to ensure "all-or-nothing" behavior specifically for this kind of situation. It doesn't make it easier to implement such a driver either, since the mechanism is typically designed to work with any table(s) anyway.

Again, I do know of several implementations that do this right.

jnfoster commented 1 year ago

In the interest of tidying up our list of issues, I'm going to mark this as "stalled."