roop-nvda commented 2 years ago

Introduction

There are several use cases for stateful tables in the modern data center networks. These require a table to add an entry when missed, only when certain conditions (such as start of connection) are true and remove an entry when some other conditions indicate that the entry is not needed any more (e.g. end of connection). The PNA spec [1] has defined add_on_miss tables in section 8.1 to allow P4 tables to add entries.

Problem

One commonly cited use case for stateful tables is TCP connection tracking, where we would like an entry to be added when an outgoing TCP SYN is seen, and the entry removed when a TCP FIN is seen. There are several proposals [2][3] that use the PNA defined add_on_miss tables to meet this use case. They use conditional clauses in the match table actions to achieve the selective addition and removal of entries from the add_on_miss table.

However, section 13.1 of the p4_16 language specification [4] mentions portability considerations for switch statements and conditionals in actions – “The body of an action consists of a sequence of statements and declarations. No switch statements are allowed within an action—the grammar permits them, but a semantic check should reject them. Some targets may impose additional restrictions on action bodies—e.g., only allowing straight-line code, with no conditional statements or expressions.”.

In our opinion this restriction is indeed valid for several real world target architectures. As such, we would like to explore further ways to define add_on_miss tables, so that the use cases like the TCP connection tracking could be expressed clearly by the P4 programmer and implemented efficiently in a wide variety of network adapters.

Solution

One of the options to solve this problem is to expand the functionality of the proposed stateful tables externs in PNA.

Add on miss if

For network adapters that do not support conditionals in match actions, an approach that we can consider is adding a check to the extern to add entry, such that a comparison of one or more packet fields is performed inside the extern before inserting the entry in the table. In other words – add entry only if the field matches the value.

extern bool add_entry_if<D, T>(in D data, in D value, in string action_name, in T action_params);

When data == value is true, the return value should be true if the entry was added, false otherwise. When data == value is false, the return value should be false.

For example, to add entry upon receiving TCP SYN, an implementation could call add_entry_if and set the data argument to be the TCP SYN flag and value to be true. Another implementation could use a prior match action table to set a program boolean variable, e.g., do_add. The program can then call the proposed method above as follows: add_entry_if(do_add, true, ..) to add an entry upon miss to an add_on_miss table.

To remove an entry in a timely manner (such as reception of a middle of the connection packet vs. FIN packet in TCP connection tracking) we could set the value of expiry time of the corresponding matched entry to a value only if a field matches the given value.

extern void set_entry_expire_time_if<D>(in D data, in D value, in ExpireTimeProfileId_t expire_time_profile_id);

A more flexible comparison is possible if we use a tuple of fields and values as the first two arguments above.

References

[1] PNA -v 0.5.0.pdf (p 4.org ) version 0.5.0 [2] pna /pna -example -tcp -connection -tracking .p 4 at main · p 4 lang /pna · GitHub [3] DASH /sirius _conntrack .p 4 at main · Azure /DASH · GitHub [4] P 4~16~ Language Specification version 1.2.2.``

jfingerh commented 2 years ago

@roop-nvda Thanks for the comment.

Your comment suggests adding an extern function with signature extern bool add_entry_if<D, T>(in D data, in D value, in string action_name, in T action_params); and the behavior if (data == value) { add_entry(action_name, action_params); }

My PR currently suggests adding an extern function with signature (the text below is copied and pasted from lines added by the current PR as it was after the 1st commit, which I mention in case more commits are added to this PR later):

extern AddEntryErrorStatus_t add_entry_if<T>(
    in bool do_add_entry,
    string action_name,
    in T action_params,
    in ExpireTimeProfileId_t expire_time_profile_id);

with behavior

    (expr) ? add_entry(action_name, action_params, expire_time_profile_id)
            : ADD_ENTRY_NOT_DONE;

As discussed in last week's meeting, it is a bit more general to have a single new bool parameter rather than two type T parameters, where the condition must be data == value, because then an implementation might be able to support either the condition data == value, or other conditions. The complexity of the conditions a target implements might differ from target to target, and while this is not ideal for P4 developers if this varies across target devices, it does give more generality to the supported expressions.

roop-nvda commented 2 years ago

Thanks for your remarks @jfingerh.

The first argument that you propose for add_entry_if is a Boolean. I had read it to mean a p4 elementary type Boolean. From your comment above however, I think you mean the first argument can be an expression that evaluates to a Boolean. I may me missing something, but I am not seeing Boolean cast-able to an expression in the P4 language spec. Can you point me to the relevant section?

Additionally, one of the original motivations of proposing add_entry_if, was to support minimal conditional adds (by means of an extern) in time and compute constrained targets, where a single field could be compared easily, but not much more. Passing an expression, that may in theory be made up of several other expressions, while being indeed more general, is harder for such targets to implement in our opinion. As such we think that the Data filed as first argument is flexible to convey some important conditions while being simple enough to evaluate in a miss action. For example, in the p4 program you have cited, the metadata header can be set by previous tables, and the add on miss table's miss action could capture the condition as -- add_entry_if(meta.other_table_result, true ..).

jfingerh commented 2 years ago

For example, these are all legal call to add_entry_if according to the P4_16 language spec, but some targets could reject some of these for target-specific reasons:

    add_entry_if(true, ...);
    add_entry_if(do_add_on_miss, ...);  // do_add_on_miss is a local variable with type bool
    add_entry_if(hdr.tcp.flags[4:4] == 1, ...);    // #3 Roop believes that BMv2 back end might have difficulty with this.
    add_entry_if((hdr.tcp.flags[5:5] + hdr.tcp.flags[4:4]) == 1, ...);
    add_entry_if(count < 50, ...);

AI Andy: Create a little test program for BMv2 back end that exercises all 5 of these and see whether it handles it, and if so, how. I believe it does. Will use one of the update_checksum or verify_checksum extern functions in BMv2 v1model architecture that already take a bool parameter as first argument, similar to add_entry_if. Add test program and results or link to a comment on this PR.

jfingerh commented 2 years ago

Here is a p4c test program that has been used to test p4c for several years, and runs on the BMv2 simple_switch/simple_switch_grpc software switch: https://github.com/p4lang/p4c/blob/main/testdata/p4_16_samples/v1model-special-ops-bmv2.p4#L400

The linked call to verify_checksum has a first parameter which is this expression: hdr.ipv4.isValid() && hdr.ipv4.ihl == 5

If you look in the header file v1model.p4, you can see that the definition of verify_checksum's signature is as follows: https://github.com/p4lang/p4c/blob/main/p4include/v1model.p4#L483

extern void verify_checksum<T, O>(in bool condition, in T data, in O checksum, HashAlgorithm algo);

In fact, I believe that having a first parameter with type bool that conditionalizes whether the extern function does something, versus doing nothing, might have originated with verify_checksum and the similar update_checksum extern functions in the v1model architecture.

I would guess that the original motivating reason for having this parameter to verify_checksum and update_checksum was for a similar reason: the desire not to make p4c support recognizing patterns like if (cond1) verify_checksum(...);. However, that implementation slightly precedes my involvement with P4, so I might be wrong on the reason for that implementation choice.

If you compile that test program using a command like p4c --target bmv2 --arch v1model v1model-special-ops-bmv2.p4 it generates a BMv2 JSON file. When I compile it with a recent version of p4c, this section of the BMv2 JSON file defining a BMv2-JSON-checksum-thingy named cksum_0 appears. If you look at the value of the key "if_cond" you can see the BMv2 JSON representation of the condition hdr.ipv4.isValid() && hdr.ipv4.ihl == 5:

    {
      "name" : "cksum_0",
      "id" : 1,
      "source_info" : {
        "filename" : "v1model-special-ops-bmv2.p4",
        "line" : 400,
        "column" : 8,
        "source_fragment" : "verify_checksum(hdr.ipv4.isValid() && hdr.ipv4.ihl == 5, ..."
      },
      "target" : ["ipv4", "hdrChecksum"],
      "type" : "generic",
      "calculation" : "calc_0",
      "verify" : true,
      "update" : false,
      "if_cond" : {
        "type" : "expression",
        "value" : {
          "op" : "and",
          "left" : {
            "type" : "expression",
            "value" : {
              "op" : "d2b",
              "left" : null,
              "right" : {
                "type" : "field",
                "value" : ["ipv4", "$valid$"]
              }
            }
          },
          "right" : {
            "type" : "expression",
            "value" : {
              "op" : "==",
              "left" : {
                "type" : "field",
                "value" : ["ipv4", "ihl"]
              },
              "right" : {
                "type" : "hexstr",
                "value" : "0x05"
              }
            }
          }
        }
      }
    }

roop-nvda commented 2 years ago

I am not sure the verify_checksum example captures the limitations of the table miss actions we are trying to address with add_on_miss_if. My understanding is that verify_checksum is expected to be called as part of parser/deparser blocks, and parser blocks already expect complex conditional expressions.

In the PSA spec, table 5, and in the PNA spec, table 2, specify that the checksum externs can only be called from the parser and deparser blocks.

Some thinking behind this was kindly captured by @jfingerh here https://github.com/p4lang/p4-spec/issues/360:

All (or at least most) checking of header checksums is expected to be performed in the parsers, setting some kind of result for use in the following Ingress or Egress control block, such as the parser error. The Ingress and Egress control blocks are expected to check for parser errors, if desired, and then also perform the desired actions on packets that experienced errors during parsing.

In v1model.p4, the checksum externs are only callable from checksum controls. https://github.com/p4lang/p4c/blob/6d16a1867e5e3252a6524e2d584bd2f861126109/p4include/v1model.p4#L467 The checksum controls are defined as before and after ingress and egress pipelines of the v1switch. https://github.com/p4lang/p4c/blob/6d16a1867e5e3252a6524e2d584bd2f861126109/p4include/v1model.p4#L760

I tried inserting a verify_checksum call in an action block, and it does not compile with target bmv2; but the same call compiles fine in the parser block.

jafingerhut commented 2 years ago

@roop-nvda Check out the attached P4 program, which invokes this action from a table:

    action update_ingress_pkt_count () {
        per_inport_protocol_counter.count(
            ((IngressProtoCounterIndex_t) stdmeta.ingress_port << LOG2_NUM_PROTO_IDS) +
            (IngressProtoCounterIndex_t) proto_id);
    }

In the BMv2 JSON file, it appears that this expression is assigned to a temporary variable first in the representation of the action, and then that temporary variable is passed as the parameter to the count method. That introduction of a temporary variable might be simply an implementation detail of the p4c BMv2 back end -- I am not sure. indexed-counter-update-example.p4.txt

jfingerh commented 1 year ago

@roop-nvda Are the changes in this PR still desired? I will add it to the list of topics for live discussion in PNA meetings.

roop-nvda commented 1 year ago

The PR looks good to me. Thanks @jfingerh fingerh!

shirshyad commented 1 year ago

Other than modifying description as stated in https://github.com/p4lang/pna/pull/43#discussion_r1004575822 the PR LGTM

Thank you answering Qs and addressing comments. PR is approved now assuming some extern description clarification will be done before merge.

jafingerhut commented 1 year ago

@shirshyad I added another commit to this PR that attempts to address your comments for clarifying what the restrictions on the action added by an add_entry() call are allowed to be. I am going ahead and merging this since it has a couple of approvals, but please do consider reading my modified version of those comments for add_entry to see if it addresses your concerns, and file an issue if it does not.

p4lang / pna

Add proposed extern function add_entry_if() #43

Introduction

Problem

Solution

Add on miss if

References