Open AltraMayor opened 3 years ago
The BPF program must be written in such a way that it prioritizes the longest matching filter. See motivation for this in issue #462.
Do ntuple filters support longest matching filters? Or do they require the use of priority to do so? This is relevant information that must be identified before doing the integration of them with a filter subsystem that will support the longest matching rules. If ntuple filters don't match the longest rules, this new subsystem could solve the problem by calculating the priorities of the rules. The priorities can be calculated by having a binary tree as described above that includes the ntuple filter rules, and the BPF rules. The BPF rules are needed in this tree to make sure that there's no conflict between ntuple and BPF rules. For example, if a ntuple rule only matches field F1, and a BPF rule matches fields F1 and F2, the ntuple rule must be implemented as a BPF rule.
The new solution should also fold in Ethernet Type filters. This will remove awareness of any particular type of filter from the GK and GT blocks.
Having struct rte_flow_item
as the input for the pattern to be matched will make it easier to use this BPF library since it's the same interface for DPDK's generic flow interface. That is, if a flow cannot be implemented in hardware, falling back would be a consecutive call to this library with the same parameters.
Solving this issue makes issue #466 a low-hanging fruit since adding calls to DPDK's flow interface would require little.
When a NIC doesn't support filters that a functional block needs, the block falls back to register the needed filters through an internal API that essentially uses ACL to do the matching. However, the ACL library doesn't account for variable-length headers such as Ethernet, IP, and TCP. Thus, the internal API requires blocks to register ACL filters and a function to deal with variable headers. This solution duplicates tests and complicates the code of the blocks.
A solution is for the internal filter API to receive the needed filters and classify the packets with the help of a BPF program created on the fly. For example, IPv4 filters would be the following tuples (destination IP address, next protocol, source port number, destination port number). Each field of this tuple should be optional. This internal filter API would organize all the registered tuples in a binary tree as follows:
The construction above allows one to navigate the binary tree and easily generate BPF bytecode to classify packets: following right edges lead to sequences of
if
statements, while following left edges lead to what goes inside of the previousif
statements. Optional fields can be easily dealt with by placing the corresponding tuples at the end of the list of right edges of a given level.The problem of variable headers can be solved by making the running environment of the generated BPF program similar to the environment used for BPF programs used to enforce policy decisions: it includes the sizes of the headers of the packets. The sizes of the headers are already available when the filter API receives the packets for classification.
In order to support different node types to support IPv4 and IPv6 with the same code of the library, the library must support the nodes being defined with functions like
bool is_field_option(int field_number)
. The API may even generate a single BPF program to classify IPv4 and IPv6 packets all at once if this is convenient by combining the output of IPv4 and IPv6 tuples. Another approach is to identify the fields of the nodes as the tuple (header level, field bit offset, field bit length, field bit mask in network order, field value in network order). For IPv4, the following tuple tests the version of the header (L3, 0, 4, 0xF, 4).Given that the proposed solution would have a broad impact on the code and may interact with DPDK's Generic flow API (rte_flow), this issue should only be addressed after issue #294 is closed.
Although the new API itself would have a complicated code, it would simplify the code from the perspective of the blocks, avoid duplicate tests on packets to be classified, and likely be faster overall.