github / glb-director

GitHub Load Balancer Director and supporting tooling.

Add port range support in XDP director #135

Closed pavantc closed 2 years ago

pavantc commented 2 years ago

Add port range support in XDP director.

Enables the port range tests for the XDP director too. Skips the IP range tests, since IP ranges are not yet supported in the XDP director.

pavantc commented 2 years ago

Tests pass, though it would have been nicer to report a skipped status for the individual tests, since the ip_range tests are in fact skipped for XDP and only show up as ok.

DPDK:

Running scapy packet tests against glb-director
test_cli_tool.TestGLBBinaryCLI.test_generate_configs ... ok
test_director_classify_ranges_v4.TestGLBClassifyRangesV4.test_01_ip_range_match_v4 ... ......ok
test_director_classify_ranges_v4.TestGLBClassifyRangesV4.test_02_ip_range_no_match_v4 ... ....ok
test_director_classify_ranges_v4.TestGLBClassifyRangesV4.test_03_port_range_match_v4 ... .......ok
test_director_classify_ranges_v4.TestGLBClassifyRangesV4.test_04_port_range_no_match_v4 ... ....ok
test_director_classify_ranges_v6.TestGLBClassifyRangesV6.test_01_ip_range_match_v6 ... ......ok
test_director_classify_ranges_v6.TestGLBClassifyRangesV6.test_02_ip_range_no_match_v6 ... ....ok
test_director_classify_ranges_v6.TestGLBClassifyRangesV6.test_03_port_range_match_v6 ... .......ok
test_director_classify_ranges_v6.TestGLBClassifyRangesV6.test_04_port_range_no_match_v6 ... ....ok
test_director_classify_v4.TestGLBClassifyV4.test_01_route_classified_v4 ... .ok
test_director_classify_v4.TestGLBClassifyV4.test_02_icmp_fragmentation_required ... .ok
test_director_classify_v4.TestGLBClassifyV4.test_03_icmp_echo_request ... .ok
test_director_classify_v4.TestGLBClassifyV4.test_04_reload_and_unhealthy_primary ... .ok
test_director_classify_v6.TestGLBClassifyV6.test_01_route_classified_v6 ... .ok
test_director_classify_v6.TestGLBClassifyV6.test_02_icmp_fragmentation_required ... .ok
test_director_classify_v6.TestGLBClassifyV6.test_03_icmp_echo_request ... .ok
test_director_hash_fields.TestGLBHashFieldsMigration.test_01_route_classified_v4 ... ...ok
test_director_hash_fields.TestGLBHashFieldsSourceAddr.test_01_route_classified_v4 ... ...ok
test_director_hash_fields.TestGLBHashFieldsSourceAddrAndPort.test_01_route_classified_v4 ... ...ok
test_director_kni.TestGLBKNI.test_01_nic_rx_to_kni ... .ok
test_director_kni.TestGLBKNI.test_02_kni_to_nic_tx ... .ok
GLBRendezvousTable correctly orders hosts (index=0x0000) ... ok
GLBRendezvousTable correctly orders hosts (index=0xbb44) ... ok
GLBRendezvousTable correctly orders hosts (index=0xffff) ... ok
GLBRendezvousTable correctly calculates valid row seeds ... ok

XDP:

Running scapy packet tests against glb-director XDP
test_cli_tool.TestGLBBinaryCLI.test_generate_configs ... ok
test_director_classify_ranges_v4.TestGLBClassifyRangesV4.test_01_ip_range_match_v4 ... ok
test_director_classify_ranges_v4.TestGLBClassifyRangesV4.test_02_ip_range_no_match_v4 ... ok
test_director_classify_ranges_v4.TestGLBClassifyRangesV4.test_03_port_range_match_v4 ... .......ok
test_director_classify_ranges_v4.TestGLBClassifyRangesV4.test_04_port_range_no_match_v4 ... ....ok
test_director_classify_v4.TestGLBClassifyV4.test_01_route_classified_v4 ... .ok
test_director_classify_v4.TestGLBClassifyV4.test_02_icmp_fragmentation_required ... .ok
test_director_classify_v4.TestGLBClassifyV4.test_03_icmp_echo_request ... .ok
test_director_classify_v4.TestGLBClassifyV4.test_04_reload_and_unhealthy_primary ... .ok
test_director_classify_v6.TestGLBClassifyV6.test_01_route_classified_v6 ... .ok
test_director_classify_v6.TestGLBClassifyV6.test_02_icmp_fragmentation_required ... .ok
test_director_classify_v6.TestGLBClassifyV6.test_03_icmp_echo_request ... .ok
test_director_hash_fields.TestGLBHashFieldsMigration.test_01_route_classified_v4 ... ...ok
test_director_hash_fields.TestGLBHashFieldsSourceAddr.test_01_route_classified_v4 ... ...ok
test_director_hash_fields.TestGLBHashFieldsSourceAddrAndPort.test_01_route_classified_v4 ... ...ok
test_director_metrics.TestGLBDirectorMetrics.test_01_route_classified_increments_metrics ... .ok
GLBRendezvousTable correctly orders hosts (index=0x0000) ... ok
GLBRendezvousTable correctly orders hosts (index=0xbb44) ... ok
GLBRendezvousTable correctly orders hosts (index=0xffff) ... ok
GLBRendezvousTable correctly calculates valid row seeds ... ok

pavantc commented 2 years ago

There - skipped status:

XDP

Running scapy packet tests against glb-director XDP
test_cli_tool.TestGLBBinaryCLI.test_generate_configs ... ok
test_director_classify_ranges_v4.TestGLBClassifyRangesV4.test_01_ip_range_match_v4 ... SKIP: IP ranges not supported in XDP
test_director_classify_ranges_v4.TestGLBClassifyRangesV4.test_02_ip_range_no_match_v4 ... SKIP: IP ranges not supported in XDP
test_director_classify_ranges_v4.TestGLBClassifyRangesV4.test_03_port_range_match_v4 ... .......ok
test_director_classify_ranges_v4.TestGLBClassifyRangesV4.test_04_port_range_no_match_v4 ... ....ok
test_director_classify_v4.TestGLBClassifyV4.test_01_route_classified_v4 ... .ok
test_director_classify_v4.TestGLBClassifyV4.test_02_icmp_fragmentation_required ... .ok
test_director_classify_v4.TestGLBClassifyV4.test_03_icmp_echo_request ... .ok
test_director_classify_v4.TestGLBClassifyV4.test_04_reload_and_unhealthy_primary ... .ok
test_director_classify_v6.TestGLBClassifyV6.test_01_route_classified_v6 ... .ok
test_director_classify_v6.TestGLBClassifyV6.test_02_icmp_fragmentation_required ... .ok
test_director_classify_v6.TestGLBClassifyV6.test_03_icmp_echo_request ... .ok
test_director_hash_fields.TestGLBHashFieldsMigration.test_01_route_classified_v4 ... ...ok
test_director_hash_fields.TestGLBHashFieldsSourceAddr.test_01_route_classified_v4 ... ...ok
test_director_hash_fields.TestGLBHashFieldsSourceAddrAndPort.test_01_route_classified_v4 ... ...ok
test_director_metrics.TestGLBDirectorMetrics.test_01_route_classified_increments_metrics ... .ok
GLBRendezvousTable correctly orders hosts (index=0x0000) ... ok
GLBRendezvousTable correctly orders hosts (index=0xbb44) ... ok
GLBRendezvousTable correctly orders hosts (index=0xffff) ... ok
GLBRendezvousTable correctly calculates valid row seeds ... ok

DPDK:

Running scapy packet tests against glb-director
test_cli_tool.TestGLBBinaryCLI.test_generate_configs ... ok
test_director_classify_ranges_v4.TestGLBClassifyRangesV4.test_01_ip_range_match_v4 ... ......ok
test_director_classify_ranges_v4.TestGLBClassifyRangesV4.test_02_ip_range_no_match_v4 ... ....ok
test_director_classify_ranges_v4.TestGLBClassifyRangesV4.test_03_port_range_match_v4 ... .......ok
test_director_classify_ranges_v4.TestGLBClassifyRangesV4.test_04_port_range_no_match_v4 ... ....ok
test_director_classify_ranges_v6.TestGLBClassifyRangesV6.test_01_ip_range_match_v6 ... ......ok
test_director_classify_ranges_v6.TestGLBClassifyRangesV6.test_02_ip_range_no_match_v6 ... ....ok
test_director_classify_ranges_v6.TestGLBClassifyRangesV6.test_03_port_range_match_v6 ... .......ok
test_director_classify_ranges_v6.TestGLBClassifyRangesV6.test_04_port_range_no_match_v6 ... ....ok
test_director_classify_v4.TestGLBClassifyV4.test_01_route_classified_v4 ... .ok
test_director_classify_v4.TestGLBClassifyV4.test_02_icmp_fragmentation_required ... .ok
test_director_classify_v4.TestGLBClassifyV4.test_03_icmp_echo_request ... .ok
test_director_classify_v4.TestGLBClassifyV4.test_04_reload_and_unhealthy_primary ... .ok
test_director_classify_v6.TestGLBClassifyV6.test_01_route_classified_v6 ... .ok
test_director_classify_v6.TestGLBClassifyV6.test_02_icmp_fragmentation_required ... .ok
test_director_classify_v6.TestGLBClassifyV6.test_03_icmp_echo_request ... .ok
test_director_hash_fields.TestGLBHashFieldsMigration.test_01_route_classified_v4 ... ...ok
test_director_hash_fields.TestGLBHashFieldsSourceAddr.test_01_route_classified_v4 ... ...ok
test_director_hash_fields.TestGLBHashFieldsSourceAddrAndPort.test_01_route_classified_v4 ... ...ok
test_director_kni.TestGLBKNI.test_01_nic_rx_to_kni ... .ok
test_director_kni.TestGLBKNI.test_02_kni_to_nic_tx ... .ok
GLBRendezvousTable correctly orders hosts (index=0x0000) ... ok
GLBRendezvousTable correctly orders hosts (index=0xbb44) ... ok
GLBRendezvousTable correctly orders hosts (index=0xffff) ... ok
GLBRendezvousTable correctly calculates valid row seeds ... ok

pavantc commented 2 years ago

@theojulienne I added some checks and also introduced a header for limits, so that we can name them rather than use magic numbers and share them easily between Go and BPF. I haven't changed them all in the bpf_map_def structures yet, but plan to cut another PR for that.

pavantc commented 2 years ago

@theojulienne I added a test locally, as another test class in the v4 ranges test file, to confirm that the limit is hit. It is hit, but log.Fatal() kills the XDP director and the rest of the tests can't continue. I thought about adding some tunables to make the log level less severe during testing so that this can be tested, but that looked like overkill. Happy to hear your comments on that. (The plan was to introduce another flag, like --debug, which signals test mode and maybe also accepts a number of args that change certain behavior in the code.)
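One way to let a test observe the limit without killing the process is to have the bind-filling code return an error and leave the choice of log.Fatal() to the caller. The following is only a minimal sketch of that idea: `maxBinds`, `ErrTooManyBinds`, `Bind`, `bindKey` and `fillBinds` are hypothetical names invented for illustration, and a plain Go map stands in for the BPF map; none of this is taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// maxBinds is a hypothetical limit; in the PR the real value would come
// from the shared limits header.
const maxBinds = 4

// ErrTooManyBinds is a hypothetical sentinel error for the limit being hit.
var ErrTooManyBinds = errors.New("bind limit exceeded")

// Bind is a simplified stand-in for a bind with an inclusive port range.
type Bind struct {
	IP        string
	Proto     string
	PortStart uint16
	PortEnd   uint16
}

// bindKey identifies one expanded (ip, proto, port) entry.
type bindKey struct {
	IP    string
	Proto string
	Port  uint16
}

// fillBinds expands each port range into table. When a new entry would
// exceed maxBinds it returns an error instead of exiting the process.
func fillBinds(table map[bindKey]int, binds []Bind) error {
	for i, b := range binds {
		// Iterate as int so a range ending at 65535 cannot wrap around.
		for port := int(b.PortStart); port <= int(b.PortEnd); port++ {
			key := bindKey{b.IP, b.Proto, uint16(port)}
			if _, exists := table[key]; !exists && len(table) >= maxBinds {
				return fmt.Errorf("bind %d (%s port %d): %w", i, b.IP, port, ErrTooManyBinds)
			}
			table[key] = i
		}
	}
	return nil
}

func main() {
	// Six ports against a limit of four, so the limit is hit.
	binds := []Bind{{"192.0.2.1", "tcp", 80, 85}}
	err := fillBinds(map[bindKey]int{}, binds)
	if errors.Is(err, ErrTooManyBinds) {
		// A production caller could still choose log.Fatal(err) here;
		// a test could assert on the error and carry on.
		log.Printf("rejected: %v", err)
	}
}
```

Note that this sketch leaves the table partially filled when it fails, so a caller that wants to keep running would still need to decide what to do with the entries already added.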

theojulienne commented 2 years ago

> It is hit but log.Fatal() kills the xdp director and the rest of the tests can't continue.

Is there a way we can relaunch it and clean up in this case? It seems like it would be useful for tests to not break each other if we can do so.

It might also be worth considering how this will work in a production scenario, e.g. if someone adds binds that take this over the limit, it might make sense to stop the reload from working, but will that leave the existing binds up? Will there be an easy way to tell? Should there be some validation pre-run or similar?

pavantc commented 2 years ago

> > It is hit but log.Fatal() kills the xdp director and the rest of the tests can't continue.
>
> Is there a way we can relaunch it and clean up in this case? It seems like it would be useful for tests to not break each other if we can do so.
>
> It might also be worth considering how this will work in a production scenario, e.g. if someone adds binds that take this over the limit, it might make sense to stop the reload from working, but will that leave the existing binds up? Will there be an easy way to tell? Should there be some validation pre-run or similar?

Yeah, one thought I had was to run through the binds once, before adding anything to the map, and count them after expanding the ranges to make sure the total is within the supported number of binds. If it is not, we invalidate the new config and retain the currently active one. But we then need a way to report that this has happened.

One other possibility is to make sure that the forwarding table does not have more than the supported number of binds even before it is pushed to the directors. That way, we don't have to deal with the binds limit here at runtime.
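The count-before-apply idea can be sketched roughly as below. This is a hedged illustration, not the director's code: `maxBinds`, `Bind`, `countExpanded`, `Director` and `Reload` are hypothetical names, the limit value is made up, and overlapping ranges are counted twice, which over-counts but never under-counts. The same counting function could in principle also be run on the forwarding table before it is pushed to the directors.

```go
package main

import "fmt"

// maxBinds is a hypothetical limit chosen small for the demonstration.
const maxBinds = 8

// Bind is a simplified stand-in for a bind with an inclusive port range.
type Bind struct {
	IP        string
	Proto     string
	PortStart uint16
	PortEnd   uint16
}

// countExpanded returns how many entries binds would occupy once every
// port range is expanded. Overlapping ranges are not de-duplicated.
func countExpanded(binds []Bind) (int, error) {
	total := 0
	for i, b := range binds {
		if b.PortEnd < b.PortStart {
			return 0, fmt.Errorf("bind %d: end port %d is below start port %d", i, b.PortEnd, b.PortStart)
		}
		total += int(b.PortEnd) - int(b.PortStart) + 1
	}
	return total, nil
}

// Director holds the currently active binds.
type Director struct {
	active []Bind
}

// Reload validates candidate before touching any state. If the candidate
// does not fit, the active config is left as it was and the returned
// error is the report that the reload was rejected.
func (d *Director) Reload(candidate []Bind) error {
	n, err := countExpanded(candidate)
	if err != nil {
		return err
	}
	if n > maxBinds {
		return fmt.Errorf("config expands to %d binds, limit is %d; keeping the active config", n, maxBinds)
	}
	d.active = candidate
	return nil
}

func main() {
	d := &Director{active: []Bind{{"192.0.2.1", "tcp", 80, 80}}}

	// 101 ports against a limit of 8, so this reload is rejected.
	err := d.Reload([]Bind{{"192.0.2.1", "tcp", 8000, 8100}})
	fmt.Println("reload error:", err)
	fmt.Println("active binds still in place:", len(d.active))
}
```

Because the check happens before any map update, a rejected reload needs no rollback, and the returned error gives the caller one place to log or expose the failure.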