facebookincubator / katran

A high performance layer 4 load balancer
GNU General Public License v2.0
4.75k stars 504 forks source link

fix map name #200

Closed tehnerd closed 1 year ago

tehnerd commented 1 year ago

to be able to restart katran's bpf code in runtime (e.g. to load bpf code w/ introspection enabled w/o restarting binary) all maps must have name w/ size less than 15 chars (because kernel truncates map name to 15 chars; https://github.com/facebookincubator/katran/blob/main/katran/lib/BpfLoader.cpp#L104 ) making lru_miss_stats smaller so it would not block this logic

tehnerd commented 1 year ago

meh. name already in use. going to rename it to something else

frankfeir commented 1 year ago

Hi, @tehnerd, other than lru_miss_stats_vip, there are a few more bpf maps with long names. I feel it is difficult to find a shorter name for them which are as meaningful as their current names so I am trying to check if there is a different option. I did an experiment locally and katran is still functioning with two bpf maps who share the same long prefix(longer than 15 chars) in their names. I heard the 15 char name is just for display via tools like bpftool. Under the hood each bpf map is a file descriptor (in user space), the kernel knows that your two maps are different and assigns them different ids. I am still trying to confirm this. Meanwhile, wondering if we can just adjust the check in BpfLoader and make sure there is no duplicated names after the truncation?(which is the case today)?

tehnerd commented 1 year ago

longer map names breaks runtime reload of bpf balancer's code (e.g. change version from introspection disabled to introspeciton enabled and vice versa) https://github.com/facebookincubator/katran/blob/main/katran/lib/BpfLoader.cpp#L166C1-L166C32 because there is explicit check that map name should not be longer than 15 and right now runtime reload as-is does not work

are you folks planning to continue support this or you are planning to remote this feature?

tehnerd commented 1 year ago

andy eah. you are correct:

tehnerd@YD9R0J7CQ6 bpf % for map in `cat balancer_maps.h | grep SEC | awk '{print $2}'` ; do echo $map ; echo $map | wc ; done
vip_map
       1       1       8
fallback_cache
       1       1      15
lru_mapping
       1       1      12
ch_rings
       1       1       9
reals
       1       1       6
reals_stats
       1       1      12
lru_miss_stats
       1       1      15
vip_miss_stats
       1       1      15
stats
       1       1       6
quic_packets_stats_map
       1       1      23
server_id_map
       1       1      14
server_id_map
       1       1      14
lpm_src_v4
       1       1      11
lpm_src_v6
       1       1      11
global_lru_maps
       1       1      16
fallback_glru
       1       1      14
tpr_packets_stats_map
       1       1      22
server_id_routing_stats
       1       1      24
tehnerd@YD9R0J7CQ6 bpf %

this is what needs to be changed if you would want to unbreak this feature

tehnerd commented 1 year ago

so basically quic_packets_stats_map -> quic_stats_map tpr_packets_stats_map -> tpr_stats_map server_id_routing_stats -> server_id_stats

frankfeir commented 1 year ago

I am not planning to change any behavior. I am thinking to remove the existing check(error out when the name is longer than 15) from BpfLoader and replace it with a different check to make sure no duplicated names after truncation. With that, reload will still work. About the names. quic_stats_map and tpr_stats_map are clear. server_id_routing_stats maybe changed to sid_routing_stats. But it is not obvious to me that vip_miss_stats is for LRU miss stats.

tehnerd commented 1 year ago

That would break the feature. So there is no point to remove this check since code won't work without it

facebook-github-bot commented 1 year ago

@frankfeir has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot commented 1 year ago

@frankfeir merged this pull request in facebookincubator/katran@be5be9ef5a5e1f95b429572d13bb1a3485fd7b2e.