eunomia-bpf / bpftime

Userspace eBPF runtime for Observability, Network & General Extensions Framework
https://eunomia.dev/bpftime/
MIT License

[FEATURE] Improve per-cpu map performance #335

Open yunwei37 opened 3 weeks ago

yunwei37 commented 3 weeks ago

Is your feature request related to a problem? Please describe.

The per-CPU maps have much higher overhead in userspace than in the kernel, which should be fixed.

| Map Operation | Kernel (op - uprobe) (ns) | Userspace (op - uprobe) (ns) |
| --- | --- | --- |
| __bench_hash_map_update | 62827.533320 | 30296.051630 |
| __bench_hash_map_lookup | 15895.166920 | 23005.369380 |
| __bench_hash_map_delete | 19884.933980 | 13054.965970 |
| __bench_array_map_update | 9538.564600 | 6701.987970 |
| __bench_array_map_lookup | 183.155140 | 4305.515170 |
| __bench_array_map_delete | 216.088950 | 5987.507820 |
| __bench_per_cpu_hash_map_update | 33140.184290 | 95537.666900 |
| __bench_per_cpu_hash_map_lookup | 14089.238230 | 62913.855920 |
| __bench_per_cpu_hash_map_delete | 19753.563580 | 459826.428910 |
| __bench_per_cpu_array_map_update | 8885.238500 | 25728.928170 |
| __bench_per_cpu_array_map_lookup | 1838.737400 | 8759.420790 |
| __bench_per_cpu_array_map_delete | 1867.948100 | 4802.404130 |

We need to profile and fix that.
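
To help decide where to profile first, the userspace-to-kernel ratio for each row of the table can be computed directly from the posted numbers. The snippet below is only a sketch: the timings are copied from the table, while the names `BENCH_NS`, `slowdown`, and `ranked_slowdowns` are made up for this illustration and are not part of bpftime.

```python
# Timings in nanoseconds, copied from the benchmark table:
# operation name -> (kernel, userspace)
BENCH_NS = {
    "__bench_hash_map_update": (62827.533320, 30296.051630),
    "__bench_hash_map_lookup": (15895.166920, 23005.369380),
    "__bench_hash_map_delete": (19884.933980, 13054.965970),
    "__bench_array_map_update": (9538.564600, 6701.987970),
    "__bench_array_map_lookup": (183.155140, 4305.515170),
    "__bench_array_map_delete": (216.088950, 5987.507820),
    "__bench_per_cpu_hash_map_update": (33140.184290, 95537.666900),
    "__bench_per_cpu_hash_map_lookup": (14089.238230, 62913.855920),
    "__bench_per_cpu_hash_map_delete": (19753.563580, 459826.428910),
    "__bench_per_cpu_array_map_update": (8885.238500, 25728.928170),
    "__bench_per_cpu_array_map_lookup": (1838.737400, 8759.420790),
    "__bench_per_cpu_array_map_delete": (1867.948100, 4802.404130),
}


def slowdown(kernel_ns, user_ns):
    """Return how many times slower userspace is than the kernel."""
    return user_ns / kernel_ns


def ranked_slowdowns(bench):
    """Return (name, ratio) pairs sorted from worst slowdown to best."""
    ratios = {name: slowdown(k, u) for name, (k, u) in bench.items()}
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, ratio in ranked_slowdowns(BENCH_NS):
        print(f"{name:<36} {ratio:6.2f}x")
```

A ratio above 1 means the userspace operation is slower than the kernel one; a ratio below 1 means it is faster.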

Officeyutong commented 3 weeks ago

Are the per-CPU maps still using locks and CPU affinity to stay safe under parallel access?

yunwei37 commented 3 weeks ago

It seems the per-CPU maps do not use locks for safety.

The hash map overhead comes from the map implementation. The array map needs more profiling.
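
For readers unfamiliar with the locks-and-affinity question raised above, here is a minimal sketch of the general idea of a lock-free per-CPU map guarded by CPU affinity. It is not bpftime's implementation: `PerCpuArray` and `pinned_update` are hypothetical names, and the sketch assumes a Linux host where `os.sched_setaffinity` is available (it skips pinning elsewhere).

```python
import os


class PerCpuArray:
    """Hypothetical per-CPU array: each CPU owns one private row."""

    def __init__(self, ncpus, nentries):
        self.rows = [[0] * nentries for _ in range(ncpus)]

    def update(self, cpu, index, value):
        # No lock: the caller must guarantee it is the only writer of this row.
        self.rows[cpu][index] = value

    def lookup(self, cpu, index):
        return self.rows[cpu][index]


def pinned_update(pmap, cpu, index, value):
    """Pin the caller to `cpu`, update that CPU's row, then restore affinity."""
    can_pin = hasattr(os, "sched_setaffinity")  # Linux-only API
    old_mask = os.sched_getaffinity(0) if can_pin else None
    if can_pin:
        os.sched_setaffinity(0, {cpu})
    try:
        pmap.update(cpu, index, value)
    finally:
        if can_pin:
            os.sched_setaffinity(0, old_mask)


if __name__ == "__main__":
    allowed = os.sched_getaffinity(0) if hasattr(os, "sched_getaffinity") else {0}
    cpu = min(allowed)
    pmap = PerCpuArray(max(allowed) + 1, 4)
    pinned_update(pmap, cpu, 2, 99)
    print(pmap.lookup(cpu, 2))
```

In a scheme like this, every operation that changes affinity pays for extra system calls, so it is one kind of cost worth looking for when profiling. Whether anything similar applies to bpftime is not established in this thread.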