Open TroyKomodo opened 2 months ago
Something like this, perhaps we can have a code path when the registry has this trait, not sure. patch.txt
perhaps we can have a code path when the registry has this trait, not sure.
This sounds like specialization, so I'm afraid this will not be possible anytime soon. Or rather there would need to be two filter inplementations, one general and one with lookup and the user would have to choose which one to use.
That is if we want to avoid breaking changes which adding a bound there is.
Is it possible to use the slab storage directly in the env_filter rather than using a rwlock?
I also think at a minimum its worth considering only doing anything with the 2 mutexs if there is at least 1 dynamic directive.
I have done a benchmark using https://docs.rs/scc/2.1.0/scc/hash_map/struct.HashMap.html as the hash map rather then the RwLock<HashMap<_, _>>
https://github.com/tokio-rs/tracing/blob/36bf06310c143e24fb4761dcb7e88a46c99b053c/tracing-subscriber/src/filter/env/mod.rs#L201-L202
normal env filter
Running 10s test @ http://127.0.0.1:8080/hello
40 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.08ms 1.98ms 51.93ms 93.00%
Req/Sec 32.08k 4.44k 93.11k 77.15%
Latency Distribution
50% 627.00us
75% 1.15ms
90% 2.42ms
99% 5.30ms
12868373 requests in 10.10s, 1.91GB read
Requests/sec: 1274081.70
Transfer/sec: 193.19MB
env filter with scc hash map
Running 10s test @ http://127.0.0.1:8080/hello
40 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 371.94us 846.78us 30.63ms 97.67%
Req/Sec 73.95k 2.93k 102.92k 80.79%
Latency Distribution
50% 280.00us
75% 391.00us
90% 550.00us
99% 1.57ms
29728054 requests in 10.10s, 4.40GB read
Requests/sec: 2943392.96
Transfer/sec: 446.32MB
level filter
Running 10s test @ http://127.0.0.1:8080/hello
40 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 426.64us 1.58ms 45.88ms 99.59%
Req/Sec 73.53k 4.19k 101.11k 88.06%
Latency Distribution
50% 272.00us
75% 408.00us
90% 625.00us
99% 1.55ms
29544121 requests in 10.10s, 4.37GB read
Requests/sec: 2925205.39
Transfer/sec: 443.56MB
From looking at these results it seems that env filter now adds no overhead and still works the exact same way. In the env filter i can still filter by spans and variables. The only difference is there is no mutex.
The reason its a big difference is because i am using a 128 core machine. Repeating the test but only using 4 cores.
normal env filter
Running 10s test @ http://127.0.0.1:8080/hello
40 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 5.65ms 5.41ms 96.66ms 86.83%
Req/Sec 5.66k 645.49 9.36k 71.33%
Latency Distribution
50% 4.38ms
75% 4.92ms
90% 13.56ms
99% 22.64ms
2272637 requests in 10.10s, 344.61MB read
Requests/sec: 225020.20
Transfer/sec: 34.12MB
env filter with scc hash map
Running 10s test @ http://127.0.0.1:8080/hello
40 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 5.61ms 5.92ms 74.38ms 86.51%
Req/Sec 6.20k 764.95 10.71k 71.84%
Latency Distribution
50% 3.99ms
75% 4.52ms
90% 15.37ms
99% 25.34ms
2487946 requests in 10.10s, 377.26MB read
Requests/sec: 246345.68
Transfer/sec: 37.35MB
level filter
Running 10s test @ http://127.0.0.1:8080/hello
40 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 4.16ms 1.47ms 53.88ms 96.18%
Req/Sec 6.13k 410.84 9.86k 97.67%
Latency Distribution
50% 4.28ms
75% 4.42ms
90% 4.59ms
99% 4.83ms
2462271 requests in 10.10s, 373.36MB read
Requests/sec: 243786.75
Transfer/sec: 36.97MB
What do you mean by "level filter"?
It seems completely reasonable to me to replace the locked hashmap with a concurrent variant. I'd just like to check if we don't already (transitively) depend on another one (flurry maybe?) so that we wouldn't have to add different dependency.
What do you mean by "level filter"?
https://docs.rs/tracing/latest/tracing/level_filters/struct.LevelFilter.html
It seems completely reasonable to me to replace the locked hashmap with a concurrent variant. I'd just like to check if we don't already (transitively) depend on another one (flurry maybe?) so that we wouldn't have to add different dependency.
That sounds good, the scc crate is 2 deps i believe, scc itself and the backer crate
Feature Request
Crates
tracing_subscriber
Motivation
Having a lock free implementation would be very nice performance boost.
Proposal
We can already do something like this if we use the registry lookup span and extensions. however env filter currently does not require that trait.
Alternatives
Perhaps disable the lock code paths if there are no dynamic filters.