RedisBloom / t-digest-c

Wicked Fast, Accurate Quantiles Using 'T-Digests'

Added trimmed mean and symmetric trimmed mean implementations and tests #22

Closed: filipecosta90 closed this 2 years ago

filipecosta90 commented 2 years ago

Fixes #16. This PR adds two new APIs: a trimmed mean and a symmetric trimmed mean.
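For readers unfamiliar with the idea, here is a rough sketch of how a trimmed mean can be estimated from a digest's centroids. It is an illustration under stated assumptions, not the code added in this PR: the name `centroid_trimmed_mean` and the flat arrays of centroid means and weights are hypothetical, and each centroid is treated as a block of mass located at its mean, with only the part of that mass inside the kept rank range counted.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper, NOT part of the t-digest-c API.
 * means[] must be sorted ascending; weights[i] is the count behind means[i].
 * lo_cut and hi_cut are fractions of the total weight, 0 <= lo_cut < hi_cut <= 1.
 * Returns the weighted mean of the mass lying between the two cut points,
 * or NAN on invalid input. */
double centroid_trimmed_mean(const double *means, const double *weights,
                             size_t n, double lo_cut, double hi_cut) {
    if (n == 0 || lo_cut < 0.0 || hi_cut > 1.0 || lo_cut >= hi_cut) {
        return NAN;
    }
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += weights[i];
    }
    if (total <= 0.0) {
        return NAN;
    }
    const double lo = lo_cut * total; /* rank where the kept range starts */
    const double hi = hi_cut * total; /* rank where the kept range ends */
    double cum = 0.0;  /* weight seen before the current centroid */
    double kept = 0.0; /* weight that falls inside [lo, hi] */
    double sum = 0.0;  /* weighted sum of the kept mass */
    for (size_t i = 0; i < n; i++) {
        const double start = cum;
        const double end = cum + weights[i];
        cum = end;
        /* Overlap of this centroid's rank interval with [lo, hi]. */
        const double a = start > lo ? start : lo;
        const double b = end < hi ? end : hi;
        if (b > a) {
            kept += b - a;
            sum += (b - a) * means[i];
        }
    }
    return kept > 0.0 ? sum / kept : NAN;
}
```

Under this sketch, a symmetric variant would simply call the same routine with `lo_cut = p` and `hi_cut = 1 - p` for a cut proportion `p`.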

To validate the results, I compared them against SciPy's scipy.stats.trim_mean.
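As a point of reference, the exact quantity that scipy.stats.trim_mean computes can be sketched in C as follows. This is an assumption-labelled sketch for comparison purposes only: `exact_trim_mean` and `cmp_double` are hypothetical names that do not come from this PR, and the restriction to proportions below 0.5 is a simplification chosen here. Like SciPy, it truncates the number of cut elements toward zero, so it trims conservatively when the proportion does not land on an integer index.

```c
#include <math.h>
#include <stddef.h>
#include <stdlib.h>

/* Ascending comparison for qsort. */
static int cmp_double(const void *a, const void *b) {
    const double x = *(const double *)a;
    const double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Hypothetical reference helper, NOT part of t-digest-c.
 * Sorts data in place, drops floor(proportion_to_cut * n) values from each
 * end, and returns the mean of what remains.
 * Assumption: proportion_to_cut must be in [0, 0.5); otherwise NAN. */
double exact_trim_mean(double *data, size_t n, double proportion_to_cut) {
    if (n == 0 || proportion_to_cut < 0.0 || proportion_to_cut >= 0.5) {
        return NAN;
    }
    qsort(data, n, sizeof *data, cmp_double);
    /* Truncation toward zero: cut less when the index is fractional. */
    const size_t cut = (size_t)(proportion_to_cut * (double)n);
    double sum = 0.0;
    for (size_t i = cut; i < n - cut; i++) {
        sum += data[i];
    }
    return sum / (double)(n - 2 * cut);
}
```

A test could then assert that the digest's estimate stays within some tolerance of this exact value for the same input and cut proportion; the appropriate tolerance depends on the compression setting and is not specified here.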

To gauge the performance of the trimmed mean, I've added a set of benchmarks. Here are the full benchmark results:

-------------------------------------------------------------------------------------------------------------------
Benchmark                                                         Time             CPU   Iterations UserCounters...
-------------------------------------------------------------------------------------------------------------------
BM_td_add_uniform_dist/100/10000000                       546590279 ns    546571841 ns           26 Centroid_Count=70 Total_Compressions=481.984k items_per_second=703.687k/s
BM_td_add_uniform_dist/200/10000000                       582119111 ns    582098920 ns           24 Centroid_Count=116 Total_Compressions=219.631k items_per_second=715.8k/s
BM_td_add_uniform_dist/300/10000000                       605849089 ns    605828072 ns           23 Centroid_Count=160 Total_Compressions=139.58k items_per_second=717.667k/s
BM_td_add_uniform_dist/400/10000000                       621972610 ns    621953201 ns           22 Centroid_Count=199 Total_Compressions=99.732k items_per_second=730.835k/s
BM_td_add_uniform_dist/500/10000000                       634853925 ns    634832442 ns           22 Centroid_Count=241 Total_Compressions=79.604k items_per_second=716.009k/s
BM_td_add_lognormal_dist/100/10000000                     546540266 ns    546521255 ns           26 Centroid_Count=68 Total_Compressions=481.506k items_per_second=703.752k/s
BM_td_add_lognormal_dist/200/10000000                     582261780 ns    582242339 ns           24 Centroid_Count=114 Total_Compressions=219.596k items_per_second=715.624k/s
BM_td_add_lognormal_dist/300/10000000                     604777683 ns    604757111 ns           23 Centroid_Count=157 Total_Compressions=139.422k items_per_second=718.938k/s
BM_td_add_lognormal_dist/400/10000000                     622590766 ns    622569543 ns           22 Centroid_Count=200 Total_Compressions=99.753k items_per_second=730.112k/s
BM_td_add_lognormal_dist/500/10000000                     634788521 ns    634767624 ns           22 Centroid_Count=245 Total_Compressions=79.66k items_per_second=716.082k/s
BM_td_quantile_lognormal_dist/100/10000000                586780631 ns    586761401 ns           25 items_per_second=681.708k/s
BM_td_quantile_lognormal_dist/200/10000000                811997490 ns    811971767 ns           17 items_per_second=724.453k/s
BM_td_quantile_lognormal_dist/300/10000000               1016801894 ns   1016770247 ns           14 items_per_second=702.505k/s
BM_td_quantile_lognormal_dist/400/10000000               1286479714 ns   1286436697 ns           11 items_per_second=706.674k/s
BM_td_quantile_lognormal_dist/500/10000000               1483928439 ns   1483880300 ns            9 items_per_second=748.788k/s
BM_td_merge_lognormal_dist/100/10000000                   141674160 ns    141670125 ns           88 items_per_second=8.02119k/s
BM_td_merge_lognormal_dist/200/10000000                   286590413 ns    286582373 ns           50 items_per_second=6.9788k/s
BM_td_merge_lognormal_dist/300/10000000                   392189307 ns    392178201 ns           32 items_per_second=7.96832k/s
BM_td_merge_lognormal_dist/400/10000000                   532181754 ns    532167090 ns           25 items_per_second=7.51644k/s
BM_td_merge_lognormal_dist/500/10000000                   675686837 ns    675667441 ns           19 items_per_second=7.78957k/s
BM_td_trimmed_mean_symmetric_lognormal_dist/100/10000000  868254617 ns    868223418 ns           17 items_per_second=677.516k/s
BM_td_trimmed_mean_symmetric_lognormal_dist/200/10000000 1305046601 ns   1305005590 ns           10 items_per_second=766.28k/s
BM_td_trimmed_mean_symmetric_lognormal_dist/300/10000000 1763557703 ns   1763499276 ns            8 items_per_second=708.818k/s
BM_td_trimmed_mean_symmetric_lognormal_dist/400/10000000 2188760113 ns   2188686563 ns            6 items_per_second=761.492k/s
BM_td_trimmed_mean_symmetric_lognormal_dist/500/10000000 2692495267 ns   2692408313 ns            5 items_per_second=742.829k/s

codecov[bot] commented 2 years ago

Codecov Report

Merging #22 (4165665) into master (f6ee51b) will increase coverage by 1.73%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #22      +/-   ##
==========================================
+ Coverage   84.98%   86.71%   +1.73%     
==========================================
  Files           1        1              
  Lines         253      286      +33     
==========================================
+ Hits          215      248      +33     
  Misses         38       38              
| Impacted Files | Coverage Δ |
|---|---|
| src/tdigest.c | 86.71% <100.00%> (+1.73%) ↑ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update f6ee51b...4165665.