vsbuffalo opened 4 months ago
cd05f1f brought in serde + csv. From an API perspective this is much cleaner: users can just define structs and use serde's `Deserialize` derive attribute to handle parsing.
However, compared against the pre-serde code, there appears to be a performance hit. Here is f144ab41a976d693131a68709fa561385cc7a6a8, but with the benches/bedtools_comparison.rs from HEAD:
```
command       bedtools time    granges time    granges speedup (%)
------------  ---------------  --------------  ---------------------
map_multiple  139.37 s         67.22 s         51.7679
adjust        60.24 s          29.68 s         50.7238
map_min       66.54 s          45.44 s         31.7153
map_mean      65.98 s          45.53 s         30.9976
map_max       72.54 s          45.03 s         37.9216
map_sum       64.87 s          45.21 s         30.3143
map_median    65.95 s          46.16 s         30.012
flank         83.87 s          47.29 s         43.6118
filter        78.89 s          39.74 s         49.6282
windows       280.89 s         47.56 s         83.0676
```
Here are two runs on HEAD:
```
# run 1
command       bedtools time    granges time    granges speedup (%)
------------  ---------------  --------------  ---------------------
map_multiple  134.51 s         73.06 s         45.6827
adjust        59.98 s          65.52 s         -9.23242
map_min       66.54 s          54.71 s         17.7839
map_mean      65.75 s          55.76 s         15.2025
map_max       67.37 s          55.96 s         16.942
map_sum       64.78 s          54.68 s         15.5909
map_median    66.60 s          54.59 s         18.0371
flank         84.39 s          31.96 s         62.1299
filter        78.31 s          41.01 s         47.6287
windows       281.53 s         149.98 s        46.7274

# run 2
command       bedtools time    granges time    granges speedup (%)
------------  ---------------  --------------  ---------------------
map_multiple  137.36 s         73.51 s         46.4823
adjust        59.91 s          65.41 s         -9.19381
map_min       66.01 s          54.51 s         17.4298
map_mean      66.05 s          57.25 s         13.3131
map_max       69.62 s          54.67 s         21.4671
map_sum       64.61 s          54.60 s         15.4947
map_median    66.99 s          55.53 s         17.1005
flank         84.60 s          32.04 s         62.1338
filter        78.90 s          41.21 s         47.7687
windows       283.60 s         150.69 s        46.8675
```
So far, it looks like serde's deserialization led to speedups, but the serialization (or maybe csv) is slower.
Making windows is a fast operation in absolute terms, so this matters little... but one can see the cost of serde versus the old TsvSerialize approach in the windows timings above.
Update: this is on f991e23277a8557a10a447c0d7af118c4847cb83, which may disappear once it's squashed, etc.
```
$ python scripts/benchmark_summary.py
command       bedtools time    granges time    granges speedup (%)
------------  ---------------  --------------  ---------------------
map_median    110.73 s         95.96 s         13.3354
map_sum       108.87 s         95.72 s         12.0726
map_max       114.47 s         84.39 s         26.2822
adjust        109.28 s         80.88 s         25.9942
flank         145.60 s         50.54 s         65.2901
map_multiple  296.48 s         119.34 s        59.7466
map_mean      109.97 s         94.83 s         13.7727
filter        118.36 s         58.79 s         50.3318
merge_empty   63.51 s          31.01 s         51.181
windows       515.97 s         173.53 s        66.3682
map_min       114.82 s         94.74 s         17.4876
```
With `--features=bench-big`:
```
$ python scripts/benchmark_summary.py
command       bedtools time    granges time    granges speedup (%)
------------  ---------------  --------------  ---------------------
map_median    524.46 s         547.00 s        -4.29831
map_sum       510.89 s         539.91 s        -5.67975
map_max       505.62 s         535.22 s        -5.85555
adjust        968.14 s         696.75 s        28.0319
flank         22.22 min        397.84 s        70.1543
map_multiple  21.01 min        577.71 s        54.1614
map_mean      502.51 s         540.29 s        -7.51975
filter        20.25 min        641.71 s        47.1769
merge_empty   447.83 s         210.99 s        52.8856
windows       519.41 s         172.34 s        66.8201
map_min       503.93 s         538.19 s        -6.79718
```
So parsing is slow, but something in particular isn't scaling well.
GRanges' parsers are slow-ish. I got a 20-25% gain in speed from using serde + csv, but I think `String` types are killing us compared to raw bytes. To materialize those benefits, though, a new ASCII or raw-byte vector type is needed throughout (I think noodles uses a similar approach?).