bovee closed this issue 6 years ago
This looks like something we can speed up a lot just by putting a BufReader
in front of serde. On my machine, I see a 17x speed-up (loading that
refseq_sketches_21_1000.sk file went down to 12 sec).
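To illustrate why the BufReader helps, here is a minimal std-only sketch. It assumes the deserializer pulls its input roughly one byte at a time, which is how `serde_json::from_reader` consumes a reader as far as I know. `CountingReader`, `consume_bytewise` and `underlying_reads` are hypothetical names made up for this example, not anything from the codebase.

```rust
use std::io::{self, BufReader, Read};

/// Hypothetical helper: wraps a reader and counts how many times
/// `read` is called on it (a stand-in for "number of syscalls" on a File).
struct CountingReader<R> {
    inner: R,
    reads: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.reads += 1;
        self.inner.read(buf)
    }
}

/// Consume a reader one byte at a time, the way a streaming parser might.
/// Returns the number of bytes seen.
fn consume_bytewise<R: Read>(reader: R) -> io::Result<usize> {
    let mut seen = 0;
    for byte in reader.bytes() {
        byte?; // propagate any I/O error
        seen += 1;
    }
    Ok(seen)
}

/// Count the reads that reach the underlying source, with or without
/// a BufReader in between.
fn underlying_reads(data: &[u8], buffered: bool) -> io::Result<usize> {
    let mut counter = CountingReader { inner: data, reads: 0 };
    if buffered {
        // Assumption: in the real code this would look something like
        // `serde_json::from_reader(BufReader::new(file))`.
        consume_bytewise(BufReader::new(&mut counter))?;
    } else {
        consume_bytewise(&mut counter)?;
    }
    Ok(counter.reads)
}
```

Calling `underlying_reads(&data, false)` on a large buffer should report about one read per byte, while `underlying_reads(&data, true)` should report only a handful, since BufReader refills its internal buffer in big chunks. On a real `File` each of those unbuffered reads is a system call, which is presumably where the time was going.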
@bovee 12 seconds feels like something we can live with. :)
It takes ~3.5 minutes to do a full comparison of a small test genome against the 315 MB k=21/n=1000 RefSeq database sketch file. While this isn't super slow, it would be nice if it were more firmly in the <1 minute range.
98.6% of this time (as determined by Instruments) is spent deserializing the sketch JSON, while only 0.8% is spent doing all the comparisons. :/ The easiest way to close this is to periodically check whether there are any speed improvements in Serde and update accordingly. For the upstream issue, see:
https://github.com/serde-rs/json/issues/160