sgkit-dev / vcztools

Partial reimplementation of bcftools for VCF Zarr
Apache License 2.0

Performance tests #55

Open tomwhite opened 1 month ago

tomwhite commented 1 month ago

It would be good to have some tests that check the performance of the various vcztools commands. The main thing to verify is that highly selective queries (i.e. ones that return only a few records) run quickly on large datasets.

They could be like the ones @jeromekelleher ran in https://github.com/sgkit-dev/vcztools/pull/8. These are not amenable to running on CI, so it's probably fine to have some instructions to run them manually on 1000G chromosome 2 or 20 data.

jeromekelleher commented 2 weeks ago

That would be excellent. I think a simple script that runs some commands and reports timings and output rate (Mb/s) for both vcztools and bcftools would be sufficient. We don't need anything elaborate.
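
A minimal sketch of what such a script might look like, assuming each benchmark is a complete command line passed as one quoted argument. The function names `time_command` and `report` are made up for illustration and are not part of vcztools. The rate is computed as megabytes (10^6 bytes) of stdout per second of wall-clock time, which is one possible reading of "output rate".

```python
import shlex
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd, count the bytes it writes to stdout, and time it.

    Returns (elapsed_seconds, bytes_written, megabytes_per_second).
    """
    start = time.perf_counter()
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    n_bytes = 0
    # Read in 1 MiB chunks so large outputs are never held in memory.
    while True:
        chunk = proc.stdout.read(1 << 20)
        if not chunk:
            break
        n_bytes += len(chunk)
    returncode = proc.wait()
    elapsed = time.perf_counter() - start
    if returncode != 0:
        raise subprocess.CalledProcessError(returncode, cmd)
    # Assumption: "rate" means 10^6 bytes of output per wall-clock second.
    rate = n_bytes / elapsed / 1e6 if elapsed > 0 else float("inf")
    return elapsed, n_bytes, rate


def report(cmd):
    """Time one command and print a tab-separated summary line."""
    elapsed, n_bytes, rate = time_command(cmd)
    line = f"{shlex.join(cmd)}\t{elapsed:.3f} s\t{n_bytes} bytes\t{rate:.2f} MB/s"
    print(line)
    return line


if __name__ == "__main__":
    # Each argument is one complete command line to benchmark.
    for arg in sys.argv[1:]:
        report(shlex.split(arg))
```

It could be run manually along the lines of `python perf_sketch.py "bcftools view -r 20:1-100000 sample.vcf.gz" "vcztools view -r 20:1-100000 sample.vcz"`. The script name, file paths and region are placeholders, and the vcztools invocation is assumed to mirror the bcftools one. A single run per command includes process start-up and any file-system caching effects, so repeating each command a few times would give more stable numbers.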