sgkit-dev / vcf-zarr-publication

Manuscript and associated scripts for vcf-zarr publication

vcf-zarr-publication

This repo contains the manuscript describing the vcf-zarr specification and its compression and query performance on several datasets, along with all code required to generate the figures and example analyses.

Layout

To run the simulation-based benchmarks:

  1. cd to the software directory and run make (you may need to install some dependencies first).
  2. cd to the scaling directory and run make. This takes a long time and needs a lot of storage space.
  3. Run the various benchmarks using python src/collect_data.py.
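
The three steps above can be chained from a small driver script. The following is only a sketch, not part of the repo: the helper names benchmark_steps and run_steps are hypothetical, and it assumes that the software and scaling directories sit directly under the repository root and that src/collect_data.py is invoked from the root with no further arguments (the script may require arguments not listed here). By default it only prints the commands.

```python
import subprocess
from pathlib import Path


def benchmark_steps(repo_root="."):
    """Return (working directory, command) pairs for the three steps above.

    Assumption: software/ and scaling/ live directly under repo_root and
    collect_data.py is run from repo_root.
    """
    root = Path(repo_root)
    return [
        (root / "software", ["make"]),
        (root / "scaling", ["make"]),
        (root, ["python", "src/collect_data.py"]),
    ]


def run_steps(steps, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for cwd, cmd in steps:
        print(f"(in {cwd}) {' '.join(cmd)}")
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(cmd, cwd=cwd, check=True)


if __name__ == "__main__":
    # Dry run: show what would be executed without starting the long build.
    run_steps(benchmark_steps())
```

Calling run_steps(benchmark_steps(), dry_run=False) would actually execute the steps in order; since the scaling step is slow and storage-hungry, the dry run is the default.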