gogetdata / ggd-cli

The command-line interface to GGD
MIT License

Storing bgzip-compressed genomes? #33

Open golobor opened 4 years ago

golobor commented 4 years ago

Hi! This project looks super interesting! One thing that would concern me personally if I were to use ggd is that the existing recipes store genomes uncompressed. With the potentially many genomes my lab would have to deal with, the library would take up a lot of space; moreover, since we use network storage, storing data uncompressed would reduce I/O performance.

Have you considered allowing optional compression of genomes with bgzip? bgzip plays well with faidx/pyfaidx and does not have any downsides, at least as far as we're concerned.
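
For context, here is a rough sketch of why block-wise compression of the kind bgzip uses can keep random access cheap. It is not the real BGZF format and not anything ggd does today: it only shows the principle of compressing data as independent gzip members and keeping an index of block offsets, so a region can be fetched by decompressing just the blocks that overlap it. The helper names `compress_blocks` and `fetch` and the block size are made up for illustration.

```python
import gzip

# Illustrative block size (uncompressed bytes per block). Real BGZF blocks are
# capped at 64 KiB and carry extra header fields that this sketch omits.
BLOCK_SIZE = 64 * 1024


def compress_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Compress data as independent gzip members; return (blob, index).

    index[i] is the byte offset of block i inside blob; the final entry
    marks the end of the blob.
    """
    chunks = []
    index = [0]
    for start in range(0, len(data), block_size):
        member = gzip.compress(data[start:start + block_size])
        chunks.append(member)
        index.append(index[-1] + len(member))
    return b"".join(chunks), index


def fetch(blob: bytes, index: list, start: int, end: int,
          block_size: int = BLOCK_SIZE) -> bytes:
    """Return data[start:end], decompressing only the overlapping blocks."""
    n_blocks = len(index) - 1
    first = start // block_size
    if start >= end or first >= n_blocks:
        return b""
    last = min((end - 1) // block_size, n_blocks - 1)
    # gzip.decompress handles several concatenated members in one call.
    raw = gzip.decompress(blob[index[first]:index[last + 1]])
    offset = first * block_size
    return raw[start - offset:end - offset]


# Hypothetical stand-in for a genome sequence (200 kB of repeated bases).
sequence = b"ACGT" * 50_000
blob, index = compress_blocks(sequence)

# The concatenated members still form a valid gzip stream as a whole.
assert gzip.decompress(blob) == sequence
# A small region is recovered without decompressing the whole blob.
assert fetch(blob, index, 70_000, 70_010) == sequence[70_000:70_010]
```

In practice the indexing would be left to the existing tools (bgzip plus faidx/pyfaidx) rather than reimplemented like this.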

Thank you! Anton.

tgotwig commented 3 years ago

Any update on this? 🤔