Open: davmlaw opened this issue 3 months ago
Hi @davmlaw
I agree. It's a slightly unfortunate legacy: some software libraries did not properly support bgzip files, so we still use gzip for the files we release. I am investigating whether we can at least provide bgzipped alternatives for some of our more heavily used fasta files, though implementing this will take a little time.
In the meantime you can convert the fasta like so:
gunzip -c fasta.gz | bgzip > fasta.bgz
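If it is unclear whether a given file is plain gzip or already bgzipped, the header can be checked before converting. The following is a minimal sketch, not part of VEP or htslib: the function names `is_bgzf` and `to_bgzf` are made up for illustration, and the check assumes the BGZF "BC" extra subfield is the first one in the gzip header, which is how bgzip normally writes it.

```shell
#!/bin/sh
# is_bgzf FILE: exit 0 if FILE starts with a BGZF block header, 1 otherwise.
# A BGZF block is a gzip member with the FEXTRA flag set (bytes 1f 8b 08 04)
# and a "BC" extra subfield of length 2 (bytes 42 43 02 00 at offsets 12-15).
# Assumption: "BC" is the first extra subfield in the header.
is_bgzf() {
    # Dump the first 16 bytes as one contiguous hex string.
    header=$(od -An -tx1 -N16 "$1" | tr -d ' \n')
    case "$header" in
        # 8 hex chars of magic/flags, 16 of MTIME/XFL/OS/XLEN, 8 of subfield.
        1f8b0804????????????????42430200) return 0 ;;
        *) return 1 ;;
    esac
}

# to_bgzf FILE.gz: recompress a plain gzip file with bgzip (hypothetical helper).
# Uses the same pipeline as the command above; requires bgzip on the PATH.
to_bgzf() {
    if is_bgzf "$1"; then
        echo "$1 is already bgzipped"
    else
        gunzip -c "$1" | bgzip > "${1%.gz}.bgz"
    fi
}
```

After sourcing these definitions, `to_bgzf fasta.gz` would write `fasta.bgz` next to the input and leave the original file untouched.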
Sorry for the inconvenience!
Perhaps you could also release a .bgz to go with each .gz you release?
Then for VEP - download the .bgz?
Hi @davmlaw
Yes that would be my preference, but it will take a little time to implement.
The GRCh37 fasta file downloaded by the install process is gzipped, not bgzipped
Running the code w/o fasta gives:
Can you please change the GRCh37 file to be bgzipped?
Given that bgzip is backwards compatible with gzip, and the files are only slightly larger, perhaps all Ensembl genome fasta files on the website/FTP could simply be bgzipped?