Ensembl / ensembl-vep

The Ensembl Variant Effect Predictor predicts the functional effects of genomic variants
https://www.ensembl.org/vep
Apache License 2.0

Use extracted data instead of compressed [QUESTION] #1692

Open GhorbaniAli opened 3 weeks ago

GhorbaniAli commented 3 weeks ago

Is there a way to use uncompressed data files?

Examples

# Current
./vep -i variants.txt -o variants_output.txt --gff 'data-gff.gz' --fasta 'genome.fa.gz'
# Requested
./vep -i variants.txt -o variants_output.txt --gff 'data.gff' --fasta 'genome.

or

# Current
./vep --cache -i variants.vcf --custom file='phenotypes.bed.gz',short_name=phenotype
# Requested
./vep --cache -i variants.vcf --custom file='phenotypes.bed',short_name=phenotype

or

# Current
....... --plugin dbNSFP,'dbNSFP4.7a.txt.gz'
# Requested
....... --plugin dbNSFP,'dbNSFP4.7a.txt'

or

# Current
....... --custom file='clinvar_20240514.vcf.gz',short_name=ClinVar,format=vcf,type=exact,coords=0
# Requested
....... --custom file='clinvar_20240514.vcf',short_name=ClinVar,format=vcf,type=exact,coords=0

nuno-agostinho commented 3 weeks ago

Hey @GhorbaniAli,

Thanks for your question.

VEP does not accept uncompressed files because it relies on Tabix: Tabix only supports bgzipped files, and an index is required to traverse a file quickly. Using non-indexed files would not be efficient.
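
If you only have the uncompressed files, a rough sketch of how they could be compressed and indexed before being passed to VEP is shown below. It assumes `bgzip` and `tabix` (both from HTSlib) are on your `PATH`. The function names `tabix_preset` and `prepare_for_vep` are hypothetical helpers made up for this example and are not part of VEP.

```shell
#!/usr/bin/env bash
# Sketch only: compress and index an uncompressed annotation file for Tabix.
# Assumes bgzip and tabix (HTSlib) are installed; helper names are hypothetical.

# Map a file extension to the matching tabix preset (gff, bed or vcf).
tabix_preset() {
  case "$1" in
    *.gff|*.gff3|*.gtf) echo gff ;;
    *.bed)              echo bed ;;
    *.vcf)              echo vcf ;;
    *) echo "unsupported file type: $1" >&2; return 1 ;;
  esac
}

# Sort by position where needed, bgzip, then index.
# Writes FILE.gz and FILE.gz.tbi next to the input file.
prepare_for_vep() {
  local src="$1" preset
  preset=$(tabix_preset "$src") || return 1
  case "$preset" in
    gff) grep -v '^#' "$src" | sort -t$'\t' -k1,1 -k4,4n -k5,5n | bgzip -c > "$src.gz" ;;
    bed) grep -v '^#' "$src" | sort -t$'\t' -k1,1 -k2,2n -k3,3n | bgzip -c > "$src.gz" ;;
    vcf) bgzip -c "$src" > "$src.gz" ;;  # assumes the VCF is already coordinate-sorted
  esac
  tabix -p "$preset" "$src.gz"
}
```

For example, `prepare_for_vep phenotypes.bed` should produce `phenotypes.bed.gz` and `phenotypes.bed.gz.tbi`, and the `.gz` file can then be used in `--custom file='phenotypes.bed.gz',short_name=phenotype` as in your "Current" command. Note that plain `gzip` output is not a substitute, because Tabix needs the block-compressed format written by `bgzip`.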

Hope this information makes it clearer.

Kind regards, Nuno