igvteam / igv

Integrative Genomics Viewer. Fast, efficient, scalable visualization tool for genomics data and annotations
https://igv.org
MIT License

Error loading VCF: Tag Type in wrong order #1530

Open jules-o opened 3 months ago

jules-o commented 3 months ago

Hi,

I'd like to view VCFs from the 1000 Genomes Project in IGV (version 2.17.4 with Java included) but a couple of them won't load, with an error message:

"Error loading [URL]: Unable to parse header with error: Your input file has a malformed header: Tag Type in wrong order (was #2, expected #3)"

One of the files that causes this error is https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000G_2504_high_coverage/working/20220422_3202_phased_SNV_INDEL_SV/1kGP_high_coverage_Illumina.chr10.filtered.SNV_INDEL_SV_phased_panel.vcf.gz

The only search result for this error is this issue in samtools / hts-specs, but that was marked as 'done' last year.

Is there any way to get IGV to accept the VCF as it is or do I need to find a way to edit the order of the tags?

Thank you

jrobinso commented 3 months ago

From the thread you linked to, I gather the consensus is that field order should not be required, but it's not entirely clear what was changed in the spec as a result. In any event, our library, htsjdk, apparently does require a specific order. Short of a change to htsjdk, you will need to reorder the tags in the header to conform to its expectations.
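
A minimal sketch of how the header tags could be reordered, assuming the order htsjdk expects for INFO and FORMAT lines is ID, Number, Type, Description. That order is inferred from the error message and is not confirmed in this thread. The function name `reorder_header_line` and the constant `PREFERRED_ORDER` are hypothetical, introduced only for illustration.

```python
import re

# ASSUMPTION: the parser expects ID, Number, Type, Description in this order.
PREFERRED_ORDER = ["ID", "Number", "Type", "Description"]

# A key=value pair; the value is either a double-quoted string (which may
# contain commas and escaped quotes) or a run of non-comma characters.
_PAIR = re.compile(r'([^=,<>\s]+)=("(?:[^"\\]|\\.)*"|[^,]*)')

# An INFO or FORMAT meta-information line: prefix, tag list, closing bracket.
_HEADER = re.compile(r'^(##(?:INFO|FORMAT)=<)(.*)(>\s*)$', flags=re.S)


def reorder_header_line(line: str) -> str:
    """Return an INFO/FORMAT header line with its tags in PREFERRED_ORDER.

    Tags not listed in PREFERRED_ORDER keep their relative order and are
    placed last. Any other kind of line is returned unchanged.
    """
    match = _HEADER.match(line)
    if match is None:
        return line
    prefix, body, suffix = match.groups()
    pairs = _PAIR.findall(body)
    keys = [key for key, _ in pairs]
    values = dict(pairs)
    ordered = [key for key in PREFERRED_ORDER if key in values]
    ordered += [key for key in keys if key not in PREFERRED_ORDER]
    return prefix + ",".join(f"{key}={values[key]}" for key in ordered) + suffix


if __name__ == "__main__":
    bad = '##INFO=<ID=AC,Type=Integer,Number=A,Description="Allele count, per ALT">\n'
    print(reorder_header_line(bad), end="")
```

This only rewrites header text. For a bgzip-compressed, tabix-indexed file the corrected header would still have to be written back with a tool that preserves the compression format, for example `bcftools reheader`, and the file re-indexed afterwards.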

You could open an issue at https://github.com/samtools/htsjdk, and reference the spec discussion you linked to above.

FWIW the IGV web app can load these files (https://igv.org/app). If you use the web app and load by URL, be sure to include the index URL (the .tbi file); otherwise it will load the entire file.

jules-o commented 3 months ago

Oh that's really helpful, thank you. I only really need an overview of one region, so I will see if I can do that in the web app, and if not then I will edit the file (and possibly raise the issue with htsjdk).