igvteam / igv

Integrative Genomics Viewer. Fast, efficient, scalable visualization tool for genomics data and annotations
https://igv.org
MIT License

[Feature Request] Displaying WGS CNV data for all chromosomes, not just individual chromosomes. #1224

Open fidibidi opened 2 years ago

fidibidi commented 2 years ago

Hello all!

We are working with CNV data and would like to display Dragen CNV vcf files for the whole genome instead of the currently available chromosome by chromosome mode.

Attached is an example from Bionano data, and how they execute this feature.

[image: bionano]

It would be immensely useful for us to be able to have a quick gestalt at the whole genome for chromosome gains and losses without having to click through each chromosome one by one.

Thank you!

jrobinso commented 2 years ago

Did you try selecting whole genome view (chromosome "All")?

jrobinso commented 2 years ago

Ah, sorry, you specified VCF files. We don't have such a view for VCF data; it's oriented towards SNPs. I don't even have sample data of that type to work with.

jrobinso commented 2 years ago

I think there are 2 issues here. (1) whole genome display for VCF files, and (2) a copy number display mode for CNV data encoded in VCFs.

The first is possible now, but not with indexed VCF files (vcf.gz files with an associated vcf.gz.tbi). In theory you can get a whole genome display of any file by either (1) loading it without an index, or (2) setting the visibility window of the track to zero. In both cases the file is loaded in its entirety, so you can get a whole genome display (chromosome "All"). In practice option (2) is not working; that is a bug and will be fixed for the next release. However, if you remove the VCF index file (the file ending in ".gz.tbi"), you can get a whole genome view with the current release.
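One way to try option (1) without touching the original files is to write an uncompressed copy of the VCF that has no index next to it. This is only a sketch: the function name and the file names are hypothetical, and it assumes the file is small enough for IGV to load in its entirety.

```python
import gzip
import shutil
from pathlib import Path


def make_unindexed_copy(vcf_gz: Path, out_vcf: Path) -> Path:
    """Decompress a .vcf.gz into a plain .vcf with no .tbi index beside it.

    bgzip output is a valid multi-member gzip stream, so the standard
    gzip module can read it.
    """
    with gzip.open(vcf_gz, "rb") as src, open(out_vcf, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream the data, never hold it all in memory
    return out_vcf


# Hypothetical usage (paths are placeholders):
# make_unindexed_copy(Path("sample.cnv.vcf.gz"), Path("sample.cnv.noindex.vcf"))
```

The copy can then be loaded in IGV in place of the indexed file; the original .vcf.gz and its .tbi stay untouched.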

A display oriented towards copy number from VCF files does not exist, and I'm not sure if I should interpret this issue as a request for such a display, or merely a request for the current VCF display at whole genome view. Could you clarify?

fidibidi commented 2 years ago

After meeting with the team, here are some clarifications about our request. We would like to display the b-allele frequency in the bigWig format viewed as points, and display the copy number calls in the GFF3 format genome wide. The reason we want GFF3 and not VCF is that the VCF track does not show the copy number call unless the feature is clicked. The GFF3 provides a color code for the copy number calls.

An additional request: is there a way to show text in the viewer for GFF3 file data?

(We are also happy to show examples, or provide data to help with this if needed.)

jrobinso commented 2 years ago

OK, well, whole genome views of bigWig data are supported now. Whole genome views of feature and variant file formats, including VCF and GFF3, are not. I was partly mistaken on that: the web version of IGV does support whole genome views, but the desktop version currently does not. I am working on this but do not have a time estimate for it yet.

I don't understand the request to "show text".

Test data would be helpful, especially a Dragen VCF file corresponding to the figure in your initial post. I can send you a secure dropbox link if you email igvteam (at) broadinstitute.org.

fidibidi commented 2 years ago

Hi Jim! Thanks for answering the question on whole genome view for CNVs.

Here are some (hopefully) clarifications to your questions: We are trying to review structural variants in IGV, and currently we use a combination of GFF3, BED and VCF file types.

For GFF3: Is there a way to have text appear underneath a feature, inside the row, as seen in the image below for the BED track? [image: Screen Shot 2022-10-10 at 1 44 48 PM]

For VCF: Some of the features are unclickable, and we aren't sure why.

Unclickable: [image: Screen Shot 2022-10-10 at 1 48 10 PM]

Clickable: [image: Screen Shot 2022-10-10 at 1 49 38 PM]

Upon further investigation, it appears that variants with a defined alternate allele (not DEL) are clickable in IGV to see additional information.

As a consequence, we can't tell what type of structural variant it is from IGV.

Below are some examples from the vcf.

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  A0818
chr1    789481  MantaINS:3:135:135:2:5:0    G   <INS>   999 PASS    END=789481;SVTYPE=INS;CIPOS=0,7;CIEND=0,7;HOMLEN=7;HOMSEQ=GAATGGA;LEFT_SVINSSEQ=GAATGGAATGCAATGGAATGCACTCGAACGGATTGGAATGGAATGGACTCGAATAGAATGGAATAAAATGAAATGGACTCCAATGGATTGGAATGGAATTGACTCCAATGGAATTGAATGGAGTGGAACCGAATGGAACGGATTGGAATGGAATGCACTCGAAATGAATTTGAATGGAATGGATTGGGCTCAAATGGAATGGAATGGAATGGAATGGAATGGAATGAACTCAAATGGATTAGCATGGAATGAAGTGGACTCGAATACAATGGAATGGAATGGACTCGAATGGAATGGAACGGACTTGAACGGAATGGAGTGGAATGGACTCGAATGGAATGGAGTTGAATGGACTCGAATGGAATGGAATGTAAAGGAATGGAATGAACTCGAAAGGAGTGGAATGTAATGGAATGAAATGGACTCGAATGGAATTAAATGGAATGGAACGGAATGGACTGGGATGGAATGGAACGGAACGGAACGCAGTTGAATTGAACGG;RIGHT_SVINSSEQ=GAATGGAATAGACTGAAATGAAATGGAATGTACTGGAATGGAATGGAATGGAATGTACTGGAATGGAATGGAATGGACTCGAATGATATGCAATTGAATGGACTCGCATGGATTGGAATGGACTCTAGTGGAATGGAATGGAATA GT:FT:GQ:PL:PR:SR   1/1:PASS:74:999,77,0:5,17:0,45
chr1    933944  MantaDEL:30151:0:1:0:0:0    A   <DEL>   174 PASS    END=934965;SVTYPE=DEL;SVLEN=-1021;IMPRECISE;CIPOS=-468,468;CIEND=-349,349   GT:FT:GQ:PL:PR  0/1:PASS:174:224,0,255:18,10
chr1    1324180 MantaINS:371:173:173:3:0:0  GA  GGGGCTGGGGGGCTGAGGGGCTGGGGGGCTGGGGGGCTGCTGGGCTGAGAGGCTGGGAGACTGGAGGGCTGTGG  652 PASS    END=1324181;SVTYPE=INS;SVLEN=73;CIGAR=1M73I1D   GT:FT:GQ:PL:PR:SR   1/1:PASS:48:705,51,0:0,0:1,19
chr1    1350100 MantaDEL:30222:0:1:0:0:0    G   <DEL>   126 PASS    END=1351511;SVTYPE=DEL;SVLEN=-1411;IMPRECISE;CIPOS=-389,389;CIEND=-435,436  GT:FT:GQ:PL:PR  0/1:PASS:126:176,0,355:24,8
chr1    1530531 MantaDEL:30232:0:1:0:0:0    TAAAAACTAAGAATAATGGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGCGGGCGGATCACGAGGTCAGGAGATCGAGACCATCCCGGCTAAAACGGTGAAACCCCGTCTCTACTAAAAATACAAAAAATTAGCCGGGCGTAGTGGCGGGCGCCTGTAGTCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATGGCATGAACCCGGGAGGCGGAGCTTGCAGTGAGCCGAGATCCCGCCACTGCACTCCAGCCTGGGCGACAGAGCGAGACTCCGTCTCAAAAAAAAAAAAAAAAAAAAAA    T   410 PASS    END=1530850;SVTYPE=DEL;SVLEN=-319;CIGAR=1M319D;CIPOS=0,16;HOMLEN=16;HOMSEQ=AAAAACTAAGAATAAT GT:FT:GQ:PL:PR:SR   0/1:PASS:410:460,0,716:34,0:39,16
chr1    1546903 MantaINS:30226:0:0:0:1:0    A   ACGCCTGTAATCCCAGCACTTTGGGAGGCTGAGGCGGGCGGATCATCTGAGGTCAGGAGTTTGAGACCAGCCTGGCCAACATGGAGAAACCCCGTCTCTACCAAAAGTACAAAATTAGCCGGGCACGGTGGTGGG 561 PASS    END=1546903;SVTYPE=INS;SVLEN=134;CIGAR=1M134I;CIPOS=0,1;HOMLEN=1;HOMSEQ=C   GT:FT:GQ:PL:PR:SR   0/1:PASS:14:611,0,11:0,1:11,21
chr1    1565630 MantaDUP:TANDEM:30231:0:1:0:0:0 G   <INS>   264 PASS    END=1565630;SVTYPE=INS;SVLEN=97;DUPSVLEN=97;CIPOS=0,1;CIEND=0,1;DUPHOMLEN=1;DUPHOMSEQ=T GT:FT:GQ:PL:PR:SR   0/1:PASS:264:314,0,701:0,0:48,13
chr1    1565683 MantaINS:30231:0:0:0:0:0    GT  GGTGGTGCAGGCAGAGAACAGACGTCGCGATGGGCCCGACGGTGCTGGCTCCATGGGAACCGAGACCCAACACCCAAAGGAGTCCCACAGGCTCAGGGG 752 PASS    END=1565684;SVTYPE=INS;SVLEN=98;CIGAR=1M98I1D   GT:FT:GQ:PL:PR:SR   0/1:PASS:135:802,0,132:0,0:13,23
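To check outside IGV which records carry a symbolic ALT allele (such as `<DEL>` or `<INS>`) and which carry an explicit sequence, and to read the SV type from the INFO column, a small parser along these lines may help. It is a sketch only: `parse_info` and `describe_sv` are hypothetical helper names, and the record is assumed to be tab-delimited as in a real VCF.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO string into a dict; flag entries (no '=') map to True."""
    out = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def describe_sv(line: str) -> dict:
    """Summarise one tab-delimited VCF data line for SV review."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _vid, _ref, alt = fields[:5]
    info = parse_info(fields[7])
    return {
        "chrom": chrom,
        "pos": int(pos),
        "end": int(info.get("END", pos)),  # fall back to POS when END is absent
        "svtype": info.get("SVTYPE"),
        # Symbolic alleles are written in angle brackets, e.g. <DEL>, <INS>
        "symbolic_alt": alt.startswith("<") and alt.endswith(">"),
    }


# One of the Manta records above, rebuilt with tabs between the columns
record = "\t".join([
    "chr1", "933944", "MantaDEL:30151:0:1:0:0:0", "A", "<DEL>", "174", "PASS",
    "END=934965;SVTYPE=DEL;SVLEN=-1021;IMPRECISE;CIPOS=-468,468;CIEND=-349,349",
    "GT:FT:GQ:PL:PR", "0/1:PASS:174:224,0,255:18,10",
])
print(describe_sv(record))
```

Running this over the whole file would show whether the unclickable features line up with the symbolic-ALT records.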

FYI: I tried to email you at igvteam (at) broadinstitute.org, and got an "email does not exist" error. Is there a mistake in the address? I'd love to send you the VCF file.

Thanks!

fidibidi commented 2 years ago

PS:

Will the circular view ever be integrated into the desktop version of IGV?

https://github.com/igvteam/igv-webapp/wiki/Circular-View

helgathorv commented 2 years ago

The circular view is available now in the IGV desktop app, but you have to enable it in the preferences. See https://github.com/igvteam/igv/wiki/Circular-View

jrobinso commented 2 years ago

Sorry, it's igv-team, not igvteam; my mistake.

RE GFF3: I'm not sure I understand the question, but if you set a "Name" property in column 9 it should display below the feature. Looking at the code, this question must have come up before, as there appears to be a directive to set an alternative property for the display label. Adding the following to the header of your GFF3 file might do what you are after:

##displayName=SVTYPE
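For the "Name" route, a GFF3 feature line with a Name attribute in column 9 could be generated roughly as follows. This is a sketch under assumptions: the function name, the `source` value and the `sequence_alteration` feature type are illustrative choices, not something IGV requires.

```python
from urllib.parse import quote


def gff3_feature(chrom: str, start: int, end: int, name: str,
                 source: str = "example",
                 feature_type: str = "sequence_alteration") -> str:
    """Build one 9-column GFF3 line whose column 9 holds a Name attribute.

    start and end are 1-based and inclusive, as GFF3 expects.
    """
    # GFF3 reserves ';', '=', '&' and ',' inside attribute values, so percent-encode them
    attributes = f"Name={quote(name, safe=':')}"
    columns = [chrom, source, feature_type, str(start), str(end),
               ".", ".", ".", attributes]
    return "\t".join(columns)


print("##gff-version 3")
print(gff3_feature("chr1", 933945, 934965, "DEL"))
```

If the label should come from a different attribute, the `##displayName` header line above is the thing to try instead.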

WRT the larger issues, whole genome view is not well supported in IGV desktop; it is much better in igv.js / igv-web. The refactoring needed to support this generally is extensive. Work has begun in a branch, but it will take some time.

Also, in general SVs in VCFs are not well supported in IGV desktop. Again, the situation is better for igv-web, but it is an important area for our current grant and we will improve. I am not optimistic there will ever be accepted standards for SVs in VCFs; we will just have to work from conventions, which vary from tool to tool. That is why example datasets from as many tools as possible are so helpful. I am going to create a label for SVs in VCFs as an area of work, starting with this issue.