igvteam / igv

Integrative Genomics Viewer. Fast, efficient, scalable visualization tool for genomics data and annotations
https://igv.org
MIT License
Dynseq track support for IGV #1136

Open suragnair opened 2 years ago

suragnair commented 2 years ago

Thanks for your great work! I am a grad student in Anshul Kundaje's lab and would like to propose adding the dynseq track to IGV, which we have discussed briefly with @jrobinso on some of the ENCODE calls.

In short, the dynseq track takes as input a bigwig file and scales the height of each base by the value at that position in the bigwig file. This is useful for visualizing interpretation scores from machine learning models of DNA sequence. The tracks have been incorporated into WashU, HiGlass and UCSC browsers and it would be great if IGV would support them. I've attached an image of how it looks. Please let me know how I can help. Thank you!
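To make the scaling described above concrete, here is a minimal sketch, not the code of any of the browsers mentioned. The function name `dynseq_glyph_heights`, the `track_height_px` parameter, and the choice to normalize by the largest absolute score in view are all assumptions for illustration. Each base gets a signed letter height proportional to its bigwig value, with negative values meant to be drawn below the baseline.

```python
from typing import List, Sequence, Tuple


def dynseq_glyph_heights(
    sequence: str,
    scores: Sequence[float],
    track_height_px: float = 50.0,  # hypothetical total track height
) -> List[Tuple[str, float]]:
    """Pair each base with a signed letter height in pixels.

    Assumption: heights are normalized by the largest absolute score in
    the visible window, so the tallest letter fills half the track
    (above the baseline for positive scores, below it for negative ones).
    """
    if len(sequence) != len(scores):
        raise ValueError("sequence and scores must have the same length")

    max_abs = max((abs(s) for s in scores), default=0.0)
    half_height = track_height_px / 2.0

    glyphs = []
    for base, score in zip(sequence, scores):
        # Avoid dividing by zero when every score in view is 0.
        height = (score / max_abs) * half_height if max_abs else 0.0
        glyphs.append((base, height))
    return glyphs
```

A renderer would then draw each letter stretched to its height at the corresponding genomic position.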

image

jrobinso commented 2 years ago

You've posted in the IGV desktop repo, as opposed to igv.js, so I assume the request is for IGV desktop.

One difference between IGV desktop and the other browsers you mention is that a user typically loads data from a local file or a URL to a file. This track is a composite, so what exactly would a user "load"?

I don't understand the screenshot. The heights of the bases in the "importance" tracks don't have any obvious correlation to the wig tracks.

Test data with a corresponding screenshot would be helpful.

suragnair commented 2 years ago

Apologies for the confusion. The dynseq tracks above are not related to the other bigwigs. The bigwigs show signal pileup (ATAC-seq, ChIP-seq) and model predictions for the same, while the dynseq tracks show the importance scores from each model.

Here is a simple example with the same bigwig track visualised as a standard bigwig and then as a dynseq track.

image

It uses this bigwig file (mm10). You can find the WashU browser session here. When sufficiently zoomed out, the dynseq reverts to standard bigwig as below.

image
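The zoom-dependent fallback described above could be decided along these lines. This is only a sketch: the function name `render_mode` and the `min_px_per_base` threshold are hypothetical, and the other browsers may use a different criterion.

```python
def render_mode(
    view_width_px: float,
    region_start: int,
    region_end: int,
    min_px_per_base: float = 4.0,  # hypothetical legibility threshold
) -> str:
    """Return "letters" when bases are wide enough to read, else "bars".

    Assumption: the region is half-open [region_start, region_end) and
    letters are only worth drawing when each base spans at least
    min_px_per_base pixels on screen.
    """
    n_bases = region_end - region_start
    if n_bases <= 0:
        raise ValueError("region_end must be greater than region_start")

    px_per_base = view_width_px / n_bases
    return "letters" if px_per_base >= min_px_per_base else "bars"
```

For example, a 1000-pixel view over 100 bp gives 10 pixels per base and draws letters, while the same view over 10,000 bp falls back to the standard bigwig rendering.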

Currently, in the WashU and UCSC versions, the user loads only a bigwig file, and the underlying genome fasta is used for the characters. This works reasonably well. Ideally, the track would also accept a fasta as an additional optional input. This would be useful for visualizing model importance scores for variants, somewhat like in Fig 4 from this paper:

image

Here the top and bottom sequences differ at the T>G variant and the two dynseq tracks show importance scores for each sequence (the G destroys a motif in this case). However, I understand this feature can be somewhat complicated to implement, so I leave it to your discretion.
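As a rough sketch of the optional-fasta idea, the letters for a track could come from a user-supplied sequence when one is given and from the reference genome otherwise. The names `letters_for_track` and `variant_positions` are hypothetical, and the sketch assumes substitutions only (equal-length sequences, no indels).

```python
from typing import List, Optional


def letters_for_track(reference_seq: str, custom_seq: Optional[str] = None) -> str:
    """Choose which letters a dynseq track draws for the visible region.

    Assumption: a custom sequence, if supplied, covers exactly the same
    region as the reference, so only substitutions are supported.
    """
    if custom_seq is None:
        return reference_seq
    if len(custom_seq) != len(reference_seq):
        raise ValueError("custom sequence must match the reference length")
    return custom_seq


def variant_positions(ref_seq: str, alt_seq: str) -> List[int]:
    """Return 0-based offsets where the two sequences differ."""
    return [i for i, (r, a) in enumerate(zip(ref_seq, alt_seq)) if r != a]
```

Two tracks over the same region, one per allele, would then each pair their own letters with their own importance scores.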

suragnair commented 2 years ago

Would it be possible to add this functionality to both the desktop version and igv.js? Desktop would be the first priority for us.

jrobinso commented 2 years ago

To be honest, I'm not likely to find time for this on either platform anytime soon. You can see for yourself the number of open issues here and on igv.js; I would need to prioritize this track over those, some of which are already high priority. I would entertain a pull request.

As I pointed out above, a challenge for IGV desktop is that users load files, and there is in general no external information to tell IGV that a given bigwig file is intended to be used as a dynseq track. That would need to be addressed.

suragnair commented 2 years ago

I see, thanks for the heads up. We can try on our end to submit a pull request. I suppose one solution would be to change the file extension to something like .dynseq. Would that be an option?

jrobinso commented 2 years ago

That's one option, probably the easiest one.
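A sketch of what extension-based detection might look like follows. It is written in Python for illustration only and is not IGV's actual loading logic; the function name `track_type_for` and the extension table are assumptions.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical mapping; ".dynseq" is the extension proposed in this thread.
EXTENSION_TO_TRACK_TYPE = {
    ".dynseq": "dynseq",
    ".bw": "bigwig",
    ".bigwig": "bigwig",
}


def track_type_for(path_or_url: str) -> str:
    """Guess the track type from the file extension of a path or URL.

    Query strings and fragments are ignored, and matching is
    case-insensitive. Unrecognized extensions return "unknown".
    """
    path = urlparse(path_or_url).path
    suffix = PurePosixPath(path).suffix.lower()
    return EXTENSION_TO_TRACK_TYPE.get(suffix, "unknown")
```

With this approach the file contents stay a plain bigwig, and only the name tells the viewer to render it as letters.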

kleinjoel commented 2 months ago

I would greatly appreciate it if this feature could be added to IGV Desktop. I'll be keeping an eye on the updates. Thanks for the great work!