igvteam / igv

Integrative Genomics Viewer. Fast, efficient, scalable visualization tool for genomics data and annotations
https://igv.org
MIT License

"infinite" coverage in base modification alignments #1574

Open SamieJaffreyLab opened 2 days ago

SamieJaffreyLab commented 2 days ago

Hi, I am trying to analyze m6A from some direct-RNA Nanopore data. In IGV, when I select "color alignments by" and "base modification", m6A bases are colored in turquoise and shaded based on their quality scores. The coverage track shows m6A as a percentage of the total A's. However, in many instances the nanopore basecaller calls a base as m6A (even at only 1% confidence) while the reference is annotated as a non-A base. When this happens, the coverage track calculates the m6A percentage as 1 m6A divided by 0 total A's. These sites go to infinity and make it difficult to view the real m6A sites and their stoichiometry.

I've attached images to help clarify the problem: in the first image, the left green bar is a miscalled m6A and the right bar is a correctly called m6A. Is there something I can do to bypass or edit the coverage track's calculation? Or is it possible for you to address this issue on your end? Thanks for the help!

[Attached screenshots: Screen Shot 2024-09-20 at 6 01 54 PM, Screen Shot 2024-09-20 at 5 51 16 PM]

jrobinso commented 2 days ago

Thanks for the report. Could you verify that you are using the latest version of IGV? It should be impossible for the position at the left bar to have any 6mA calls; the math of the MM/ML spec would not allow it in the first place. 6mA calls can only occur at read base A or T.
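
To illustrate the MM/ML point: in the SAM tags specification, each MM entry names a fundamental base and then lists how many occurrences of that base to skip before the next modified one, so every call lands on that base in the read. Below is a minimal sketch of that decoding, assuming a forward-strand read and a single modification code per entry. `mm_positions` is a hypothetical helper written for illustration, not IGV code.

```python
def mm_positions(seq: str, mm_entry: str) -> list[int]:
    """Return 0-based read positions of modified bases for one MM entry.

    Example entry: "A+a,1,0" (fundamental base A, + strand, code a).
    Assumes a forward-strand read; reverse-strand reads would need the
    sequence reverse-complemented first, which this sketch does not do.
    """
    header, *skips = mm_entry.rstrip(";").split(",")
    base = header[0]  # fundamental base, e.g. "A"
    # Indices of every occurrence of the fundamental base in the read.
    candidates = [i for i, b in enumerate(seq) if b == base]
    positions = []
    idx = -1
    for skip in skips:
        # Each value counts how many candidate bases to skip over.
        idx += int(skip) + 1
        positions.append(candidates[idx])
    return positions


seq = "GATTACAGA"
calls = mm_positions(seq, "A+a,1,0;")
# Every decoded position is, by construction, an A in the read.
assert all(seq[p] == "A" for p in calls)
```

Because positions are counted in occurrences of the fundamental base, a call can never be decoded onto a read base other than the one named in the entry.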

If you are using the latest version of IGV, I will need a test BAM to figure out what is going on here. It doesn't need to be the whole BAM file, just a small slice with data exhibiting this issue. You can send a sample to igv-team@broadinstitute.org, or email us and we can send you a Dropbox link.

Looking at one of your reads, this could be a call in a soft-clipped region. It would be a bug to include such a call in the coverage track summarization, and looking at the code I'm not sure how that could happen, but it's possible that this is causing the issue. Again, a test file that reproduces the issue would be helpful.
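
One way to check the soft-clip hypothesis on a single read is to walk its CIGAR string and see which operation covers the read position of the call. This is a minimal sketch using only the standard library. `is_soft_clipped` is a hypothetical helper for illustration and does not reflect how IGV handles this internally.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# CIGAR operations that consume read (query) bases, per the SAM spec.
CONSUMES_QUERY = set("MIS=X")


def is_soft_clipped(cigar: str, read_pos: int) -> bool:
    """Return True if the 0-based read position lies in a soft clip (S)."""
    offset = 0
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in CONSUMES_QUERY:
            if offset <= read_pos < offset + length:
                return op == "S"
            offset += length
    raise ValueError("read_pos is outside the read")


# A read with 5 soft-clipped bases, 20 aligned bases, 3 soft-clipped bases.
print(is_soft_clipped("5S20M3S", 2))   # inside the leading clip
print(is_soft_clipped("5S20M3S", 10))  # inside the aligned block
```

A modification call whose read position falls in a soft clip has no aligned reference position, so it should not contribute to a per-position summary.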

jrobinso commented 2 days ago

Also, in the latest release the likelihood threshold at which a base is considered modified can be set to a number between 0 and 1 in preferences (View > Preferences > Base Mods). It defaults to 0, as any other value would be arbitrary.
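
As a rough sketch of what such a threshold does: ML values are stored as integers from 0 to 255, and the SAM tags specification maps value N to the probability range N/256 to (N+1)/256. The example below assumes the midpoint of that range as the likelihood and a `>=` comparison. Both are assumptions made for illustration, as are the function names, and IGV's actual conversion and comparison may differ. The `stoichiometry` helper also shows one possible way to avoid the divide-by-zero case described above, by reporting no value when the denominator is 0.

```python
from typing import Optional


def ml_to_likelihood(ml: int) -> float:
    """Map an ML byte (0-255) to a likelihood in (0, 1).

    Assumption: use the midpoint of the range [ml/256, (ml+1)/256).
    """
    return (ml + 0.5) / 256


def count_modified(ml_values: list[int], threshold: float = 0.0) -> int:
    """Count calls whose likelihood meets the threshold (assumed >=)."""
    return sum(1 for ml in ml_values if ml_to_likelihood(ml) >= threshold)


def stoichiometry(n_modified: int, n_base: int) -> Optional[float]:
    """Fraction of modified bases, or None when there is no denominator."""
    if n_base == 0:
        return None
    return n_modified / n_base


calls = [3, 128, 250]  # hypothetical ML values for three m6A calls
print(count_modified(calls))        # threshold 0 keeps every call
print(count_modified(calls, 0.5))   # drops the low-likelihood call
print(stoichiometry(1, 0))          # no A's at the position
```

With the default threshold of 0, a call at roughly 1% likelihood still counts as modified, which is why raising the threshold changes which sites appear in the summary.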