replikation / poreCov

SARS-CoV-2 workflow for nanopore sequence data
https://case-group.github.io/
GNU General Public License v3.0

Idea for the report table: A mini-plot that shows where Ns are in the assemblies #179

Open RaverJay opened 2 years ago

RaverJay commented 2 years ago

Could also include warnings when N stretches are in important regions, e.g. the spike protein

Since mutations, deletions, and insertions are simply omitted when the corresponding region isn't covered well enough, this might give a better overview of what to expect.

Not sure on the details - and no idea where to find the time :D
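A rough sketch of the idea, not poreCov code: find the N stretches in an assembly, draw a one-line text "mini-plot", and flag stretches that touch the spike gene. The function names are made up for illustration, and the sketch assumes the assembly is already in reference coordinates (Wuhan-Hu-1, NC_045512.2, where S spans 21563-25384), which will not hold exactly for assemblies with indels.

```python
import re

# Spike (S) gene on the Wuhan-Hu-1 reference NC_045512.2, 1-based inclusive.
# Assumption: the assembly uses the same coordinates as the reference.
SPIKE = (21563, 25384)


def n_runs(seq, min_len=1):
    """Return 1-based inclusive (start, end) for every run of N with at least min_len bases."""
    return [
        (m.start() + 1, m.end())
        for m in re.finditer(r"[Nn]+", seq)
        if m.end() - m.start() >= min_len
    ]


def overlapping(runs, region):
    """Keep only the runs that overlap the 1-based inclusive region."""
    start, end = region
    return [(s, e) for s, e in runs if s <= end and e >= start]


def mini_plot(length, runs, width=60):
    """One-line text plot: '#' for bins that contain at least one N, '-' otherwise."""
    cells = ["-"] * width
    for s, e in runs:
        first = (s - 1) * width // length
        last = (e - 1) * width // length
        for i in range(first, last + 1):
            cells[i] = "#"
    return "".join(cells)


if __name__ == "__main__":
    # Toy sequence, only to show the calls; a real assembly is ~30 kb.
    toy = "ACGT" + "N" * 3 + "ACG"
    runs = n_runs(toy)
    print(runs)
    print(mini_plot(len(toy), runs, width=10))
    for s, e in overlapping(runs, SPIKE):
        print(f"WARNING: N stretch {s}-{e} overlaps the spike gene")
```

In a report table, the string from `mini_plot` could sit in its own column next to each sample, with the warnings listed separately.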

hoelzer commented 2 years ago

:) Good suggestion. We actually have something in this direction in covPipe: @MarieLataretu implemented a process that checks for important mutations and alerts when they are masked by N.

For example, S:N501Y: when an amplicon dropout happens there, the user gets informed somehow. Am I right, @MarieLataretu?

replikation commented 2 years ago

status? @MarieLataretu @RaverJay @hoelzer

RaverJay commented 2 years ago

No time, so I guess this remains in the idea stage :<

MarieLataretu commented 2 years ago

> For example, S:N501Y and when here an amplicon drop happens the user will get informed somehow. Am I right, @MarieLataretu ?

You can specify a VCF file. The process checks whether each position is N or has low coverage, and whether the variant is found exactly as in the VCF or with a different ALT. The output is a handful of VCF files, which are transformed into a table for the summary report in covPipe.

I could also plug this into poreCov, but we still have to think about the output.
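To make the check concrete, here is a minimal sketch of the per-site decision described above. It is not the covPipe process: the function name, the status labels, and the `min_depth` threshold are all assumptions for illustration.

```python
def classify_site(expected_alt, consensus_base, depth, observed_alt=None, min_depth=20):
    """Classify one expected variant site.

    expected_alt   -- ALT allele listed in the user-supplied VCF
    consensus_base -- base at that position in the consensus ("N" if masked)
    depth          -- read depth at that position
    observed_alt   -- ALT allele actually called at that position, or None
    min_depth      -- hypothetical low-coverage cutoff, not a covPipe default
    """
    if consensus_base.upper() == "N" or depth < min_depth:
        return "masked_or_low_coverage"
    if observed_alt is None:
        return "not_found"
    if observed_alt.upper() == expected_alt.upper():
        return "found"
    return "found_other_alt"


if __name__ == "__main__":
    # S:N501Y corresponds to A23063T on the Wuhan-Hu-1 reference.
    print(classify_site("T", "N", 3))
    print(classify_site("T", "T", 250, observed_alt="T"))
    print(classify_site("T", "G", 250, observed_alt="G"))
    print(classify_site("T", "A", 250))
```

Only the first status means "we cannot tell"; the other three are actual statements about the sample, which is the distinction the report table would need to show.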

MarieLataretu commented 1 year ago

The topic came up again, and we would like to add Nextclade's "missing" column to the summary data table. "missing" is the list of detected N nucleotides in alignment coordinates, after all insertions have been stripped.
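As a sketch of how such a column could be used downstream, the snippet below parses a "missing" value into ranges and counts how many missing positions fall into the spike gene. It assumes the value is a comma-separated list of 1-based positions and `start-end` ranges (for example `1-54,21990,29837-29903`); the helper names are made up for illustration.

```python
# Spike (S) gene on the Wuhan-Hu-1 reference NC_045512.2, 1-based inclusive.
SPIKE = (21563, 25384)


def parse_ranges(field):
    """Parse '1-54,21990,29837-29903' into [(1, 54), (21990, 21990), (29837, 29903)]."""
    ranges = []
    for token in field.split(","):
        token = token.strip()
        if not token:
            continue  # empty column or trailing comma
        start, _, end = token.partition("-")
        ranges.append((int(start), int(end or start)))
    return ranges


def missing_in_region(ranges, region):
    """Count missing positions that fall inside the 1-based inclusive region."""
    lo, hi = region
    return sum(max(0, min(e, hi) - max(s, lo) + 1) for s, e in ranges)


if __name__ == "__main__":
    ranges = parse_ranges("1-54,21990,29837-29903")
    print(ranges)
    print(missing_in_region(ranges, SPIKE), "missing positions in the spike gene")
```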

Also, we would like to have a closer look at the coverage of the spike. I added a second fastcov call with positions to try it out and the plots automatically ended up in the HTML report directly below the full genome coverage plots.

Do you think that's a good place for them, or would it be better to publish them in the results?

replikation commented 1 year ago

Sounds good. So it will be in the final report then?

MarieLataretu commented 1 year ago

Yes, with the current implementation, they only appear in the HTML report, which makes it a bit long:

[screenshot: report]

hoelzer commented 1 year ago

I like the Spike-focused plots because they are pretty helpful for quickly checking for "masked" regions.

Can we just have a "box" that one can expand to look at these plots, so the normal report does not get longer?

replikation commented 1 year ago

+1 for the hide part

MarieLataretu commented 1 year ago

I'll see what I can do :)

You are thinking of the collapsibles we already have in the HTML table, right?

hoelzer commented 1 year ago

Yeah, something like a button below the genome coverage plots to expand all the spike-zoomed plots.
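One simple way to get such an expandable box, sketched here with the standard HTML `<details>`/`<summary>` elements rather than the collapsibles poreCov already uses in its table: the section stays collapsed until the reader clicks the summary line. The function name and file names are placeholders.

```python
from html import escape


def collapsible(summary, image_paths):
    """Wrap plot images in a <details> block that is collapsed by default."""
    imgs = "\n".join(
        f'  <img src="{escape(p, quote=True)}" alt="{escape(p, quote=True)}">'
        for p in image_paths
    )
    return f"<details>\n  <summary>{escape(summary)}</summary>\n{imgs}\n</details>"


if __name__ == "__main__":
    # Placeholder file names, not actual poreCov outputs.
    print(collapsible("Spike coverage plots", ["sample1_spike.png", "sample2_spike.png"]))
```

Because `<details>` needs no JavaScript, the report keeps its normal length and still works as a standalone HTML file.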

MarieLataretu commented 1 year ago

Default

[screenshot]

[bottom of the site]

Expanded

[screenshot]

hoelzer commented 1 year ago

Nice, I like it! So with that, it should be possible to get a quick idea of whether mutations are really missing/reverted for a certain lineage or just not called because of low coverage in that region.

replikation commented 1 year ago

Great, love it!
