NorwegianVeterinaryInstitute / Nepal

NExtflow Pipeline for AmpLicons
BSD 3-Clause "New" or "Revised" License

Build workflow to process amplicon data #39

Open Thomieh73 opened 1 year ago

Thomieh73 commented 1 year ago

See this as a workflow:

Building OTUs would require that your reads align very well. That is hard with Nanopore reads, and you end up with a lot of singletons. I have tried clustering, but it is difficult, and I suspect you need to do basecall correction to get it to work. That latter step carries the danger of removing SNPs that are real.

I would do the following:
•   Basecall and demultiplex with the latest Guppy, using super high accuracy mode.
•   Plot the quality of the reads with NanoPlot to inspect the distribution of read length and read quality.
•   Next, filter with Filtlong or NanoFilt to keep reads that are around the expected amplicon size and have a minimum quality score of 9 or 10 or higher. With Filtlong you can also keep only the best 90% of the reads (or use a different cut-off). That would select your highest-quality reads.
•   Once you have that, use the Emu tool to classify the reads. Here is the website of the tool: https://gitlab.com/treangenlab/emu
•   Once you have the output, you can decide whether you want the classifications at a specific taxonomic rank. Those you can then use for diversity analysis.
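
To make the filtering step concrete, here is a minimal Python sketch of a length and quality filter. It is not how Filtlong or NanoFilt are implemented; it only illustrates the idea. The function names and the thresholds are hypothetical, and I assume uncompressed four-line FASTQ records with Phred+33 quality encoding. The mean quality is computed by averaging error probabilities rather than the raw Phred scores, so a few very bad bases pull the read's score down.

```python
import math


def mean_quality(quality_string, offset=33):
    """Mean read quality, averaged over per-base error probabilities."""
    if not quality_string:
        return 0.0
    # Convert each Phred score to an error probability before averaging.
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    return -10 * math.log10(sum(probs) / len(probs))


def keep_read(sequence, quality_string, min_len, max_len, min_q):
    """True if the read is inside the length window and good enough."""
    if not (min_len <= len(sequence) <= max_len):
        return False
    return mean_quality(quality_string) >= min_q


def filter_fastq(lines, min_len, max_len, min_q):
    """Yield (header, sequence, quality) for reads that pass the filter.

    Assumes plain four-line FASTQ records (hypothetical helper).
    """
    records = [line.rstrip("\n") for line in lines]
    for i in range(0, len(records) - 3, 4):
        header, seq, _, qual = records[i:i + 4]
        if keep_read(seq, qual, min_len, max_len, min_q):
            yield header, seq, qual
```

For example, `filter_fastq(open("reads.fastq"), min_len=1300, max_len=1700, min_q=10)` would keep reads of 1300 to 1700 bases with a mean quality of at least 10. The file name and the size window are placeholders; set the window around your own expected amplicon size.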
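
The last step, collapsing the classifications to one taxonomic rank and using that for diversity analysis, could look roughly like the Python sketch below. I assume here that the classifier output has been parsed into one dictionary per taxon, with a relative abundance and one key per rank; the field names `abundance`, `genus` and `species` and the example values are placeholders, so check them against the actual output table. The Shannon index is used only as one example of a diversity measure.

```python
import math
from collections import defaultdict


def collapse_to_rank(rows, rank):
    """Sum abundances of all rows that share the same label at `rank`."""
    totals = defaultdict(float)
    for row in rows:
        # Rows without a label at this rank are pooled together.
        label = row.get(rank) or "unclassified"
        totals[label] += row["abundance"]
    return dict(totals)


def shannon_index(abundances):
    """Shannon diversity H = -sum(p * ln p) over non-zero proportions."""
    total = sum(abundances)
    if total <= 0:
        return 0.0
    props = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in props)


# Made-up example rows, not real results.
rows = [
    {"species": "Escherichia coli", "genus": "Escherichia", "abundance": 0.5},
    {"species": "Escherichia fergusonii", "genus": "Escherichia", "abundance": 0.25},
    {"species": "Bacillus subtilis", "genus": "Bacillus", "abundance": 0.25},
]
genus_table = collapse_to_rank(rows, "genus")
genus_diversity = shannon_index(genus_table.values())
```

Collapsing to a higher rank always gives a diversity value at or below the species-level one, so it is worth deciding on the rank before comparing samples.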

Thomieh73 commented 3 months ago

This is in progress. I am currently testing the "amplicons" workflow, which uses Emu as the classification tool.