tleonardi / nanocompore

RNA modifications detection from Nanopore dRNA-Seq data
https://nanocompore.rna.rocks
GNU General Public License v3.0

Parameters recommendation to analyse SARS-CoV-2 data generated using Oxford Nanopore DRS protocol #226

Closed Rohit-Satyam closed 10 months ago

Rohit-Satyam commented 10 months ago

Hi

I have a question about how the tools are run in the nanocompore pipeline. Your protocol is the most recent one, but I would still like to confirm whether I need to make any changes based on what is described in one of the papers here, such as:

  1. Identified viral reads mapped to the viral genome using minimap2 options “-k 8 -w 1 --splice -g 30000 -G 30000 -A1 -B2 -O2,24 -E1,0 -C0 -z 400,200 --no-end-flt --junc-bonus=100 -F 40000 -N 32 --splice-flank=no --max-chain-skip=40 -un --junc-bed=FILE -p 0.7”.
  2. Chimeric read removal: chimeric reads were filtered out according to the flag from minimap2. Is this step important?

Apologies, but I am trying to follow good practices, and if these parameters are important I would like to include them in my custom shell script.

lmulroney commented 10 months ago

Hi @Rohit-Satyam,

No need to apologise for wanting to follow good practices; that should be commended.

Step 1 is really important when aligning data to the SARS-CoV-2 genome: if you align to the genome, you should use the alignment options you listed, as detailed in the Kim et al. paper (your link). However, nanocompore is designed for alignments to the transcriptome rather than the genome. Its parallelisation and memory usage will not handle genome alignments well, especially if you have a lot of data. Running it on genome alignments is possible, but will likely require more memory and take longer than if you aligned the data to a transcriptome reference.

There are a few different SARS-CoV-2 transcriptome references available that are composed of the different sgRNA transcripts. You can use the default map-ont minimap2 parameters when mapping to a SARS-CoV-2 transcriptome reference. You can also include a BED file of the transcriptome reference that you use, to lift over the transcriptome coordinates to genome coordinates within nanocompore.
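As a rough illustration of the transcriptome mapping step, here is a minimal Python sketch that only assembles a minimap2 command line with the `map-ont` preset. The helper name `minimap2_transcriptome_cmd`, the file names and the thread count are placeholders made up for this example, not something nanocompore ships or requires; `-a`, `-x` and `-t` are standard minimap2 options.

```python
import shlex


def minimap2_transcriptome_cmd(reference_fasta, reads_fastq, threads=4):
    """Build the argument list for a map-ont alignment to a transcriptome reference.

    Hypothetical helper for illustration; it builds the command but does not run it.
    """
    return [
        "minimap2",
        "-a",              # write alignments in SAM format
        "-x", "map-ont",   # default preset for Oxford Nanopore reads
        "-t", str(threads),
        reference_fasta,   # transcriptome FASTA (placeholder name below)
        reads_fastq,       # basecalled reads (placeholder name below)
    ]


# Placeholder file names; substitute your own reference and reads.
cmd = minimap2_transcriptome_cmd("sarscov2_transcriptome.fa", "reads.fastq")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell script, or the list can be passed to `subprocess.run(cmd, check=True)` on a machine where minimap2 is installed.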

Step 2 didn't seem as impactful in our analysis, and is something you can choose to add or not depending on how much time you want to test these different parameters and on your biological question.
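If you do want to try the chimeric read filter, one possible reading of "according to the flag from minimap2" is to drop every read that has a supplementary alignment (SAM flag bit 0x800) or an `SA` tag. That interpretation is an assumption on my part, not necessarily the exact criterion used in the paper. A minimal sketch on plain SAM text, with made-up read and reference names:

```python
SUPPLEMENTARY = 0x800  # SAM flag bit marking a supplementary (chimeric) alignment line


def chimeric_read_names(sam_lines):
    """Return names of reads with a supplementary alignment or an SA tag."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        has_sa_tag = any(tag.startswith("SA:Z:") for tag in fields[11:])
        if flag & SUPPLEMENTARY or has_sa_tag:
            names.add(fields[0])
    return names


def drop_chimeric(sam_lines):
    """Keep header lines and all alignments of reads that are not chimeric."""
    lines = list(sam_lines)
    chimeric = chimeric_read_names(lines)
    return [
        line for line in lines
        if line.startswith("@") or line.split("\t")[0] not in chimeric
    ]


# Toy records with hypothetical names: read2 is split across two transcripts.
example = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t0\tsgRNA_N\t1\t60\t4M\t*\t0\t0\tACGT\t*",
    "read2\t0\tsgRNA_N\t1\t60\t2M2S\t*\t0\t0\tACGT\t*\tSA:Z:sgRNA_S,10,+,2S2M,60,0;",
    "read2\t2048\tsgRNA_S\t10\t60\t2H2M\t*\t0\t0\tGT\t*\tSA:Z:sgRNA_N,1,+,2M2S,60,0;",
]
kept = drop_chimeric(example)
```

Running the analysis once with and once without this filter is a simple way to check whether it changes anything for your biological question.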

I hope this helps, Logan

Rohit-Satyam commented 10 months ago

Thanks a lot, @lmulroney, for the insights. Assurance from experts like you is all I need. I will close this issue!