gaolabtools / scNanoGPS

Single cell Nanopore sequencing data for Genotype and Phenotype

Minimap2 parameters and secondary alignments #18

Closed Prakrithi-P closed 9 months ago

Prakrithi-P commented 9 months ago

I used scNanoGPS to analyze Nanopore data from 10X Visium cDNA, and the results looked good. However, I have a few questions about the minimap2 aligner and which parameters to use.

1) The scNanoGPS pipeline uses -ax splice in minimap2 by default. When I use that, some transcripts that 10X Visium reports as primary alignments show up as secondary or multimapping alignments with minimap2, although I would expect more confident mapping with long reads.

2) For some transcripts, the strandedness is not consistent between 10X Visium and minimap2.

Thanks, Prakrithi

shiauck commented 9 months ago

Hi Prakrithi,

Thank you for your feedback. I really appreciate you sharing your experience with the qPCR results. Unfortunately I'm not the author of minimap2, but I'll try to share my two cents.

According to the minimap2 documentation, the "map-ont" preset is described as: "Align noisy long reads of ~10% error rate to a reference genome. This is the default mode."

The "splice" preset is described as: "Long-read spliced alignment (-k15 -w5 --splice -g2k -G200k -A1 -B2 -O2,32 -E1,0 -b0 -C9 -z200 -ub --junc-bonus=9 --cap-sw-mem=0 --splice-flank=yes). In the splice mode, 1) long deletions are taken as introns and represented as the ‘N’ CIGAR operator; 2) long insertions are disabled; 3) deletion and insertion gap costs are different during chaining; 4) the computation of the ‘ms’ tag ignores introns to demote hits to pseudogenes."

To my understanding, "map-ont" is designed, roughly speaking, to align the query against the reference with a global alignment strategy, while "splice" is designed to find multiple local alignments of the query against the reference. That's why "splice" is suitable for mapping transcriptome reads, assuming that most mRNAs are spliced.
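To make the point about the ‘N’ CIGAR operator concrete, here is a minimal Python sketch that pulls intron lengths out of a CIGAR string. The function names and the example CIGAR string are made up for illustration; they are not part of scNanoGPS or minimap2.

```python
import re

# One CIGAR operation: a length followed by an operator from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
CIGAR_FULL = re.compile(r"(?:\d+[MIDNSHP=X])+")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operator) pairs."""
    if cigar == "*":  # SAM uses '*' when no CIGAR is available
        return []
    if not CIGAR_FULL.fullmatch(cigar):
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]


def intron_lengths(cigar):
    """Lengths of skipped reference regions ('N'), i.e. putative introns."""
    return [length for length, op in parse_cigar(cigar) if op == "N"]


def reference_span(cigar):
    """Number of reference bases covered, introns included."""
    return sum(length for length, op in parse_cigar(cigar) if op in "MDN=X")


# Hypothetical spliced alignment: two exonic blocks around a 1200 bp intron,
# with a small 2 bp deletion inside the second exon.
example = "50M1200N75M2D30M"
print(intron_lengths(example))
print(reference_span(example))
```

In a spliced alignment the large gaps appear as N operations, while small deletions inside an exon remain D operations, so the two can be told apart directly from the CIGAR string.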

As for why you see so many secondary alignments, my guess is that this is due to conserved functional domains shared by some exons.
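If it helps to quantify this, here is a rough Python sketch that tallies primary, secondary, and supplementary records in SAM output using the standard FLAG bits. The helper names and the toy records are invented for illustration, and the strand reported here is only the alignment strand from the FLAG field, which may not be the same thing as the transcript strand reported by another tool.

```python
from collections import Counter

# FLAG bits defined by the SAM specification.
UNMAPPED = 0x4
REVERSE = 0x10
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def classify_flag(flag):
    """Label a SAM record by alignment type based on its FLAG value."""
    if flag & UNMAPPED:
        return "unmapped"
    if flag & SECONDARY:
        return "secondary"
    if flag & SUPPLEMENTARY:
        return "supplementary"
    return "primary"


def alignment_strand(flag):
    """Strand the read was aligned to ('+' or '-'), from the FLAG field."""
    return "-" if flag & REVERSE else "+"


def tally_sam(lines):
    """Count alignment types over SAM text lines, skipping header lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        counts[classify_flag(int(fields[1]))] += 1
    return counts


# Toy records (made up): only the first columns matter for this tally.
toy_sam = [
    "@HD\tVN:1.6",
    "read1\t0\tchr1\t100\t60\t50M1200N75M",
    "read1\t272\tchr7\t500\t0\t125M",
    "read2\t16\tchr2\t300\t60\t90M",
    "read2\t2048\tchr2\t900\t60\t40M",
    "read3\t4\t*\t0\t0\t*",
]
print(dict(tally_sam(toy_sam)))
print(alignment_strand(272))
```

To run it on a real file, you could pass an open file handle to tally_sam, for example tally_sam(open("aligned.sam")), where aligned.sam is a placeholder for your own minimap2 output.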

Hope this helps.

Regards, Cheng-Kai