Closed kingtom2016 closed 1 year ago
Hey kingtom2016, very good point. By default LotuS2 will use a 97% id cutoff, also for minimap2. I can imagine that this cutoff might in some cases lead to false mapping of reads between samples. For OTUs, which are reads clustered at 97% id, I wouldn't see a problem; it is rather an issue for ASVs or zOTUs. So I have added a new flag "-backmap_id" that can be set to e.g. "-backmap_id 0.99" to only backmap reads at 99% id. I also changed the default behaviour so that 99% is used when ASVs or zOTUs are being clustered. I have pushed this to the git repo; if it is stable I'll push it to conda later. If you could give me some feedback on the new GitHub version (version 2.25), that would be much appreciated. Cheers, Falk
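To make the default rule described above concrete, here is a minimal Python sketch of how the effective backmapping identity could be chosen. It is not LotuS2's actual code: the function name `effective_backmap_id` and the `cluster_type` labels are hypothetical, and only the 0.97 and 0.99 values and the override behaviour come from the reply above.

```python
# Hypothetical illustration of the default rule described in the reply;
# this is NOT taken from the LotuS2 source code.
OTU_DEFAULT = 0.97       # default cutoff for 97% OTUs
ASV_ZOTU_DEFAULT = 0.99  # stricter default for ASVs and zOTUs


def effective_backmap_id(cluster_type, user_value=None):
    """Return the identity cutoff to use when backmapping reads.

    cluster_type: "OTU", "ASV" or "zOTU" (hypothetical labels).
    user_value:   value given via -backmap_id, or None if the flag is unset.
    """
    if user_value is not None:
        # An explicit -backmap_id always wins over the defaults.
        if not 0.0 < user_value <= 1.0:
            raise ValueError("backmap identity must be in (0, 1]")
        return user_value
    if cluster_type.lower() in {"asv", "zotu"}:
        return ASV_ZOTU_DEFAULT
    return OTU_DEFAULT


print(effective_backmap_id("ASV"))        # stricter default
print(effective_backmap_id("OTU"))        # original default
print(effective_backmap_id("OTU", 0.99))  # user override
```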
Thanks! It works well. I tested this parameter at 0.97 (the original setting), 0.99 and 1 using samples from five different habitats.
Compared with 0.97, setting 0.99 discards on average 8% of the reads (range 1% to 20%).
Compared with 0.99, setting 1 discards on average 30% of the reads (range 6% to 42%).
So a considerable number of reads is lost. :(
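For anyone who wants to repeat this comparison, the following is a minimal sketch of how the per-sample read loss between two -backmap_id settings could be computed. The sample names and read counts are made up purely for illustration (they are not the numbers from my test), and the function names are hypothetical.

```python
from statistics import mean


def percent_discarded(baseline, stricter):
    """Per-sample percentage of reads lost when moving from the
    baseline setting to the stricter setting."""
    losses = {}
    for sample, base_reads in baseline.items():
        kept = stricter.get(sample, 0)
        losses[sample] = 100.0 * (base_reads - kept) / base_reads
    return losses


def summarize(losses):
    """Return (mean, min, max) of the per-sample losses."""
    values = list(losses.values())
    return mean(values), min(values), max(values)


# Made-up total read counts per sample, for illustration only.
reads_at_097 = {"soil": 1000, "water": 2000}
reads_at_099 = {"soil": 900, "water": 1900}

losses = percent_discarded(reads_at_097, reads_at_099)
avg, lowest, highest = summarize(losses)
print(f"mean loss {avg:.1f}% (range {lowest:.1f}% to {highest:.1f}%)")
```

The per-sample totals would come from the column sums of the abundance table produced by each run.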
lotus2 -i $PWD -m $PWD/1_miSeqMap.sm.txt \
  -s /mnt/d/Myfile/DATA/beforework/lotus2/1sdm_miSeq_bio.txt \
  -o $output_fold \
  -p miSeq -amplicon_type SSU -tax_group bacteria \
  -forwardPrimer $front_f \
  -reversePrimer $front_r \
  -CL dada2 -id 1 -refDB SLV -taxAligner vsearch \
  -rdp_thr 0.7 -buildPhylo 0 -t 16 -sdmThreads 1 -lulu 1 -backmap_id $1
Yes, I would expect something like this; intuitively I would think that 100% id is too drastic. Another problem is that DADA2 does not natively report back the clusterings, but always requires backmapping ALL reads onto the ASVs. This differs from almost all other clustering approaches in LotuS2 and might therefore give them a slight advantage. However, mid-quality reads are always backmapped in all clustering approaches, as they are ignored for the clustering itself (for the most part they would just exaggerate diversity because they are noisy).
Thanks for your rapid reply. I believe it is not a big deal now. Wish you a good day :)
Lotus2 is a great tool!
I have recently noticed that the table generated by Lotus2 shows more ASVs shared between samples than tables from other tools such as QIIME2. Have you seen this phenomenon as well? Are these false positives, or are the ASVs really shared between samples? If they are false positives, changing the minimap2 parameters may alleviate the problem. I guess the threshold needs to be tuned for the different Lotus2 settings (mapping reads to ASVs and to 97% OTUs intuitively requires different parameters).
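To show what I mean by "shared", here is a minimal sketch of how shared ASVs could be counted from an abundance table, assuming the table is loaded as a mapping from ASV to per-sample read counts. The table below is made up for illustration, and `count_shared` and `min_reads` are hypothetical names. Raising `min_reads` is one rough way to check whether the sharing is driven by a handful of reads per sample, which is what I would expect from spurious backmapping, although that interpretation is only an assumption.

```python
def count_shared(table, min_reads=1):
    """Count ASVs that are present in at least two samples.

    table:     dict mapping ASV id -> dict of sample -> read count.
    min_reads: minimum count for an ASV to be called present in a sample.
    """
    shared = 0
    for counts in table.values():
        present_in = sum(1 for c in counts.values() if c >= min_reads)
        if present_in >= 2:
            shared += 1
    return shared


# Made-up abundance table, for illustration only.
table = {
    "ASV1": {"A": 500, "B": 450},  # abundant in both samples
    "ASV2": {"A": 300, "B": 2},    # only 2 reads in sample B
    "ASV3": {"A": 0, "B": 120},    # found in sample B only
}

print(count_shared(table, min_reads=1))   # counts ASV1 and ASV2
print(count_shared(table, min_reads=10))  # counts ASV1 only
```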