hildebra / lotus2

Amplicon sequencing pipelines suitable for SSU (16S, 18S), LSU (23S, 28S) and ITS.
http://lotus2.earlham.ac.uk/
GNU General Public License v3.0

contamination of mitochondrion #40

Closed codechenx closed 11 months ago

codechenx commented 1 year ago

I found that LotuS2 (v2.23), or the RDP classifier, fails to identify mitochondrial contamination. For example, some OTUs are actually mitochondrial sequences, but their taxonomy assignment is still bacterial. Below is the BLAST result for one OTU:

[Screenshot: BLAST result for the OTU]

The command line I used: lotus2 -t 50 -i data -m 16s_map.txt -s sdm_miSeq.txt -o uparse -CL uparse sdm_miSeq.txt

hildebra commented 1 year ago

Hey, this is possible and depends on the RDP database, which is not something we control in LotuS2 directly. However, you can specifically filter out mitochondrial OTUs if you provide a reference database of mitochondrial sequences using this flag:

-offtargetDB Remove likely contaminant OTUs/ASVs based on alignment to provided fasta. This option is useful for low-bacterial biomass samples, to remove possible host genome contaminations (e.g. human/mouse genome)

I don't have a good reference, but just googling I found this resource: https://www.ncbi.nlm.nih.gov/genome/organelle/

If you have a good, diverse reference database, I would be very happy to include it in the LotuS2 installation as a default option.

best, Falk
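Editor's sketch: to make the suggestion concrete, the snippet below assembles a lotus2 call that adds -offtargetDB to the options from the original command. It is only an illustration. The helper name build_lotus2_cmd and the file name mito_refs.fasta are hypothetical, and the FASTA of mitochondrial references has to be supplied by the user, since no such database is named in this thread. The stray trailing argument from the original command is left out.

```python
import shlex


def build_lotus2_cmd(input_dir, map_file, sdm_opts, out_dir,
                     offtarget_fasta, threads=1, clustering="uparse"):
    """Assemble a lotus2 argument list (hypothetical helper).

    Only flags that appear in this thread are used:
    -t, -i, -m, -s, -o, -CL and -offtargetDB.
    """
    return [
        "lotus2",
        "-t", str(threads),
        "-i", input_dir,
        "-m", map_file,
        "-s", sdm_opts,
        "-o", out_dir,
        "-CL", clustering,
        # FASTA of mitochondrial sequences; the file name is an assumption
        "-offtargetDB", offtarget_fasta,
    ]


cmd = build_lotus2_cmd(
    "data", "16s_map.txt", "sdm_miSeq.txt", "uparse",
    offtarget_fasta="mito_refs.fasta",  # hypothetical file
    threads=50,
)
# Print a shell-ready command line; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Keeping the arguments as a list avoids quoting problems if any of the paths contain spaces.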

codechenx commented 1 year ago

Hi Falk, thanks for your response. I also don't have a good reference, so I just removed these OTUs manually.
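Editor's sketch: the manual removal described above could be scripted roughly as follows, assuming the OTU sequences were compared against mitochondrial references with BLAST in tabular format (-outfmt 6: qseqid, sseqid, pident, length, ...) and that the OTU table is tab-separated with OTU IDs in the first column. The function names, the identity and length thresholds, and the toy data are all illustrative assumptions, not part of LotuS2.

```python
import csv
import io


def mito_otus_from_blast(blast_tsv, min_identity=97.0, min_align_len=200):
    """Return query IDs with at least one hit passing both thresholds.

    Expects BLAST tabular output (-outfmt 6), whose first four columns
    are qseqid, sseqid, pident and alignment length.
    The default thresholds are arbitrary illustrative values.
    """
    flagged = set()
    for row in csv.reader(blast_tsv, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        qseqid, pident, length = row[0], float(row[2]), int(row[3])
        if pident >= min_identity and length >= min_align_len:
            flagged.add(qseqid)
    return flagged


def drop_otus(otu_table_tsv, flagged):
    """Return the header plus all OTU table rows whose ID is not flagged."""
    reader = csv.reader(otu_table_tsv, delimiter="\t")
    kept = [next(reader)]  # header row
    kept.extend(row for row in reader if row and row[0] not in flagged)
    return kept


# Toy data, invented for illustration only.
blast_hits = io.StringIO(
    "OTU_3\tmito_ref_1\t99.2\t250\t2\t0\t1\t250\t1\t250\t1e-120\t450\n"
    "OTU_7\tmito_ref_2\t85.0\t120\t18\t0\t1\t120\t5\t124\t1e-20\t90\n"
)
otu_table = io.StringIO(
    "OTU\tS1\tS2\n"
    "OTU_1\t10\t4\n"
    "OTU_3\t7\t0\n"
    "OTU_7\t2\t9\n"
)

flagged = mito_otus_from_blast(blast_hits)
filtered = drop_otus(otu_table, flagged)
for row in filtered:
    print("\t".join(row))
```

In the toy data only OTU_3 passes both thresholds, so it is the only row dropped; OTU_7 has a weak, short hit and is kept.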