hildebra / lotus2

Amplicon sequencing pipelines suitable for SSU (16S, 18S), LSU (23S, 28S) and ITS.
http://lotus2.earlham.ac.uk/
GNU General Public License v3.0

only get a fraction of OTUs back when running lotus2 -taxOnly #51

Closed (slambrechts closed 6 months ago)

slambrechts commented 9 months ago

Hi,

When I run lotus2 -taxOnly with

lotus2 -taxOnly /kyukon/scratch/gent/vo/001/gvo00123/vsc46214/CRABS/otus92.fa -o lotustax -refDB Olig01_Annelida_crabs_db.fasta -tax4refDB Olig01_Annelida_crabs_db.tax -taxAligner blast -ITSx 0 -LCA_idthresh 94,80,75,70,65,60 -lulu 0 -t 64

I only get 99 of 152 OTUs back in the resulting otus92.fa.hier file. The other 53 are completely missing from the file.
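To pin down exactly which OTUs are absent, the IDs in the input FASTA can be compared with the IDs in the .hier table. The following is a minimal sketch, not part of lotus2; it assumes the .hier file is tab-separated, starts with one header line, and carries the OTU ID in its first column. The function names are hypothetical.

```python
import csv


def fasta_ids(fasta_path):
    """Return the sequence IDs (first word after '>') of a FASTA file, in file order."""
    ids = []
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    ids.append(fields[0])
    return ids


def hier_ids(hier_path):
    """Return the set of IDs found in the first column of a tab-separated table.

    Assumption: the first line is a header and column 1 holds the OTU ID.
    """
    with open(hier_path, newline="") as handle:
        rows = csv.reader(handle, delimiter="\t")
        next(rows, None)  # skip the assumed header line
        return {row[0] for row in rows if row}


def missing_otus(fasta_path, hier_path):
    """List OTUs that are in the FASTA but have no row in the .hier table."""
    present = hier_ids(hier_path)
    return [otu for otu in fasta_ids(fasta_path) if otu not in present]
```

Calling `missing_otus("otus92.fa", "lotustax/otus92.fa.hier")` would then return the omitted IDs; the output path here is a guess based on the `-o lotustax` option above.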

Do you have any idea why this is happening?

Best, Sam

slambrechts commented 9 months ago

I checked OTU57 manually using the NCBI blastn server. Its best hit there is GU901768, which I confirmed is included in the custom database I used with lotus2. However, according to the tax.0.blast file in the /ExtraFiles output folder, GU901768 does not appear among the hits for OTU57. Is it possible this has something to do with the -max_target_seqs setting used by blastn in lotus2?
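A check like the one described above can be scripted rather than done by eye. This is a rough sketch under the assumption that tax.0.blast is in BLAST tabular format (tab-separated, query ID in column 1, subject ID in column 2); the helper names are made up for illustration.

```python
from collections import defaultdict


def subjects_by_query(blast_path):
    """Map each query ID to the list of subject IDs reported for it, in file order.

    Assumption: tab-separated BLAST tabular output, query in column 1,
    subject in column 2. Blank lines and '#' comment lines are skipped.
    """
    hits = defaultdict(list)
    with open(blast_path) as handle:
        for line in handle:
            if not line.strip() or line.startswith("#"):
                continue
            cols = line.rstrip("\n").split("\t")
            if len(cols) >= 2:
                hits[cols[0]].append(cols[1])
    return hits


def has_subject(hits, query, accession):
    """True if any subject ID reported for `query` contains `accession`.

    Substring matching is used because subject IDs may carry a version
    suffix or other decoration around the accession.
    """
    return any(accession in subject for subject in hits.get(query, []))
```

With `hits = subjects_by_query("tax.0.blast")`, the call `has_subject(hits, "OTU57", "GU901768")` answers the question directly, and `len(hits["OTU57"])` gives the number of reported hits. If many queries share the same maximum hit count, that may hint that the hit list is being capped, although it does not prove it.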

hildebra commented 8 months ago

Hey Sam, can you try the options "-keepUnclassified 1 -keepTmpFiles 1"? I suspect these 53 OTUs did not have a good hit against the ref database. The -keepTmpFiles flag retains the .m8 alignment file, so you can check in that file what happened to these OTUs. hth, Falk
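One way to inspect the retained .m8 file is to record the best percent identity per OTU and list the OTUs with no alignment at all. The sketch below assumes the .m8 file uses the standard BLAST tabular layout (query ID in column 1, percent identity in column 3). The function names and the identity cutoff are illustrative only and do not reproduce how the LCA step decides.

```python
def best_identity(m8_path):
    """Return {query ID: highest percent identity} from a tabular alignment file.

    Assumption: tab-separated columns with the query ID first and the
    percent identity third, as in BLAST tabular output.
    """
    best = {}
    with open(m8_path) as handle:
        for line in handle:
            if not line.strip() or line.startswith("#"):
                continue
            cols = line.rstrip("\n").split("\t")
            if len(cols) < 3:
                continue
            query, pident = cols[0], float(cols[2])
            if pident > best.get(query, -1.0):
                best[query] = pident
    return best


def without_good_hit(otu_ids, best, min_identity):
    """Split OTUs into (no alignment at all, best hit below min_identity)."""
    no_hit = [otu for otu in otu_ids if otu not in best]
    weak = [otu for otu in otu_ids if otu in best and best[otu] < min_identity]
    return no_hit, weak
```

For example, `without_good_hit(all_ids, best_identity("file.m8"), 60.0)` uses 60, the lowest value passed to -LCA_idthresh in the original command, purely as an example cutoff; `all_ids` and the file name are placeholders.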

slambrechts commented 6 months ago

Hi Falk,

I used -keepUnclassified 1 -keepTmpFiles 1, but the problem persists: I still get only a fraction of the OTUs back in the resulting otus92.fa.hier file. Is there an option to change this behavior, so that lotus2 outputs "unknown" (or something similar) at all taxonomic ranks for the OTUs that are currently omitted?

Cheers, Sam

hildebra commented 6 months ago

Hey Sam, I think this behaviour is hard-coded in the LCA program. I'll have to check the C++ source code. Did you find the .m8 file, which is the raw lambda output? Please note that in the latest lotus2 we switched to a new lambda version. best, Falk