torognes / vsearch

Versatile open-source tool for microbiome analysis

vsearch fails to assign taxonomy for Fungi ITS seqs #489

Closed cam315 closed 2 years ago

cam315 commented 2 years ago

I ran vsearch --sintax test.fasta --db fungi.ITS.fasta.gz --threads 8 --tabbedout otus.tax --sintaxcutoff 0.7 --strand both on a subset of fungal ITS sequences extracted from the UNITE (https://unite.ut.ee/) ITS reference database. However, the output confused me:

s4
s2
s1
s5	k:Eukaryota(1.00)	+	k:Eukaryota
s3	k:Eukaryota(1.00)	+	k:Eukaryota

These five ITS sequences should all receive a clear taxonomic assignment, since they come from the reference database itself. When I use the same approach with 16S sequences, all of them are assigned correctly. Is there a trick I am missing, or is vsearch --sintax not well suited to short sequences such as ITS (e.g. 150 bp)? I look forward to your advice and suggestions.

cam315 commented 2 years ago

Solved. The header lines of the reference sequences were incorrectly formatted: I had used ';' instead of ',' to separate the ranks after the "tax=" identifier.
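
For anyone hitting the same problem, here is a minimal Python sketch of the fix. It assumes the broken headers look like ">ID;tax=k:Fungi;p:Ascomycota;...;" (the thread does not show the actual headers, so this shape is a guess), and the function name and the example IDs and taxa are made up for illustration. It rewrites the rank list after "tax=" as a comma-separated list and leaves everything else alone.

```python
import re

# A rank field is a one-letter rank code, a colon, and a name, e.g. "k:Fungi".
RANK_FIELD = re.compile(r"[dkpcofgst]:.+")


def fix_sintax_header(header: str) -> str:
    """Join the rank fields after ';tax=' with commas instead of semicolons.

    Hypothetical helper: assumes ranks were separated by ';' by mistake.
    Headers without a ';tax=' annotation are returned unchanged.
    """
    head, marker, tail = header.partition(";tax=")
    if not marker:
        return header
    # Split on either separator so already-correct headers pass through intact.
    pieces = [p for p in re.split(r"[;,]", tail) if p]
    ranks = [p for p in pieces if RANK_FIELD.fullmatch(p)]
    extras = [p for p in pieces if not RANK_FIELD.fullmatch(p)]
    fixed = head + ";tax=" + ",".join(ranks) + ";"
    if extras:
        # Keep any other annotations (e.g. size=10) after the taxonomy.
        fixed += ";".join(extras) + ";"
    return fixed


def fix_fasta_lines(lines):
    """Yield FASTA lines with header lines repaired; sequence lines pass through."""
    for line in lines:
        text = line.rstrip("\n")
        yield fix_sintax_header(text) if text.startswith(">") else text


if __name__ == "__main__":
    broken = [">s1;tax=k:Fungi;p:Ascomycota;c:Sordariomycetes;", "ACGTACGT"]
    for fixed_line in fix_fasta_lines(broken):
        print(fixed_line)
    # prints:
    # >s1;tax=k:Fungi,p:Ascomycota,c:Sordariomycetes;
    # ACGTACGT
```

Taxon names that themselves contain ',' or ';' would need extra handling, so check a few converted headers by eye before rebuilding the database.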

torognes commented 2 years ago

Good to know that you solved the problem.