Closed npechl closed 5 months ago
Hi Nikos,
Thanks for your interest in using this software.
For the selection of reference sequences, the important thing is that they are trimmed precisely to the amplicon you are sequencing (including the primer region). Beyond that, it's theoretically ideal to use a set that is phylogenetically diverse so that the HMMs are optimally representative from the start. In practice, however, I've not seen huge effects between different sets of input references.
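To illustrate what "trimmed precisely to the amplicon, primer region included" can look like, here is a minimal Python sketch. It is not part of MetaCurator and is not necessarily how the reference sets were prepared. The function names are hypothetical, the primers in the usage example are toy placeholders rather than real ITS1 primers, and it assumes the primers match the reference exactly (apart from IUPAC degenerate bases).

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse-complement a sequence, including IUPAC degenerate bases."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Turn a (possibly degenerate) primer into a regex pattern."""
    return "".join(IUPAC[base] for base in primer.upper())


def trim_to_amplicon(seq, fwd_primer, rev_primer):
    """Return seq trimmed to the amplicon with both primer sites kept.

    Returns None if either primer site is not found (hypothetical helper,
    exact matching only; no mismatches or indels are tolerated).
    """
    seq = seq.upper()
    fwd = re.search(primer_regex(fwd_primer), seq)
    if fwd is None:
        return None
    # The reverse primer anneals to the opposite strand, so search for its
    # reverse complement downstream of the forward primer site.
    downstream = seq[fwd.end():]
    rev = re.search(primer_regex(reverse_complement(rev_primer)), downstream)
    if rev is None:
        return None
    return seq[fwd.start():fwd.end() + rev.end()]


# Toy example with placeholder primers (not real ITS1 primers).
trimmed = trim_to_amplicon("TTTAACGGGGACCTTT", "AAY", "GGT")
print(trimmed)
```

Real references often carry mismatches in the primer sites, so in practice a tolerant aligner or a dedicated primer-trimming tool is likely a better fit than exact regex matching; the sketch only shows where the cut points fall.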
For question 2, this is not necessarily unexpected. Some input sequences are junk and, even with non-erroneous sequences, there are no curation methods with 100 percent sensitivity for identifying the amplicon region of interest, especially for a more poorly conserved region like ITS1. One thing I'd recommend is to play around with -is, -cs and perhaps -e options.
Best, Rodney
Thank you for your quick response. I'll keep experimenting with MetaCurator and come back to this in case I have further questions.
Hi @RTRichar,
Thank you for developing MetaCurator. I am using MetaCurator to build a reference database on ITS1, and I have a couple of questions.
1. [...] .fasta file an appropriate way to follow?
2. [...] TestMetaCurator [...]). How can I ensure that these species are accurately represented in the final curated sequences and taxonomy?
Thank you in advance for your time!
Nikos