I want to build a custom reference database. From what I understand, SILVA offers two downloads: NR99 (clustered at 99 % identity) and Ref (not clustered).
Usually, I would download the Ref file and cluster the sequences at 95 % identity with Vsearch. However, the matam_db_preprocessing.py script also clusters the sequence file it is given. So my question is: if I run this script on an already clustered database, will it re-cluster those sequences? If so, can we simply give MATAM the unclustered database and have it cluster at a user-specified identity?
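For reference, the Vsearch step I usually run looks roughly like the sketch below. The file names are placeholders, and `vsearch_cluster_cmd` is just a hypothetical helper that assembles the command; nothing here reflects how MATAM itself does its clustering.

```python
import shlex


def vsearch_cluster_cmd(fasta_in, centroids_out, identity=0.95):
    """Build the vsearch argument list for clustering at a given identity.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must be a fraction in (0, 1]")
    return [
        "vsearch",
        "--cluster_fast", fasta_in,      # input FASTA (placeholder name below)
        "--id", f"{identity:.2f}",       # identity threshold as a fraction
        "--centroids", centroids_out,    # one representative per cluster
    ]


# Placeholder file names, not the actual SILVA release file names.
cmd = vsearch_cluster_cmd("SILVA_Ref.fasta", "SILVA_Ref_clustered95.fasta")
print(shlex.join(cmd))
# To actually run it (requires vsearch on the PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

The centroids file written by this step is what I would otherwise pass on as the pre-clustered database.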