Closed: michelledesontje closed this issue 1 year ago
Actually, I found out that I am not losing them; instead, the sample replicates are treated as one sample. For example, samples A1, A2, and A3 are presented in the OTU table as sample 1. Is it an average or a sum? And why is it happening?
Hello, your bug report does not contain enough information for us to understand and replicate your issue.
Ideally, we would need a minimal example showing how your sample replicate names are merged into a single name in vsearch's output. For example, here is a fasta file made of sequences from three samples A1, A2, and A3:
printf ">s1;size=2;sample=A1;\nA\n>s2;size=1;sample=A2;\nA\n>s3;size=4;sample=A3;\nA\n"
>s1;size=2;sample=A1;
A
>s2;size=1;sample=A2;
A
>s3;size=4;sample=A3;
A
Cluster with vsearch (the input sequences are identical, so we expect a single OTU):
printf ">s1;size=2;sample=A1;\nA\n>s2;size=1;sample=A2;\nA\n>s3;size=4;sample=A3;\nA\n" |\
vsearch \
--cluster_size - \
--minseqlength 1 \
--quiet \
--id 0.97 \
--strand plus \
--sizein \
--sizeout \
--relabel OTU_ \
--otutabout -
vsearch produces the expected OTU table:
#OTU ID A1 A2 A3
OTU_1 2 1 4
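When replicates seem to collapse into one column, it can help to list the sample names that are actually present in the fasta headers before clustering. The following is only a sketch: `list_samples` is a hypothetical helper (not part of vsearch), and it assumes the headers carry a `;sample=NAME;` annotation like the example above. Headers annotated differently will not be picked up by it.

```shell
# Hypothetical helper: print the distinct sample= annotations found
# in the fasta headers read from standard input.
# Assumes headers look like ">s1;size=2;sample=A1;".
list_samples() {
    grep '^>' | grep -o 'sample=[^;]*' | sort -u
}

# Same three-sample example as above
printf ">s1;size=2;sample=A1;\nA\n>s2;size=1;sample=A2;\nA\n>s3;size=4;sample=A3;\nA\n" |
    list_samples
```

If the replicates A1, A2, and A3 do not show up as three separate names here, the merging most likely comes from the way the headers are annotated upstream rather than from the clustering step.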
Hello everyone,
I am creating an OTU table using the --cluster_size command. However, when I look at the output OTU table, half of the samples are missing. Why could that be?
Thank you!