Closed xuanji2017 closed 2 years ago
Great question! The genotyping step applies a filter to clusters, called "--filter-clusters-inferred-assembly". It removes clusters that were never identified from an assembly, meaning they were only found in the reference. You can remove this filter if you build your own custom Snakemake pipeline.
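If you want to see exactly which clusters were dropped between two output files, a small script along these lines might help. This is only a rough sketch: it assumes both files are tab-separated with a header row and share an ID column, and the column name `cluster` used in the usage comment is a guess, so check the headers of your own files first. The helper names `unique_values` and `dropped_values` are made up for this example and are not part of the tool.

```python
import csv


def unique_values(path, column):
    """Return the set of distinct values in one column of a tab-separated file."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row[column] for row in reader}


def dropped_values(before_path, after_path, column):
    """Return values present in the earlier file but absent from the later one."""
    return unique_values(before_path, column) - unique_values(after_path, column)


# Example usage (the column name "cluster" is an assumption):
# dropped = dropped_values(
#     "01.clusterseq.GCA_000210735.tsv",
#     "03.summarize.GCA_000210735.clusters.tsv",
#     "cluster",
# )
# print(len(dropped), "clusters appear in step 01 but not in step 03")
```

Comparing sets of IDs rather than just the totals shows which entries differ, which you can then check against the filter described above.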
Hi, thank you for making this great tool. I finally got the 03. results folder. But when I checked the number of unique clusters in "01.clusterseq.GCA_000210735.tsv", I found that it is not the same as the number of clusters in "03.summarize.GCA_000210735.clusters.tsv": for example, 1331 vs 1234. The same is true for the number of groups. In addition, the number of unique inferred_seq values in "01.clusterseq.GCA_000210735.tsv" is not the same as the number of contigs in "04.makefasta.GCA_000210735.all_seqs.fna". Do you have any explanation for this? Thanks a lot!