Closed ireneortega closed 1 year ago
This is probably fine. It's just ignoring any sites with multiple alternate alleles in the distance calculation. If you want to include them, you can split them into separate sites using bcftools; see this part of the docs: https://pyseer.readthedocs.io/en/master/usage.html?highlight=split#snps-and-indels
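As a rough sketch of what that could look like (not taken from the pyseer docs verbatim): the first helper counts records whose ALT column lists more than one allele, which should roughly match the number of "Skipping" messages, and the second splits them with `bcftools norm`. The function names and the output file name `core.split.vcf` are made up for illustration, and this assumes bcftools is installed.

```shell
# Count VCF records whose ALT column (field 5) lists more than one allele.
count_multiallelic() {
    awk -F'\t' '!/^#/ && $5 ~ /,/ { n++ } END { print n + 0 }' "$1"
}

# Split multi-allelic records into one biallelic record per ALT allele.
# $1 is the input VCF, $2 is the output VCF (hypothetical name in the usage below).
split_multiallelic() {
    bcftools norm -m -any -o "$2" "$1"
}

# Usage (assumes core.vcf and samples.txt are in the current directory):
#   count_multiallelic core.vcf
#   split_multiallelic core.vcf core.split.vcf
#   similarity_pyseer --vcf core.split.vcf samples.txt > kinship_matrix.txt
```

If the count is small relative to the total number of sites, leaving the file as it is should make little difference to the similarity matrix.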
When I run the command:
similarity_pyseer --vcf core.vcf samples.txt > kinship_matrix.txt
I get several messages of the type:
Multiple alleles chromosome_position. Skipping
Should I be worried about them? I got the core.vcf file with snippy-core instead of the gene presence/absence matrix from roary as you suggested. Did I do this correctly?