mgalardini / pyseer

SEER, reimplemented in python 🐍🔮
http://pyseer.readthedocs.io
Apache License 2.0

Multiple alleles similarity_pyseer #231

Closed: ireneortega closed this issue 1 year ago

ireneortega commented 1 year ago

When I run the command:

similarity_pyseer --vcf core.vcf samples.txt > kinship_matrix.txt

I got several messages of the type: Multiple alleles chromosome_position. Skipping

Should I be worried about them? I generated the core.vcf file with snippy-core rather than using the roary gene presence/absence matrix that you suggested. Did I do this correctly?

johnlees commented 1 year ago

This is probably fine. It's just ignoring any sites with multiple alternate alleles in the distance calculation. If you want to include them, you can split them into separate sites using bcftools; see this part of the docs: https://pyseer.readthedocs.io/en/master/usage.html?highlight=split#snps-and-indels
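
For anyone landing here with the same warning, a rough sketch of the two steps discussed above: checking how many sites are affected, then splitting them. This is not taken from the pyseer source. The helper names count_multiallelic and split_multiallelic and the output file name core.split.vcf are made up for illustration. The sketch assumes an uncompressed, tab-delimited VCF and bcftools on the PATH.

```shell
# Count records whose ALT column (field 5) lists more than one allele.
# These are the kind of sites the "Multiple alleles ... Skipping" message refers to.
count_multiallelic() {
    awk -F'\t' '!/^#/ && $5 ~ /,/ { n++ } END { print n + 0 }' "$1"
}

# Split multiallelic records into one record per alternate allele.
# Assumes bcftools is installed; $1 is the input VCF, $2 the output VCF.
split_multiallelic() {
    bcftools norm -m -any "$1" > "$2"
}

# Example usage (file names are illustrative):
#   count_multiallelic core.vcf
#   split_multiallelic core.vcf core.split.vcf
#   similarity_pyseer --vcf core.split.vcf samples.txt > kinship_matrix.txt
```

If the count is small relative to the total number of sites, skipping them is unlikely to change the similarity matrix much, so splitting is optional.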