tskit-dev / pyslim

Tools for dealing with tree sequences coming to and from SLiM.
MIT License

Error in VCF output: More than 9 alleles not currently supported #235

Closed (mariharris closed this issue 2 years ago)

mariharris commented 2 years ago

Hi. I'm trying to output a VCF file after performing recapitation, simplification, and mutation, but I keep getting this error: "More than 9 alleles not currently supported. Please open an issue on GitHub if this limitation affects you." I run a selective sweep from recurrent mutations in SLiM and then recapitate and add neutral mutations using pyslim. It seems that the site with more than 9 alleles corresponds to the selected site, where beneficial mutations were introduced in SLiM.

petrelharp commented 2 years ago

Thanks for the report! Hm, let's see - that code is in tskit, so maybe we should move this issue over there... but first: what would you like the VCF file to look like for your use case? Will you want to distinguish all of your various >9 alleles in the resulting file? The ancestral and derived states are stored by SLiM as comma-separated lists of mutation IDs (see explanation here) - is this what you want? You might also have a look at convert_alleles for another option.

mariharris commented 2 years ago

I don't need to distinguish the various alleles in the file. I think the problem was that, since I am adding recurrent mutations at the same position, the mutations are stored as distinct alleles. I used convert_alleles and I was able to output a VCF file. Thanks!
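
For readers hitting the same error, here is a minimal plain-Python sketch of why recurrent mutations at one position can push a site past 9 alleles. It assumes, as described above, that each derived state is a comma-separated list of SLiM mutation IDs. The ID values, the sample list, and both helper functions are made up for illustration. It does not reproduce what convert_alleles or tskit's VCF writer do internally.

```python
# Hypothetical allelic states at the selected site, one per sample genome.
# "" stands for the ancestral state; every other string is a
# comma-separated list of made-up SLiM mutation IDs.
genotype_states = [
    "", "101", "102", "103", "104", "105",
    "106", "107", "108", "109", "110", "101,102",
]


def distinct_alleles(states):
    """Return the distinct allele strings in order of first appearance."""
    seen = []
    for state in states:
        if state not in seen:
            seen.append(state)
    return seen


def collapse_to_biallelic(states):
    """Map the ancestral state to 0 and any derived state to 1."""
    return [0 if state == "" else 1 for state in states]


alleles = distinct_alleles(genotype_states)
collapsed = collapse_to_biallelic(genotype_states)

# Each recurrent mutation has its own ID, so each one is a separate allele.
print(len(alleles))
# Once the IDs are no longer distinguished, only two states remain.
print(sorted(set(collapsed)))
```

The point is only that each new mutation ID adds another allele string at the site, so anything that reduces the number of distinct strings brings the site back under the limit.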

petrelharp commented 2 years ago

Ok - so if convert_alleles works for your use case, I'll close this? Re-open if not. Thanks!