harrispopgen / mutyper

Ancestral k-mer mutation types for SNP data
https://harrispopgen.github.io/mutyper/
MIT License

numpy implementation for spectra #38

Closed Lukez-pi closed 1 year ago

Lukez-pi commented 1 year ago

Used the numpy methods of cyvcf2 (mentioned in cyvcf2 issue #108) to speed up spectra. On a test dataset, the new implementation reduced the run time from 13 minutes to 3 minutes.
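
For readers following along, here is a rough sketch of the vectorized idea, not the code in this PR. It assumes each variant supplies a genotype array laid out the way cyvcf2's `variant.genotype.array()` returns it: one row per sample, one column per allele, and a final column holding the phasing flag. The function name `accumulate_spectra`, the `(mutation_type, genotypes)` input pairs, and the example mutation type strings are hypothetical and exist only for illustration.

```python
from collections import defaultdict

import numpy as np


def accumulate_spectra(variants, n_samples):
    """Sum derived-allele counts per sample for each mutation type.

    `variants` is an iterable of (mutation_type, genotypes) pairs, where
    `genotypes` has shape (n_samples, ploidy + 1) and its last column is
    the phasing flag (assumed layout, see the note above).
    """
    spectra = defaultdict(lambda: np.zeros(n_samples, dtype=np.int64))
    for mutation_type, genotypes in variants:
        # Drop the trailing phasing column, then count alternate alleles
        # for all samples at once instead of looping over samples.
        spectra[mutation_type] += genotypes[:, :-1].sum(axis=1)
    return dict(spectra)


# Illustrative input: two samples, diploid, three variants.
example_variants = [
    ("ACA>ATA", np.array([[0, 1, 1], [1, 1, 0]], dtype=np.int16)),
    ("ACA>ATA", np.array([[0, 0, 0], [0, 1, 1]], dtype=np.int16)),
    ("TCC>TTC", np.array([[1, 1, 1], [0, 0, 0]], dtype=np.int16)),
]
example_spectra = accumulate_spectra(example_variants, n_samples=2)
```

The per-variant work is a single array slice and sum, which is where a speed-up over a per-sample Python loop would be expected to come from.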

In addition, slightly modified the snps.vcf file under tests/test_data by setting some variants to phased, so that the tests catch any accidental addition of the phasing status (1 for phased) to the spectra counts.

Lukez-pi commented 1 year ago

You are right about the changes to the test VCF. Prior to the change, all variants were unphased (so the phasing indicator was 0), which means an implementation that "accidentally" counted the phasing indicators would still pass all test cases for spectra.
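
A small illustration of why the all-unphased test data could not catch this, assuming the genotype array keeps the phasing flag in its last column (as cyvcf2's `variant.genotype.array()` does). The two helper functions are hypothetical stand-ins, not mutyper code: one sums every column, the other sums only the allele columns.

```python
import numpy as np


def counts_including_phase_flag(genotypes):
    # Buggy variant: sums every column, so the phasing flag leaks into the count.
    return genotypes.sum(axis=1)


def counts_alleles_only(genotypes):
    # Intended variant: ignore the trailing phasing column.
    return genotypes[:, :-1].sum(axis=1)


# Two diploid samples; columns are allele 1, allele 2, phasing flag.
unphased = np.array([[0, 1, 0],
                     [1, 1, 0]], dtype=np.int16)

# Same genotypes, but marked as phased (flag = 1).
phased = unphased.copy()
phased[:, -1] = 1

# With unphased data both versions agree, so a test cannot tell them apart.
same_on_unphased = np.array_equal(
    counts_including_phase_flag(unphased), counts_alleles_only(unphased)
)

# With phased data the buggy version overcounts by one per sample.
same_on_phased = np.array_equal(
    counts_including_phase_flag(phased), counts_alleles_only(phased)
)
```

Here `same_on_unphased` is true and `same_on_phased` is false, which is the reason for marking some variants in snps.vcf as phased.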