PlantandFoodResearch / MCHap

Polyploid micro-haplotype assembly using Markov chain Monte Carlo simulation.
MIT License
Integer overflow caused by use of int8 for alleles in greedy_caller #117

Closed by timothymillar 3 years ago

timothymillar commented 3 years ago

The greedy_caller function is used to find an initial genotype for the CallingMCMC.fit() method. In version 0.5.0, greedy_caller is hard-coded to use int8 for alleles, which overflows when there are more than 128 unique haplotypes (int8 can only hold the indices 0 to 127). This does occur in some situations. The mcmc_sampler itself infers the genotype_trace dtype from the input (initial) genotype. The genotype trace has shape (n_chains, n_steps, ploidy), so its size is unlikely to be a problem even with a wider dtype, and there is no reason not to use int64 here.
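To illustrate the failure mode, here is a minimal NumPy sketch. It is not MCHap code: `allocate_trace` and all variable names are hypothetical stand-ins, and it only assumes that allele indices are stored in a NumPy array and that the trace takes its dtype from the initial genotype, as described above.

```python
import numpy as np

# Hypothetical allele indices for 130 unique haplotypes (0..129).
allele_indices = np.arange(130, dtype=np.int64)

# Casting to int8 silently wraps any index above 127 to a negative value.
as_int8 = allele_indices.astype(np.int8)
print(as_int8[126:130])  # the last two entries (128, 129) are now negative


def allocate_trace(initial_genotype, n_chains, n_steps):
    """Hypothetical helper: allocate a trace whose dtype follows the input."""
    ploidy = len(initial_genotype)
    shape = (n_chains, n_steps, ploidy)
    return np.empty(shape, dtype=initial_genotype.dtype)


# The same tetraploid initial genotype stored as int8 and as int64.
initial_int8 = np.array([0, 1, 2, 3], dtype=np.int8)
initial_int64 = initial_int8.astype(np.int64)

trace_int8 = allocate_trace(initial_int8, n_chains=2, n_steps=1000)
trace_int64 = allocate_trace(initial_int64, n_chains=2, n_steps=1000)

# The int64 trace is 8x larger, but still tiny in absolute terms.
print(trace_int8.dtype, trace_int8.nbytes)
print(trace_int64.dtype, trace_int64.nbytes)
```

With these illustrative sizes (2 chains, 1000 steps, ploidy 4) the int64 trace is 64,000 bytes, so the wider dtype costs very little.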

timothymillar commented 3 years ago

Fixed in #118