GenotypeLikelihoods.calcNumLikelihoods(numAlleles, ploidy), which is the method used to calculate the length of the PL array given the number of alleles and the ploidy, is currently written using an inefficient recursive implementation:
private static final int calcNumLikelihoods(int numAlleles, int ploidy) {
if (numAlleles == 1) {
return 1;
} else if (ploidy == 1) {
return numAlleles;
} else {
int acc = 0;
for(int k = 0; k <= ploidy; ++k) {
acc += calcNumLikelihoods(numAlleles - 1, ploidy - k);
}
return acc;
}
}
However, this method has an easy closed-form combinatorics solution using the formula for selection with replacement:
So (for example), for numAlleles = 5 and ploidy = 2, it is just (5 + 2 - 1) choose 2 = 6 choose 2 = 15
(Thanks to @fleharty and @kachulis for their help in deriving this!)
Since this method is called every time we create a PL array (which is very frequently!), it seems worth it to get rid of the recursion in favor of a one-line call out to a standard combinatorics function.
GenotypeLikelihoods.calcNumLikelihoods(numAlleles, ploidy)
, which is the method used to calculate the length of the PL array given the number of alleles and the ploidy, is currently written using an inefficient recursive implementation:However, this method has an easy closed-form combinatorics solution using the formula for selection with replacement:
So (for example), for numAlleles = 5 and ploidy = 2, it is just
(5 + 2 - 1) choose 2
=6 choose 2
= 15(Thanks to @fleharty and @kachulis for their help in deriving this!)
Since this method is called every time we create a PL array (which is very frequently!), it seems worth it to get rid of the recursion in favor of a one-line call out to a standard combinatorics function.