magicDGS / popgenlib

Population Genetics Java Library
MIT License

Use commons-math3 BinomialDistribution in NucleotideDiversityPoolSeq #23

Open magicDGS opened 7 years ago

magicDGS commented 7 years ago

In the first implementation of the nucleotide diversity correction for Pool-Seq (https://github.com/magicDGS/popgenlib/pull/19), the countProbability(final int readCount, final int coverage, final int poolSize, final int poolCount) method implements the binomial probability of readCount. commons-math3 provides a BinomialDistribution implementation that computes this probability with a faster algorithm.
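For reference, the quantity in question is the standard binomial probability mass function. The symbols below are only illustrative: $r$ stands for readCount, $n$ for coverage, and $p$ for the per-read success probability; how the method derives $p$ from poolSize and poolCount is not restated here and should be checked against the linked PR.

$$
P(X = r) = \binom{n}{r}\, p^{r}\, (1 - p)^{n - r}, \qquad 0 \le r \le n
$$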

However, BinomialDistribution must be instantiated with its parameters before its probability function can be called. Because we compute the probability often and for several distributions, the repeated instantiation may carry a performance penalty. We should check whether we can replace our method with new BinomialDistribution(coverage, (double) k / poolSize).probability(readCount), to clean up our implementation and rely on a faster algorithm.
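A rough way to run that check could look like the sketch below. It is not the library's code: the class `BinomialCheck`, the `Pmf` interface, and the `binomialProbability` baseline are hypothetical stand-ins for our hand-written method, and the grid sizes are arbitrary. The commons-math3 call is left in a comment so the sketch runs without the dependency. Wall-clock timing like this is only indicative because of JIT warm-up; a harness such as JMH would probably give more trustworthy numbers.

```java
class BinomialCheck {

    /** Hypothetical common shape for the two implementations being compared. */
    @FunctionalInterface
    interface Pmf {
        double probability(int readCount, int coverage, double p);
    }

    /** Stand-in baseline: P(X = readCount) for X ~ Binomial(coverage, p), computed in log space. */
    static double binomialProbability(int readCount, int coverage, double p) {
        if (readCount < 0 || readCount > coverage) {
            return 0.0;
        }
        // Degenerate probabilities would produce log(0) below, so handle them first.
        if (p == 0.0) {
            return readCount == 0 ? 1.0 : 0.0;
        }
        if (p == 1.0) {
            return readCount == coverage ? 1.0 : 0.0;
        }
        // log C(n, r) as a sum of logs, using C(n, r) = C(n, n - r) to shorten the loop.
        int r = Math.min(readCount, coverage - readCount);
        double logCoeff = 0.0;
        for (int i = 1; i <= r; i++) {
            logCoeff += Math.log(coverage - r + i) - Math.log(i);
        }
        return Math.exp(logCoeff
                + readCount * Math.log(p)
                + (coverage - readCount) * Math.log1p(-p));
    }

    /** Largest absolute difference between two implementations over a small parameter grid. */
    static double maxAbsDifference(Pmf a, Pmf b, int maxCoverage, int poolSize) {
        double max = 0.0;
        for (int coverage = 1; coverage <= maxCoverage; coverage++) {
            for (int k = 1; k < poolSize; k++) {
                double p = (double) k / poolSize;
                for (int readCount = 0; readCount <= coverage; readCount++) {
                    double diff = Math.abs(a.probability(readCount, coverage, p)
                            - b.probability(readCount, coverage, p));
                    max = Math.max(max, diff);
                }
            }
        }
        return max;
    }

    /** Naive wall-clock timing over the same grid; indicative only. */
    static long timeNanos(Pmf pmf, int maxCoverage, int poolSize) {
        long start = System.nanoTime();
        double sink = 0.0; // accumulate results so the calls cannot be optimized away
        for (int coverage = 1; coverage <= maxCoverage; coverage++) {
            for (int k = 1; k < poolSize; k++) {
                double p = (double) k / poolSize;
                for (int readCount = 0; readCount <= coverage; readCount++) {
                    sink += pmf.probability(readCount, coverage, p);
                }
            }
        }
        long elapsed = System.nanoTime() - start;
        if (Double.isNaN(sink)) {
            throw new IllegalStateException("unexpected NaN in probabilities");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        Pmf baseline = BinomialCheck::binomialProbability;
        // With commons-math3 on the classpath, the candidate from this issue would be:
        // Pmf candidate = (r, n, p) -> new BinomialDistribution(n, p).probability(r);
        Pmf candidate = baseline; // placeholder so the sketch runs without the dependency

        System.out.println("max |diff|   = " + maxAbsDifference(baseline, candidate, 100, 50));
        System.out.println("baseline ns  = " + timeNanos(baseline, 100, 50));
        System.out.println("candidate ns = " + timeNanos(candidate, 100, 50));
    }
}
```

If the candidate turns out to be slower mainly because of object creation, one option to evaluate would be reusing a BinomialDistribution instance per (coverage, p) pair instead of creating one per call.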

magicDGS commented 7 years ago

Hey @JPinzon01, can you have a look at this? I don't know much about performance tests...