Explore cache of Watterson's denominator for high performance

When computing θ_W, the denominator is always computed from scratch. Even if the computation is approximated, running it for a large number of samples across the whole genome repeats the same computation again and again.
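For reference, the standard textbook definition of Watterson's estimator is given here. The symbols S (number of segregating sites), n (number of samples) and a_n are the usual notation, not names taken from the popgenlib code:

```latex
\theta_W = \frac{S}{a_n}, \qquad a_n = \sum_{i=1}^{n-1} \frac{1}{i}
```

The denominator a_n depends only on n, so every site analysed with the same number of samples needs the same value.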

One way to improve performance is to maintain a cache, with the following characteristics:

Cache framework to keep the caching itself efficient. My first idea is to use Guava.
Container class for the computation. We do not want a statically filled cache exposed in every class, but an object that allows user-customized caching.
Unit tests to check that the cached computation is correct by comparing it against the original implementation.
Unit test for performance. We have to figure out how to run a benchmarking test with TestNG that uses both the cached and the direct computation and asserts that the cached one is faster (and that this does not break later).
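As a starting point for discussion, the container idea could look roughly like the sketch below. It is only an illustration: the class and method names (`WattersonDenominatorCache`, `computeDirect`, `get`) are hypothetical and not part of popgenlib, and it uses a plain `ConcurrentHashMap` from the JDK as a stand-in for a Guava cache, so there is no eviction or size limit.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical container: caches a_n = sum_{i=1}^{n-1} 1/i per number of samples n. */
final class WattersonDenominatorCache {

    // JDK stand-in for a Guava cache; unbounded, so nothing is ever evicted.
    private final ConcurrentMap<Integer, Double> cache = new ConcurrentHashMap<>();

    /** Direct computation: the sum is re-evaluated on every call. */
    static double computeDirect(final int numberOfSamples) {
        if (numberOfSamples < 2) {
            throw new IllegalArgumentException("At least 2 samples are required");
        }
        double sum = 0;
        for (int i = 1; i < numberOfSamples; i++) {
            sum += 1d / i;
        }
        return sum;
    }

    /** Cached computation: the sum is evaluated once per distinct number of samples. */
    double get(final int numberOfSamples) {
        return cache.computeIfAbsent(numberOfSamples, WattersonDenominatorCache::computeDirect);
    }

    /** Number of distinct sample counts currently cached. */
    int size() {
        return cache.size();
    }
}
```

Because the cache lives in an instance rather than in static state, each caller can decide whether to share one object or create its own. A correctness test would then compare `get(n)` against `computeDirect(n)` for a range of n.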

magicDGS / popgenlib

Explore cache of Watterson's denominator for high performance #15