Silhouette calculation on large datasets

This solution computes pairwise distances on the fly rather than precomputing and storing the full distance matrix in memory. It is therefore memory-efficient, though about twice as slow, because the distance for each pair is calculated twice (once for each point of the pair). This trade-off is necessary for very large datasets, where the full distance matrix would not fit in memory.
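To illustrate the idea, here is a minimal pure-Python sketch, not the code in this change. The function name `silhouette_on_the_fly` and its arguments are hypothetical, and it assumes Euclidean distance. For each point it keeps only one running distance sum per cluster, so no n × n matrix is ever stored.

```python
import math


def silhouette_on_the_fly(points, labels):
    """Per-point silhouette coefficients without storing a distance matrix.

    Hypothetical helper for illustration; assumes Euclidean distance.
    """
    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")

    # Map arbitrary labels to 0..k-1 and count cluster sizes.
    index = {}
    for lab in labels:
        index.setdefault(lab, len(index))
    ids = [index[lab] for lab in labels]
    k = len(index)
    if k < 2:
        raise ValueError("silhouette needs at least two clusters")
    sizes = [0] * k
    for c in ids:
        sizes[c] += 1

    scores = []
    for i, p in enumerate(points):
        # Sum of distances from point i to each cluster: O(k) extra memory.
        sums = [0.0] * k
        for j, q in enumerate(points):
            if j != i:
                # The same pair is recomputed later when j is the outer point.
                sums[ids[j]] += math.dist(p, q)
        own = ids[i]
        if sizes[own] == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sums[own] / (sizes[own] - 1)  # mean distance within own cluster
        b = min(sums[c] / sizes[c] for c in range(k) if c != own)  # nearest other cluster
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores
```

A call such as `silhouette_on_the_fly([(0.0,), (1.0,), (10.0,), (11.0,)], [0, 0, 1, 1])` returns one score per input point, in input order. A practical version would likely vectorize the inner loop (for example, by processing blocks of rows with NumPy), but the memory behaviour is the same: distances are used and discarded rather than stored.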

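In the following example, calculating silhouette coefficients for ~250k data points assigned to ~250 clusters took about 20 minutes. For comparison, a full distance matrix for that many points would have about 6.25 × 10^10 entries.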

@pavia27 @nujinuji