Idea to Speed Similarity GMM Step

When using GMMs, KINC can get extremely slow with large sample sizes (e.g., thousands of samples). However, it is probably not necessary to use all of the samples to establish the "modes". I propose the following:

1. Rather than using all samples, use a randomly selected subset. Perhaps this could be as small as 30 samples? Using 30 should allow the GMMs to run quickly.
2. Perform multiple GMM iterations, each with a different randomly selected subset. This would allow different modes to be identified in different iterations.
3. Select all non-overlapping clusters across the iterations as the final set.
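
To make the steps above concrete, here is a rough Python sketch. It is not KINC code: it uses scikit-learn's `GaussianMixture` as a stand-in for KINC's own GMM implementation. The function names, the BIC-based choice of the number of components, the `min_weight` filter, and the definition of "overlapping" (a Mahalanobis-style distance between cluster means below a threshold) are all assumptions made for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def fit_subset_gmm(points, max_k, seed):
    """Fit GMMs with 1..max_k components and keep the one with the lowest BIC.

    Using BIC to pick the number of components is an assumption of this sketch.
    """
    best = None
    for k in range(1, max_k + 1):
        gmm = GaussianMixture(n_components=k, covariance_type="full",
                              random_state=seed).fit(points)
        if best is None or gmm.bic(points) < best.bic(points):
            best = gmm
    return best


def overlaps(mean_a, cov_a, mean_b, cov_b, threshold):
    """Hypothetical overlap test: distance between the two means, scaled by
    the summed covariances, falls below `threshold`."""
    diff = mean_a - mean_b
    dist = np.sqrt(diff @ np.linalg.solve(cov_a + cov_b, diff))
    return dist < threshold


def subsampled_modes(data, subset_size=30, n_iterations=10, max_k=3,
                     min_weight=0.15, threshold=3.0, seed=0):
    """Run a GMM on several small random subsets and collect the clusters
    that do not overlap with any cluster already kept."""
    rng = np.random.default_rng(seed)
    kept = []  # list of (mean, covariance) pairs
    for i in range(n_iterations):
        # Step 1: draw a small random subset of the samples.
        size = min(subset_size, len(data))
        idx = rng.choice(len(data), size=size, replace=False)
        # Step 2: fit a GMM on this subset only.
        gmm = fit_subset_gmm(data[idx], max_k, seed + i)
        # Step 3: keep only clusters that do not overlap an earlier one.
        for w, mean, cov in zip(gmm.weights_, gmm.means_, gmm.covariances_):
            if w < min_weight:
                continue  # skip components backed by very few samples
            if not any(overlaps(mean, cov, m, c, threshold) for m, c in kept):
                kept.append((mean, cov))
    return kept
```

Here `data` would be an `(n_samples, 2)` array holding the expression values of one gene pair. Each fit sees only `subset_size` points, so the cost per iteration no longer grows with the total number of samples. Whether 30 samples and 10 iterations are enough to find every mode is exactly the open question in this proposal.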

Just an idea....
