rafguns / textual-coherence

Clustering coherence by Jensen-Shannon divergence

random_cluster should use Generator.choice #1

Closed rafguns closed 3 years ago

rafguns commented 3 years ago

The NumPy documentation for numpy.random.choice notes:

Sampling random rows from a 2-D array is not possible with this function, but is possible with Generator.choice through its axis keyword.

So random_cluster should use code a bit like this (untested):

rng = np.random.default_rng()
return rng.choice(all_docs, size=cluster_size)

This may also be a good opportunity to introduce an optional random seed.
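
A minimal sketch of how that could look, combining the suggestion above with an optional seed. The function signature, the `seed` parameter name, and the toy data are assumptions for illustration, not the repository's actual code. `replace=False` is also an assumption (that a cluster should hold distinct documents); `Generator.choice` samples with replacement by default.

```python
import numpy as np


def random_cluster(all_docs, cluster_size, seed=None):
    """Return `cluster_size` randomly chosen rows of the 2-D array `all_docs`."""
    # seed=None draws fresh entropy from the OS; an integer seed makes the draw reproducible.
    rng = np.random.default_rng(seed)
    # axis=0 samples whole rows; replace=False (an assumption) keeps the rows distinct.
    return rng.choice(all_docs, size=cluster_size, replace=False, axis=0)


# Toy data (hypothetical): 10 "documents" with 2 features each.
all_docs = np.arange(20).reshape(10, 2)
cluster = random_cluster(all_docs, cluster_size=3, seed=42)
same_cluster = random_cluster(all_docs, cluster_size=3, seed=42)
```

Calling the function twice with the same seed returns the same rows, which is useful for reproducible experiments; leaving `seed` unset keeps the current unseeded behaviour.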