The text clustering step described in #19 is likely to yield some noisy clusters, which we would like to remove from the analysis before the reassignment stage. Some options to explore:
- Calculate silhouette scores for the clusters and remove those that fall below a certain threshold (which?)
- Identify the salient terms of each cluster and analyse their pairwise similarity using word2vec or a similar embedding model.
  This could lead to the removal of novel and crossover sectors.
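
A rough sketch of how both checks might look, using only the Python standard library (3.8+) so it can be tried without extra dependencies. Everything here is an assumption for illustration rather than the pipeline's actual code: the function names `cluster_silhouettes`, `term_coherence` and `noisy_clusters` are hypothetical, documents are assumed to already be embedded as numeric vectors, `vectors` is assumed to be a plain mapping from term to embedding (for example exported from a word2vec model), and the threshold is a placeholder since the right value is still an open question. In practice `sklearn.metrics.silhouette_samples` would likely replace the hand-rolled silhouette.

```python
from collections import defaultdict
from itertools import combinations
from math import dist, sqrt
from statistics import mean


def cluster_silhouettes(points, labels):
    """Mean silhouette score per cluster, using Euclidean distance.

    Assumes at least two distinct cluster labels.
    """
    members = defaultdict(list)
    for idx, label in enumerate(labels):
        members[label].append(idx)

    per_cluster = defaultdict(list)
    for i, own in enumerate(labels):
        same = [j for j in members[own] if j != i]
        if not same:
            # Common convention: a singleton cluster scores 0.
            per_cluster[own].append(0.0)
            continue
        # a: mean distance to the other points in the same cluster.
        a = mean(dist(points[i], points[j]) for j in same)
        # b: mean distance to the nearest other cluster.
        b = min(
            mean(dist(points[i], points[j]) for j in idxs)
            for label, idxs in members.items()
            if label != own
        )
        denom = max(a, b)
        per_cluster[own].append((b - a) / denom if denom else 0.0)
    return {label: mean(scores) for label, scores in per_cluster.items()}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def term_coherence(terms, vectors):
    """Mean pairwise cosine similarity of a cluster's salient terms.

    Terms missing from `vectors` are skipped; fewer than two known
    terms gives 0.0.
    """
    known = [vectors[t] for t in terms if t in vectors]
    sims = [cosine(u, v) for u, v in combinations(known, 2)]
    return mean(sims) if sims else 0.0


def noisy_clusters(scores, threshold):
    """Labels whose score falls below `threshold` (a placeholder value)."""
    return sorted(label for label, s in scores.items() if s < threshold)
```

Either score can be passed to `noisy_clusters`, for instance `noisy_clusters(cluster_silhouettes(points, labels), threshold=0.25)`, where `0.25` is arbitrary. Rather than dropping flagged clusters outright, it may be worth reviewing them by hand first, given the concern about novel and crossover sectors.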