nicholas-leonard / equanimity

Experimental research for distributed conditional computation

Multi-Hierarchical Clustering of Words #43

Closed by nicholas-leonard 10 years ago

nicholas-leonard commented 10 years ago

The experts will share a multi-hierarchical softmax layer. Multiple non-overlapping trees will be generated before training using similarity graph clustering. No further clustering will be performed during training.
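
To make the idea concrete, here is a minimal sketch of the word probability under a single two-level tree, where P(word) = P(cluster) * P(word | cluster). The tree, the logits, and the function names are hypothetical illustrations, not code from this repository. In a real model the logits would come from the network's hidden state. How the multiple trees are combined is not specified in this thread, so the sketch covers one tree only.

```python
from math import exp


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def word_probability(word, tree, cluster_logits, word_logits):
    """P(word) = P(cluster) * P(word | cluster) for one two-level tree.

    tree:           cluster id -> list of member words (hypothetical layout)
    cluster_logits: cluster id -> score (fixed numbers here, for illustration)
    word_logits:    word -> score within its cluster
    """
    cluster_ids = sorted(tree)
    p_clusters = softmax([cluster_logits[c] for c in cluster_ids])
    for c, p_c in zip(cluster_ids, p_clusters):
        members = tree[c]
        if word in members:
            # Normalize only over the words in this cluster, not the whole vocabulary.
            p_words = softmax([word_logits[w] for w in members])
            return p_c * p_words[members.index(word)]
    raise KeyError(word)


# Hypothetical toy tree and scores.
tree = {0: ["cat", "dog"], 1: ["sat", "ran", "the"]}
cluster_logits = {0: 0.5, 1: -0.2}
word_logits = {"cat": 1.0, "dog": 0.3, "sat": 0.0, "ran": -1.0, "the": 2.0}

total = sum(
    word_probability(w, tree, cluster_logits, word_logits)
    for members in tree.values()
    for w in members
)
```

Because each word belongs to exactly one cluster in the tree, the probabilities over the whole vocabulary sum to 1 while each softmax only normalizes over a small set.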

nicholas-leonard commented 10 years ago

We create a table mapping each word to its context words and their counts by unnesting the table of sentence arrays and performing a group by. We use these bags of words to generate similarity arrows, and we use the arrows to cluster the words. We then generate a new table of arrows that omits the within-cluster arrows of the previous one and cluster again, repeating until no arrows are left. Each round of clustering yields one word hierarchy.
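
The loop described above can be sketched in plain Python. This is an illustration under stated assumptions, not the code used in this project: the context window, the cosine similarity with a threshold, and the greedy size-capped clustering are all stand-ins, since the thread does not name the similarity measure or the clustering algorithm. All function and parameter names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt


def context_counts(sentences, window=2):
    """Map each word to a bag (Counter) of words within `window` positions (assumed context)."""
    bags = defaultdict(Counter)
    for sentence in sentences:
        for i, word in enumerate(sentence):
            lo, hi = max(0, i - window), min(len(sentence), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    bags[word][sentence[j]] += 1
    return bags


def cosine(a, b):
    """Cosine similarity between two bags of words (assumed similarity measure)."""
    dot = sum(count * b[ctx] for ctx, count in a.items() if ctx in b)
    norm = sqrt(sum(c * c for c in a.values())) * sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def similarity_arrows(bags, threshold=0.1):
    """Keep one weighted arrow per word pair whose similarity reaches the threshold."""
    arrows = {}
    for u, v in combinations(sorted(bags), 2):
        sim = cosine(bags[u], bags[v])
        if sim >= threshold:
            arrows[(u, v)] = sim
    return arrows


def cluster(words, arrows, max_size=3):
    """Stand-in clustering: merge along the strongest arrows, capping cluster size."""
    parent = {w: w for w in words}
    size = {w: 1 for w in words}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path halving
            w = parent[w]
        return w

    for (u, v), _ in sorted(arrows.items(), key=lambda kv: (-kv[1], kv[0])):
        ru, rv = find(u), find(v)
        if ru != rv and size[ru] + size[rv] <= max_size:
            parent[rv] = ru
            size[ru] += size[rv]
    return {w: find(w) for w in words}


def build_hierarchies(sentences, max_size=3):
    """Cluster, drop within-cluster arrows, and repeat until no arrows are left."""
    bags = context_counts(sentences)
    words = sorted(bags)
    arrows = similarity_arrows(bags)
    hierarchies = []
    while arrows:
        # With max_size >= 2 the strongest arrow always merges two words,
        # so every round removes at least one arrow and the loop terminates.
        assignment = cluster(words, arrows, max_size)
        hierarchies.append(assignment)
        arrows = {
            pair: sim
            for pair, sim in arrows.items()
            if assignment[pair[0]] != assignment[pair[1]]
        }
    return hierarchies


sentences = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["a", "cat", "ran"],
    ["a", "dog", "ran"],
]
hierarchies = build_hierarchies(sentences)
```

Each entry of `hierarchies` maps every word to a cluster representative, so the first entry plays the role of the primary hierarchy, the second of the secondary one, and so on. The size cap is what leaves cross-cluster arrows for later rounds. Without it, a connected-components clustering would consume every arrow in the first round.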

nicholas-leonard commented 10 years ago

Finished primary word hierarchy.

nicholas-leonard commented 10 years ago

Finished secondary word hierarchy.

nicholas-leonard commented 10 years ago

Started tertiary word hierarchy.