danielbis / word2sense

LSTM for creating contextualized word2vec vectors, trained to minimize the distance between synonyms on a WordNet-tagged corpus

Extend dataset to include Brown/Semcor Corpus #12

Open danielbis opened 5 years ago

danielbis commented 5 years ago

We want to be able to train the model on a larger corpus, so SemCor should be added to the dataset.

Potential issues:

Solutions:

  1. Extend the sense2id and sense2related mappings by adding the new wordnet_sense keys with the appropriate ids and related words. Note that WordNet sense tags may be identical to our converted on_sense tags, so a new key can collide with an existing one.
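
A minimal sketch of solution 1 in Python, under assumptions not confirmed by the repo: sense2id is taken to be a dict from sense key to integer id, sense2related a dict from sense key to a list of related words, and the helper name extend_sense_mappings and the sample sense keys are hypothetical. When a wordnet_sense key matches an existing converted on_sense key, the sketch reuses the existing id and merges the related words instead of creating a duplicate entry.

```python
def extend_sense_mappings(sense2id, sense2related, new_senses):
    """Add new sense keys to both mappings in place (hypothetical helper).

    Assumed shapes (not taken from the repo):
      sense2id:      {sense_key: int id}
      sense2related: {sense_key: [related words]}
      new_senses:    {wordnet_sense key: iterable of related words}
    """
    # New ids continue after the largest id already in use.
    next_id = max(sense2id.values(), default=-1) + 1
    for sense, related in new_senses.items():
        # A key that already exists (e.g. a converted on_sense tag that
        # equals the WordNet tag) keeps its current id.
        if sense not in sense2id:
            sense2id[sense] = next_id
            next_id += 1
        # Merge related words, preserving order and skipping duplicates.
        merged = sense2related.setdefault(sense, [])
        for word in related:
            if word not in merged:
                merged.append(word)
    return sense2id, sense2related


# Illustrative data only; these keys and words are made up.
sense2id = {"bank.n.01": 0, "run.v.01": 1}
sense2related = {"bank.n.01": ["depository"], "run.v.01": ["sprint"]}
semcor_senses = {
    "bank.n.01": ["depository", "bank_building"],  # collides with an existing key
    "dog.n.01": ["canine"],                        # genuinely new key
}
extend_sense_mappings(sense2id, sense2related, semcor_senses)
```

Reusing the id on a collision treats the two tags as the same sense. If identical tags can refer to different senses in the two corpora, the new keys would need a prefix or a separate namespace instead.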