
Reading: A Probabilistic Model for Learning Multi-Prototype Word Embeddings #203


a1da4 commented 2 years ago

0. Paper

```bibtex
@inproceedings{tian-etal-2014-probabilistic,
    title = "A Probabilistic Model for Learning Multi-Prototype Word Embeddings",
    author = "Tian, Fei and Dai, Hanjun and Bian, Jiang and Gao, Bin and Zhang, Rui and Chen, Enhong and Liu, Tie-Yan",
    booktitle = "Proceedings of {COLING} 2014, the 25th International Conference on Computational Linguistics: Technical Papers",
    month = aug,
    year = "2014",
    address = "Dublin, Ireland",
    publisher = "Dublin City University and Association for Computational Linguistics",
    url = "https://aclanthology.org/C14-1016",
    pages = "151--160",
}
```

1. What is it?

They propose a Skip-Gram-based probabilistic model that learns multiple prototype vectors per word.

2. What is amazing compared to previous works?

Previous methods (#200, #202) are sensitive to the choice of clustering algorithm. This paper proposes a model that has fewer parameters than prior work and does not depend on a separate clustering step.

3. Where is the key to technologies and techniques?

Their method is based on the Skip-Gram model:

[Figure: the Skip-Gram objective, from the paper]
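For reference, a sketch of the standard Skip-Gram objective this builds on (my notation, not copied from the paper):

```latex
% Skip-Gram maximizes the log-probability of context words w_{t+j}
% around each center word w_t (window size c); the softmax is over
% the whole vocabulary V
\frac{1}{T} \sum_{t=1}^{T} \sum_{\substack{-c \le j \le c \\ j \neq 0}} \log P(w_{t+j} \mid w_t),
\qquad
P(w_O \mid w_I) = \frac{\exp\!\left( {v'_{w_O}}^{\top} v_{w_I} \right)}{\sum_{w=1}^{V} \exp\!\left( {v'_w}^{\top} v_{w_I} \right)}
```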

To handle multiple prototypes, they assign N_w prototype vectors to each word w.

[Figure: the multi-prototype extension, from the paper]
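The core idea written out as a sketch (my notation): the context probability becomes a mixture over the prototypes of the center word.

```latex
% h indexes the N_w prototypes of the center word w, and P(h | w) is the
% prior probability that an occurrence of w takes prototype h
P(c \mid w) = \sum_{h=1}^{N_w} P(c \mid h, w)\, P(h \mid w)
```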

The softmax term can be computed efficiently with hierarchical softmax, which organizes the vocabulary in a binary tree.

[Figure: the hierarchical-softmax factorization, from the paper]
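With hierarchical softmax, P(c | h, w) factorizes over the root-to-leaf path of c, so each term costs O(log V) instead of O(V). A sketch in my own notation (θ_{n_j} is the vector of the j-th internal node, an assumption about how the paper parameterizes the tree):

```latex
% n_j is the j-th internal node on the path to leaf c, d_j in {-1, +1}
% the branch taken at that node, and v_{w,h} is prototype h of word w
P(c \mid h, w) = \prod_{j=1}^{L(c)-1} \sigma\!\left( d_j \, \theta_{n_j}^{\top} v_{w,h} \right),
\qquad \sigma(x) = \frac{1}{1 + e^{-x}}
```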

The model is trained with the EM algorithm.
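A minimal sketch of how one such EM step could look, assuming a scorer `score(c, h, w)` returning P(c | h, w) (e.g. via hierarchical softmax), per-word prototype priors, and a gradient callback; all names here are illustrative, not the authors' code:

```python
import numpy as np

def em_step(center, contexts, priors, score, update_vectors):
    """One EM step for a single (center word, context window) pair.

    center: center word id
    contexts: context word ids in the window
    priors: priors[w][h] = P(h | w), the prototype prior of word w
    score: score(c, h, w) -> P(c | h, w)
    update_vectors: gradient update weighted by the soft assignment
    """
    n_protos = len(priors[center])

    # E-step: posterior over the center word's prototypes given its context
    joint = np.array([
        priors[center][h] * np.prod([score(c, h, center) for c in contexts])
        for h in range(n_protos)
    ])
    posterior = joint / joint.sum()

    # M-step (soft): re-estimate the prior (running average is my assumption)
    # and take a gradient step on the embeddings, weighting each prototype
    # by its posterior responsibility
    priors[center] = 0.9 * np.asarray(priors[center]) + 0.1 * posterior
    for h in range(n_protos):
        update_vectors(center, h, contexts, weight=posterior[h])

    return posterior
```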

4. How did they evaluate it?

4.1 Parameters

From Table 1, their model has fewer parameters than the previous state-of-the-art model (EHModel, #202).

[Table 1: number of parameters for each model]

4.2 Nearest neighbors

Number of prototypes (top 7,000 words): 10. From Table 2, each prototype of their model captures a distinct sense of the word.

[Table 2: nearest neighbors of each prototype]

4.3 Word Similarity in Context (#202)

From Table 3, their models (Model_M: MaxSim, Model_W: WeightedSim) achieve results comparable to the state-of-the-art model (EHModel).

[Table 3: word-similarity-in-context results]
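My reading of the two scoring variants, as a sketch (notation mine; v_{w,h} is prototype h of word w, and the posteriors come from each word's observed context):

```latex
% Model_M: compare only the most probable prototype of each word in context
\mathrm{MaxSim}(w, w') = \cos\!\left( v_{w,\hat{h}},\, v_{w',\hat{h}'} \right),
\qquad \hat{h} = \arg\max_h P(h \mid w, \mathrm{context})

% Model_W: average over all prototype pairs, weighted by the posteriors
\mathrm{WeightedSim}(w, w') = \sum_{h=1}^{N_w} \sum_{h'=1}^{N_{w'}}
  P(h \mid w, \mathrm{context})\, P(h' \mid w', \mathrm{context}')\,
  \cos\!\left( v_{w,h},\, v_{w',h'} \right)
```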

5. Is there a discussion?

6. Which paper should we read next?

a1da4 commented 2 years ago

#207 Multi-Sense Skip-Gram