a1da4 / paper-survey

Summary of machine learning papers

Reading: Treat the Word As a Whole or Look Inside? Subword Embeddings Model Language Change and Typology #5

a1da4 opened 5 years ago

a1da4 commented 5 years ago

0. Paper

@inproceedings{xu-etal-2019-treat,
    title = "Treat the Word As a Whole or Look Inside? Subword Embeddings Model Language Change and Typology",
    author = "Xu, Yang and Zhang, Jiasheng and Reitter, David",
    booktitle = "Proceedings of the 1st International Workshop on Computational Approaches to Historical Language Change",
    month = aug,
    year = "2019",
    address = "Florence, Italy",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/W19-4717",
    doi = "10.18653/v1/W19-4717",
    pages = "136--145",
}

1. What is it?

In this paper, the authors use word embeddings that incorporate subword information to measure how much of a word's meaning is carried by its subword units. They experiment with six languages: one East Asian (Chinese) and five European (English, French, German, Italian, Spanish).

2. What is amazing compared to previous studies?

The authors proposed new models that characterize the semantic weights of subword units. Previous work already incorporates subword information into word embeddings, for example fastText-style models that build a word's vector from the vectors of its character n-grams.

However, in these models the composition treats the word and its subwords equally: no parameter expresses how much meaning the word carries as a whole.
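As a concrete illustration, here is a minimal sketch of this equal-weight composition (numpy; the function and variable names are mine, not the paper's; fastText, for instance, represents a word as the sum of the vectors of its n-grams and of the word itself):

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    # Character n-grams with boundary markers, as in fastText: "<where>" -> "<wh", "whe", ...
    w = f"<{word}>"
    return [w[i:i + n] for n in range(n_min, n_max + 1) for i in range(len(w) - n + 1)]

def compose_equal(word_vec, subword_vecs):
    # Equal-weight composition: the word vector and every subword vector
    # contribute with the same weight; nothing controls the word/subword balance.
    return word_vec + subword_vecs.sum(axis=0)

dim = 100
rng = np.random.default_rng(0)
grams = char_ngrams("where")
x = compose_equal(rng.normal(size=dim), rng.normal(size=(len(grams), dim)))
```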

3. Where is the key to technologies and techniques?

They proposed the Dynamic Subword-incorporated Embedding (DSE) model.

(figure: overview of the DSE model)

This model uses a learnable semantic weight parameter h for the word itself; accordingly, 1 - h is the weight assigned to its subwords. They used two variants of this weighting scheme, and the target vector x is calculated from the word and subword vectors according to these weights (see the sketch after the figure below).

(figure: equations for the target vector x)
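As a rough sketch of such a weighted combination, assuming a target vector of the form x = h · z_word + (1 - h) · mean(z_subwords) with a learnable scalar h per word (the class and parameter names below are mine, not the paper's):

```python
import torch
import torch.nn as nn

class SemanticWeightCompose(nn.Module):
    # Learnable semantic weight h per word: h weights the word vector and
    # (1 - h) weights the averaged subword vectors. h is stored as an
    # unconstrained logit and squashed with a sigmoid so that 0 < h < 1.
    def __init__(self, vocab_size):
        super().__init__()
        self.h_logit = nn.Parameter(torch.zeros(vocab_size))  # sigmoid(0) = 0.5

    def forward(self, word_id, word_vec, subword_vecs):
        h = torch.sigmoid(self.h_logit[word_id])
        return h * word_vec + (1 - h) * subword_vecs.mean(dim=0)

compose = SemanticWeightCompose(vocab_size=10_000)
x = compose(42, torch.randn(100), torch.randn(7, 100))  # target vector for one word
```

Trained inside a skip-gram-style objective, h would be learned jointly with the vectors, so after training it can be read off per word as that word's semantic weight.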

4. How did they validate it?

They examined how the learned semantic weight parameter h varies with a word's year of first appearance in the corpus.
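A hypothetical sketch of this kind of analysis (the words and values below are toy examples, not numbers from the paper): bin words by their decade of first appearance and average the learned h in each bin.

```python
from collections import defaultdict

# word -> (first-appearance year, learned semantic weight h); toy values only
words = {"火車": (1840, 0.38), "電話": (1880, 0.42), "手機": (1990, 0.71)}

h_by_decade = defaultdict(list)
for first_year, h in words.values():
    h_by_decade[first_year // 10 * 10].append(h)

for decade in sorted(h_by_decade):
    hs = h_by_decade[decade]
    print(decade, sum(hs) / len(hs))  # mean h for words first seen in this decade
```

A low mean h in the early decades would mean the subwords (characters) carry most of the weight for old words, which is the pattern reported for Chinese below.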

5. Is there a discussion?

In Chinese, characters (the subword units) carry more semantic weight in older words than in newer words.

(figure: semantic weight results for Chinese)

6. Which paper should I read next?

A problem with this approach is that the learned word vectors are subject to random noise whose magnitude depends on corpus size. The paper below addresses this with a probabilistic variant of the word2vec model: Dynamic word embeddings.

a1da4 commented 5 years ago

#11 Dynamic word embeddings