a1da4 / paper-survey

Summary of machine learning papers

Reading: Diachronic Sense Modeling with Deep Contextualized Word Embeddings: An Ecological View #89

Open a1da4 opened 4 years ago

a1da4 commented 4 years ago

0. Paper

```bibtex
@inproceedings{hu-etal-2019-diachronic,
    title = "Diachronic Sense Modeling with Deep Contextualized Word Embeddings: An Ecological View",
    author = "Hu, Renfen and Li, Shen and Liang, Shichen",
    booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2019",
    address = "Florence, Italy",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/P19-1379",
    doi = "10.18653/v1/P19-1379",
    pages = "3899--3908",
    abstract = "Diachronic word embeddings have been widely used in detecting temporal changes. However, existing methods face the meaning conflation deficiency by representing a word as a single vector at each time period. To address this issue, this paper proposes a sense representation and tracking framework based on deep contextualized embeddings, aiming at answering not only what and when, but also how the word meaning changes. The experiments show that our framework is effective in representing fine-grained word senses, and it brings a significant improvement in word change detection task. Furthermore, we model the word change from an ecological viewpoint, and sketch two interesting sense behaviors in the process of language evolution, i.e. sense competition and sense cooperation.",
}
```

1. What is it?

They apply contextualized word embeddings (e.g., BERT) to historical semantic change detection.

2. What is amazing compared to previous works?

They use contextualized word embeddings (BERT) to obtain multiple sense representations for a single word, instead of one vector per word. As a result, their model can describe not only whether but also how a word's meaning changes.


3. Where is the key to technologies and techniques?

3.1 Obtain sense representations for each word

They use the Oxford dictionary to obtain each word's senses and their example sentences.
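A minimal sketch of this step, assuming each dictionary sense s_j is represented by averaging the contextualized vectors of the target word across that sense's example sentences (the aggregation shown here is an assumption; the toy vectors stand in for real BERT outputs):

```python
import numpy as np

def sense_embedding(token_vectors):
    """Build one sense vector e(s_j) by averaging the target word's
    contextualized vectors over the sense's example sentences.
    (Hypothetical aggregation; real inputs would come from BERT.)"""
    return np.mean(np.stack(token_vectors), axis=0)

# Toy stand-ins for BERT vectors of the target word
# in two example sentences of the same dictionary sense.
v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([0.0, 1.0, 0.0])
e_s1 = sense_embedding([v1, v2])  # -> array([0.5, 0.5, 0.0])
```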

3.2 Sense identification in test data (COHA)

In the test data (COHA), they compute a contextualized representation for each occurrence of the word. To identify the sense of an occurrence, they compute its cosine similarity with the sense representations e(s1), ..., e(sj), ..., e(sJ), and assign the occurrence to the nearest sense.
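The assignment step above can be sketched as nearest-sense lookup under cosine similarity (toy vectors again stand in for real contextualized embeddings):

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def identify_sense(occurrence_vec, sense_vecs):
    """Return the index j of the sense vector e(s_j) most
    cosine-similar to this occurrence's contextualized vector."""
    sims = [cosine(occurrence_vec, e) for e in sense_vecs]
    return int(np.argmax(sims))

senses = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]  # e(s1), e(s2)
occurrence = np.array([0.9, 0.1])
identify_sense(occurrence, senses)  # -> 0 (closest to e(s1))
```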

4. How did they evaluate it?

4.1 Word sense identification

They obtain held-out sentences from the Oxford dictionary, and the model predicts the sense of the target word in each of them.


Their model (BERT) missed some words; these errors can be divided into two types.

4.2 Word meaning change

They use the test set from Gulordava and Baroni (2011). Each word is annotated on a 4-point scale (0: no change, 3: significant change). They computed the correlation between the novelty score and the human score. The novelty score is defined from the proportions of usages of sense sj in the focus corpus and the reference corpus.
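One plausible reading of the novelty score is the change in a sense's usage proportion from the reference corpus to the focus corpus; the sketch below assumes that reading (the exact formula is in the paper), with sense labels produced by the identification step:

```python
from collections import Counter

def novelty_score(ref_senses, focus_senses, sense):
    """Change in the proportion of occurrences labeled with `sense`
    from the reference corpus to the focus corpus.
    (Assumed formulation; see the paper for the exact definition.)"""
    p_ref = Counter(ref_senses)[sense] / len(ref_senses)
    p_focus = Counter(focus_senses)[sense] / len(focus_senses)
    return p_focus - p_ref

ref = ["s1"] * 9 + ["s2"]        # sense s2 is rare historically
focus = ["s1"] * 5 + ["s2"] * 5  # sense s2 becomes frequent
novelty_score(ref, focus, "s2")  # -> 0.4
```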


Their method achieved the highest score among previous works.


5. Is there a discussion?

6. Which paper should we read next?

a1da4 commented 4 years ago

There are more works using BERT for semantic change detection:

#88

#69