
Reading: The Global Anchor Method for Quantifying Linguistic Shifts and Domain Adaptation #196

a1da4 opened this issue 3 years ago

a1da4 commented 3 years ago

0. Paper

authors: Zi Yin, Vin Sachidananda, Balaji Prabhakar
paper: arXiv

1. What is it?

They propose the global anchor method, a generalization of the local anchor method, for quantifying linguistic shifts and domain adaptation.

2. What is amazing compared to previous works?

They show that the global anchor method is equivalent to the alignment method (the two losses agree up to a small constant factor). Moreover, their method is easy to apply and can capture changes across both time periods and domains.
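For concreteness, the alignment method mentioned here solves an orthogonal Procrustes problem. A minimal numpy sketch (my own illustration, not the paper's code), assuming `E` and `F` are n×d embedding matrices over a shared vocabulary:

```python
import numpy as np

def alignment_distance(E, F):
    """Alignment loss min_Q ||EQ - F||_F over orthogonal Q,
    solved in closed form via the SVD of E^T F (orthogonal Procrustes)."""
    U, _, Vt = np.linalg.svd(E.T @ F)
    Q = U @ Vt  # optimal rotation aligning E to F
    return np.linalg.norm(E @ Q - F, ord="fro")
```

The global anchor counterpart of this loss is sketched in section 3.1 below.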

3. Where is the key to technologies and techniques?

3.0 Local anchor method (previous work)

Calculate the inner products of a target word $i$ with $l$ anchor words $a_1, \dots, a_l$ in the two word vector spaces $E$ and $F$:

$$\big(\langle E_i, E_{a_1}\rangle, \dots, \langle E_i, E_{a_l}\rangle\big) \quad \text{and} \quad \big(\langle F_i, F_{a_1}\rangle, \dots, \langle F_i, F_{a_l}\rangle\big)$$

Then the norm of the difference of those two vectors is computed:

$$d_i = \left\| \big(\langle E_i, E_{a_j}\rangle - \langle F_i, F_{a_j}\rangle\big)_{j=1}^{l} \right\|$$
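A minimal numpy sketch of this step (my own, assuming `E` and `F` are n×d matrices whose rows share a vocabulary, and `anchors` holds the $l$ anchor indices):

```python
import numpy as np

def local_anchor_distance(E, F, i, anchors):
    """Distance for word i: norm of the difference between its
    inner-product profiles against the anchor words in E and F."""
    profile_E = E[i] @ E[anchors].T  # (<E_i, E_a1>, ..., <E_i, E_al>)
    profile_F = F[i] @ F[anchors].T
    return np.linalg.norm(profile_E - profile_F)
```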

3.1 Global anchor method

They use all words in the vocabulary as anchors, so the distance between two embedding spaces $E$ and $F$ becomes

$$\| E E^\top - F F^\top \|_F$$
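Under the same assumptions as the sketch above, this corpus-level distance reduces to a single Frobenius norm:

```python
import numpy as np

def global_anchor_distance(E, F):
    """Corpus-level distance ||E E^T - F F^T||_F: the local anchor
    distance with every vocabulary word used as an anchor."""
    return np.linalg.norm(E @ E.T - F @ F.T, ord="fro")
```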

4. How did they evaluate it?

4.1 trajectories of anchor difference

Figure 1 shows that on Google Books Ngram, the anchor distance between the embeddings of years $i$ and $j$ grows with the gap $|i - j|$, as expected. (The same trend holds on other corpora: Reddit, arXiv, and COHA.)

[Figure 1: anchor distance vs. time gap $|i - j|$]

Moreover, Figure 2 shows that there is a clear upward trend after 1944.

[Figure 2: anchor distance trend over the years]

4.2 trajectory and domain similarity of corpora

From the pairwise global anchor distances between corpora (computed from each corpus's anchor matrix, e.g. $E E^\top$ or $F F^\top$), they build a distance matrix and apply a Laplacian embedding, which turns that matrix into a low-dimensional vector for each corpus.
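One way to realize this is standard Laplacian eigenmaps. A minimal sketch (my own reconstruction; the Gaussian affinity and `sigma` are assumptions, not necessarily the paper's exact recipe), where `D` is the symmetric matrix of pairwise global anchor distances between corpora:

```python
import numpy as np

def laplacian_embedding(D, k=2, sigma=1.0):
    """Embed each corpus into k dimensions via Laplacian eigenmaps.
    D: symmetric (m x m) matrix of pairwise anchor distances."""
    W = np.exp(-D ** 2 / (2 * sigma ** 2))  # distances -> affinities
    L = np.diag(W.sum(axis=1)) - W          # unnormalized graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)    # eigenvalues in ascending order
    return eigvecs[:, 1:k + 1]              # drop the trivial constant eigenvector
```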

Figures 3(a) and 3(b) show that the Laplacian embedding captures both the temporal trajectory and the domain similarity of the corpora.

[Figures 3(a) and 3(b): Laplacian embeddings of the corpora]

5. Is there a discussion?

6. Which paper should I read next?

a1da4 commented 2 years ago

#213: using similarity matrices for the semantic drift of emoji