
Reading: The Global Anchor Method for Quantifying Linguistic Shifts and Domain Adaptation #196

a1da4 opened this issue 3 years ago

a1da4 commented 3 years ago

0. Paper

authors: Zi Yin, Vin Sachidananda, Balaji Prabhakar
paper: arXiv

1. What is it?

They propose the global anchor method, a generalization of the local anchor method, for quantifying linguistic shifts and domain adaptation.

2. What is amazing compared to previous works?

They show that the global anchor method is equivalent to the alignment method (the two losses agree up to a small constant factor). Moreover, their method is easy to apply and can capture changes across both time periods and domains.
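For concreteness, the alignment method mentioned here solves an orthogonal Procrustes problem. A minimal numpy sketch (my own illustration, not the paper's code), assuming `E` and `F` are n×d embedding matrices over a shared vocabulary:

```python
import numpy as np

def alignment_distance(E, F):
    """Alignment loss min_Q ||EQ - F||_F over orthogonal Q,
    solved in closed form via the SVD of E^T F (orthogonal Procrustes)."""
    U, _, Vt = np.linalg.svd(E.T @ F)
    Q = U @ Vt  # optimal rotation aligning E to F
    return np.linalg.norm(E @ Q - F, ord="fro")
```

The global anchor counterpart of this loss is sketched in section 3.1 below.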

3. Where is the key to technologies and techniques?

3.0 Local anchor method (previous work)

Calculate the inner products of a target word $i$ with $l$ anchor words $a_1, \dots, a_l$ in the two word vector spaces $E$ and $F$:

$$\big(\langle E_i, E_{a_1}\rangle, \dots, \langle E_i, E_{a_l}\rangle\big) \quad \text{and} \quad \big(\langle F_i, F_{a_1}\rangle, \dots, \langle F_i, F_{a_l}\rangle\big)$$

Then the norm of the difference of those two vectors is computed:

$$d_i = \left\| \big(\langle E_i, E_{a_j}\rangle - \langle F_i, F_{a_j}\rangle\big)_{j=1}^{l} \right\|$$
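A minimal numpy sketch of this step (my own, assuming `E` and `F` are n×d matrices whose rows share a vocabulary, and `anchors` holds the $l$ anchor indices):

```python
import numpy as np

def local_anchor_distance(E, F, i, anchors):
    """Distance for word i: norm of the difference between its
    inner-product profiles against the anchor words in E and F."""
    profile_E = E[i] @ E[anchors].T  # (<E_i, E_a1>, ..., <E_i, E_al>)
    profile_F = F[i] @ F[anchors].T
    return np.linalg.norm(profile_E - profile_F)
```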

3.1 Global anchor method

They use all words in the vocabulary as anchors, so the distance between two embedding spaces $E$ and $F$ becomes

$$\| E E^\top - F F^\top \|_F$$
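Under the same assumptions as the sketch above, this corpus-level distance reduces to a single Frobenius norm:

```python
import numpy as np

def global_anchor_distance(E, F):
    """Corpus-level distance ||E E^T - F F^T||_F: the local anchor
    distance with every vocabulary word used as an anchor."""
    return np.linalg.norm(E @ E.T - F @ F.T, ord="fro")
```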

4. How did they evaluate it?

4.1 trajectories of anchor difference

Figure 1 shows that on Google Books Ngram, the anchor distance between the embeddings of years $i$ and $j$ grows with the gap $|i - j|$, as expected. (The same trend holds on other corpora: Reddit, arXiv, and COHA.)

[Figure 1: anchor distance vs. time gap $|i - j|$]

Moreover, Figure 2 shows that there is a clear upward trend after 1944.

[Figure 2: anchor distance trend over the years]

4.2 trajectory and domain similarity of corpora

From the pairwise global anchor distances between corpora (computed from each corpus's anchor matrix, e.g. $E E^\top$ or $F F^\top$), they build a distance matrix and apply a Laplacian embedding, which turns that matrix into a low-dimensional vector for each corpus.
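One way to realize this is standard Laplacian eigenmaps. A minimal sketch (my own reconstruction; the Gaussian affinity and `sigma` are assumptions, not necessarily the paper's exact recipe), where `D` is the symmetric matrix of pairwise global anchor distances between corpora:

```python
import numpy as np

def laplacian_embedding(D, k=2, sigma=1.0):
    """Embed each corpus into k dimensions via Laplacian eigenmaps.
    D: symmetric (m x m) matrix of pairwise anchor distances."""
    W = np.exp(-D ** 2 / (2 * sigma ** 2))  # distances -> affinities
    L = np.diag(W.sum(axis=1)) - W          # unnormalized graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)    # eigenvalues in ascending order
    return eigvecs[:, 1:k + 1]              # drop the trivial constant eigenvector
```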

Figures 3(a) and 3(b) show that the Laplacian embedding captures both the temporal trajectory and the domain similarity of the corpora.

[Figures 3(a) and 3(b): Laplacian embeddings of the corpora]

5. Is there a discussion?

6. Which paper should I read next?

a1da4 commented 2 years ago

#213: using similarity matrices for the semantic drift of emoji