a1da4 / paper-survey


Reading: An Isotropy Analysis in the Multilingual BERT Embedding Space #248

Open a1da4 opened 1 year ago


0. Paper

1. What is it?

They analyze isotropy in multilingual language models.

2. What is amazing compared to previous works?

They reveal that:

- multilingual language models are also anisotropic: the isotropy scores of all languages are far from 1;
- unlike monolingual models, high-frequency words are distributed near the center of the space and low-frequency words toward the outside;
- enhancing isotropy improves performance in multilingual language models as well.

3. Where is the key to technologies and techniques?

4. How did they evaluate it?

4.1 Multilingual language models are also anisotropic

[Screenshot: Table 1, isotropy scores per language]

Table 1 shows the isotropy scores (a score close to 1 means isotropic); the scores for all languages are far from 1, i.e., the embedding spaces are anisotropic.
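The exact score used in Table 1 is not reproduced here; a common choice is the principal-component-based isotropy measure of Mu and Viswanath (2018), I(W) = min_c Z(c) / max_c Z(c) with Z(c) = Σ_w exp(cᵀw) over eigenvectors c of WᵀW. A minimal NumPy sketch, assuming that measure:

```python
import numpy as np

def isotropy_score(embeddings: np.ndarray) -> float:
    """Principal-component-based isotropy measure (Mu & Viswanath, 2018).

    I(W) = min_c Z(c) / max_c Z(c), where Z(c) = sum_w exp(c^T w) and c
    ranges over the eigenvectors of W^T W. A score close to 1 indicates
    an isotropic space; a score close to 0 indicates anisotropy.
    """
    # Eigenvectors of W^T W (symmetric, so eigh is appropriate)
    _, eigvecs = np.linalg.eigh(embeddings.T @ embeddings)
    # Partition function Z(c) for each eigenvector c
    z = np.exp(embeddings @ eigvecs).sum(axis=0)
    return float(z.min() / z.max())

rng = np.random.default_rng(0)
iso = rng.normal(size=(1000, 8))   # roughly isotropic point cloud
aniso = iso + 5.0                  # shared offset -> anisotropic
print(isotropy_score(iso) > isotropy_score(aniso))  # True
```

The score is scale-free in the sense that it only compares the partition function across principal directions, which is why a common shift (a dominant mean direction) drives it toward 0.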

[Screenshot: embedding distribution colored by word frequency]

Labeling each word with its frequency obtained from wordfreq (https://pypi.org/project/wordfreq/), they found that high-frequency words (yellow) are distributed near the center and low-frequency words (black) toward the outside. This is the opposite of monolingual language models (high freq.: outside, low freq.: center).
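The frequency labels come from the wordfreq package, whose `zipf_frequency(word, lang)` returns a human-friendly log-scale frequency (roughly 0 to 8). A minimal sketch of the bucketing step, using a toy frequency table in place of the real wordfreq call (the values and the 4.0 threshold are illustrative assumptions, not from the paper):

```python
# Toy Zipf-scale frequencies standing in for wordfreq.zipf_frequency(word, lang);
# real code would call the wordfreq package instead of this dict.
zipf = {"the": 7.7, "language": 5.1, "isotropy": 1.9, "anisotropic": 1.6}

def frequency_label(word: str, threshold: float = 4.0) -> str:
    """Bucket a word as high- or low-frequency for coloring the plot."""
    return "high" if zipf.get(word, 0.0) >= threshold else "low"

print([frequency_label(w) for w in zipf])  # ['high', 'high', 'low', 'low']
```

Unseen words default to 0.0 and therefore to the "low" bucket, mirroring how wordfreq returns 0 for out-of-vocabulary strings.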

4.2 An isotropic vector space also improves performance in multilingual language models

[Screenshot: evaluation results]

5. Is there a discussion?

6. Which paper should we read next?