0. Paper
1. What is it?
They analyze isotropy in multilingual language models.
2. What is amazing compared to previous works?
They reveal that multilingual language models are also anisotropic (with a word-frequency geometry opposite to monolingual models), and that a more isotropic vector space improves performance in the multilingual setting as well.
3. Where is the key to technologies and techniques?
Principal-component-based (#51): measures how strongly a given set of vectors depends on each dimension (closer to 1 = more isotropic)
Cosine-based (https://aclanthology.org/D19-1006/): average cosine similarity between word vectors (lower = more isotropic)
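Both measures can be sketched in a few lines. This is a minimal sketch with hypothetical function names: the principal-component score follows the Mu & Viswanath-style partition-function ratio over principal directions, and the cosine score is the plain average pairwise cosine similarity; the exact formulations in the paper may differ in detail.

```python
import numpy as np

def cosine_isotropy(W):
    """Cosine-based measure: average cosine similarity over all
    off-diagonal word pairs. Near 0 -> isotropic, near 1 -> anisotropic."""
    Wn = W / np.linalg.norm(W, axis=1, keepdims=True)
    sims = Wn @ Wn.T
    n = len(W)
    return (sims.sum() - n) / (n * (n - 1))  # drop the diagonal (self-similarity)

def pc_isotropy(W):
    """Principal-component-based measure (Mu & Viswanath style):
    I(W) = min_c Z(c) / max_c Z(c), where c ranges over the principal
    directions of W and Z(c) = sum_w exp(c . w).
    Close to 1 -> isotropic, close to 0 -> anisotropic."""
    _, _, Vt = np.linalg.svd(W, full_matrices=False)
    Z = np.exp(W @ Vt.T).sum(axis=0)  # partition function per direction
    return Z.min() / Z.max()
```

On a spherical Gaussian cloud the PC score sits near 1 and the cosine score near 0; shifting every vector along one common direction drives the PC score toward 0 and the cosine score up, which is the anisotropic regime the paper describes.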
4. How did they evaluate it?
4.1 Multilingual language models are also anisotropic
Table 1 shows the isotropy scores (closer to 1 = more isotropic); the scores of all languages are far from 1, i.e., anisotropic.
Labeling words with frequencies obtained from wordfreq (https://pypi.org/project/wordfreq/), they found that high-frequency words (yellow) are distributed near the center and low-frequency words (black) are distributed on the outside. → This is the opposite of monolingual language models (high freq.: outside, low freq.: center).
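The frequency-vs-centrality observation can be checked with a simple distance-to-centroid comparison. This is a hypothetical helper, not the paper's analysis code; in practice the Zipf scores would come from wordfreq's real `zipf_frequency(word, lang)` API, but synthetic scores are used here to keep the sketch self-contained.

```python
import numpy as np

def centrality_by_frequency(vectors, zipf_scores, thresh=4.0):
    """Compare mean distance to the embedding centroid for high- vs
    low-frequency words (Zipf scale: higher = more frequent).
    In the paper's multilingual setting, high-frequency words sit
    closer to the center; monolingual models show the flipped pattern."""
    center = vectors.mean(axis=0)
    dist = np.linalg.norm(vectors - center, axis=1)
    high = dist[zipf_scores >= thresh].mean()
    low = dist[zipf_scores < thresh].mean()
    return high, low
```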
4.2 An isotropic vector space also improves performance in multilingual language models
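The notes do not name the isotropy-enhancing transform, so as an assumption here is a sketch of the common "All-but-the-Top"-style post-processing (Mu & Viswanath): center the embeddings and subtract their projections onto the top principal components, which carry most of the anisotropy.

```python
import numpy as np

def remove_top_components(W, k=2):
    """Assumed post-processing (All-but-the-Top style, not necessarily
    the paper's exact method): center the embeddings, then remove the
    top-k principal directions, which dominate the anisotropic cloud."""
    Wc = W - W.mean(axis=0)
    _, _, Vt = np.linalg.svd(Wc, full_matrices=False)
    top = Vt[:k]                       # top-k principal directions
    return Wc - (Wc @ top.T) @ top     # subtract projections onto them
```

After this transform the average pairwise cosine similarity of an anisotropic cloud drops toward 0, i.e., the space becomes more isotropic, which is the property section 4.2 links to better downstream performance.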
5. Is there a discussion?
6. Which paper should we read next?