a1da4 / paper-survey

Summary of machine learning papers

Reading: Visualizing and Measuring the Geometry of BERT #244

Open a1da4 opened 1 year ago

a1da4 commented 1 year ago

0. Paper

paper: arxiv

1. What is it?

They analyze the geometry of contextualized word representations from BERT, at both the syntactic and semantic level.

2. What is amazing compared to previous works?

3. Where is the key to technologies and techniques?

To analyze grammatical information, they use the model-wide attention vector: for a word pair, the attention entries from every head in every layer are concatenated into a single vector. [Screenshot 2023-02-07 10:53:17]
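A minimal numpy sketch of this concatenation, using random arrays in place of real BERT attention maps (the exact construction, e.g. whether both directions i→j and j→i are included, should be checked against the paper):

```python
import numpy as np

# Toy stand-in for BERT attention: shape (layers, heads, seq, seq).
n_layers, n_heads, seq_len = 12, 12, 8  # BERT-base layer/head counts, toy sequence
rng = np.random.default_rng(0)
attn = rng.random((n_layers, n_heads, seq_len, seq_len))

def model_wide_attention_vector(attn, i, j):
    """Concatenate the attention from token i to token j across all layers and heads."""
    return attn[:, :, i, j].reshape(-1)  # shape: (n_layers * n_heads,)

vec = model_wide_attention_vector(attn, 2, 5)
print(vec.shape)  # (144,) for BERT-base
```

These vectors are then used as features for classifying the dependency relation between the two words.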

4. How did they evaluate it?

4.1 Grammatical information

Previous work found that the squared L2 distance between linearly transformed contextualized BERT embeddings approximates distances in the dependency parse tree. [Screenshot 2023-02-06 23:29:30]
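A hedged sketch of that structural-probe idea: apply a linear map B to the contextual embeddings and compute pairwise squared L2 distances, which are compared against tree distances. Here B and the embeddings are random placeholders; in practice B is trained.

```python
import numpy as np

rng = np.random.default_rng(1)
hidden, probe_dim, seq_len = 768, 64, 6
H = rng.standard_normal((seq_len, hidden))    # stand-in contextual embeddings
B = rng.standard_normal((hidden, probe_dim))  # stand-in linear probe (learned in practice)

T = H @ B                                     # transformed embeddings
diff = T[:, None, :] - T[None, :, :]          # pairwise differences via broadcasting
sq_dist = (diff ** 2).sum(-1)                 # predicted (squared) tree distances

print(sq_dist.shape)  # (6, 6), symmetric with zero diagonal
```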

Figure 3 shows that the model-wide attention vectors between two words, averaged per dependency label, can detect the dependency relation:

[Screenshot 2023-02-07 10:59:39]

4.2 Internal representations

Figure 4 shows that BERT embeddings capture semantic (word-sense) information: embeddings of an ambiguous word form clusters corresponding to its senses. [Screenshot 2023-02-06 23:29:52]
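A toy sketch of the word-sense observation: if embeddings of an ambiguous word cluster by sense, a simple nearest-centroid classifier over those clusters can disambiguate new occurrences. The vectors below are synthetic, not real BERT output.

```python
import numpy as np

rng = np.random.default_rng(2)
sense_a = rng.standard_normal((10, 32)) + 5.0  # synthetic embeddings of sense A (shifted cluster)
sense_b = rng.standard_normal((10, 32)) - 5.0  # synthetic embeddings of sense B

centroids = np.stack([sense_a.mean(0), sense_b.mean(0)])

def predict_sense(vec, centroids):
    """Assign a new embedding to the nearest sense centroid (L2 distance)."""
    return int(np.argmin(np.linalg.norm(centroids - vec, axis=1)))

query = rng.standard_normal(32) + 5.0  # a new occurrence lying near the sense-A cluster
print(predict_sense(query, centroids))  # 0 (sense A)
```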

5. Is there a discussion?

6. Which paper should we read next?

a1da4 commented 1 year ago

#245 multilingual