
Reading: Time-Aware Language Models as Temporal Knowledge Bases #239

a1da4 opened this issue 2 years ago

a1da4 commented 2 years ago

0. Paper

@article{dhingra-etal-2022-time,
    title = "Time-Aware Language Models as Temporal Knowledge Bases",
    author = "Dhingra, Bhuwan and Cole, Jeremy R. and Eisenschlos, Julian Martin and Gillick, Daniel and Eisenstein, Jacob and Cohen, William W.",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "10",
    year = "2022",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/2022.tacl-1.15",
    doi = "10.1162/tacl_a_00459",
    pages = "257--273",
    abstract = "Many facts come with an expiration date, from the name of the President to the basketball team Lebron James plays for. However, most language models (LMs) are trained on snapshots of data collected at a specific moment in time. This can limit their utility, especially in the closed-book setting where the pretraining corpus must contain the facts the model should memorize. We introduce a diagnostic dataset aimed at probing LMs for factual knowledge that changes over time and highlight problems with LMs at either end of the spectrum{---}those trained on specific slices of temporal data, as well as those trained on a wide range of temporal data. To mitigate these problems, we propose a simple technique for jointly modeling text with its timestamp. This improves memorization of seen facts from the training time period, as well as calibration on predictions about unseen facts from future time periods. We also show that models trained with temporal context can be efficiently {``}refreshed{''} as new data arrives, without the need for retraining from scratch.",
}

1. What is it?

They analyze how the factual knowledge of pre-trained language models degrades as time passes, and propose jointly modeling text with its timestamp to mitigate this.

2. What is amazing compared to previous works?

3. Where is the key to technologies and techniques?

TempLAMA

[Screenshot from the paper: TempLAMA examples]

They collect time-sensitive cloze queries from Wikidata snapshots; the correct answer to each query depends on the year it is asked about.
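For a concrete picture of the data, here is a hypothetical TempLAMA-style record in Python; the field names and layout are my own illustration (not the released schema), but the LeBron James example itself comes from the paper's abstract.

```python
# Hypothetical TempLAMA-style record (field names are illustrative, not the
# dataset's actual schema): the same cloze query has different gold answers
# depending on the year it is asked about.
templama_example = {
    "query": "LeBron James plays for _X_.",
    "answers_by_year": {
        2009: "Cleveland Cavaliers",
        2012: "Miami Heat",
        2019: "Los Angeles Lakers",
    },
}
```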

Additional training strategy

[Screenshot from the paper: additional pretraining strategies]

All additional pretraining strategies (Uniform, Yearly, Temporal) start from a publicly available T5 checkpoint; the Temporal strategy prepends each document's timestamp to its input text, as sketched below.
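A minimal sketch of the time-conditioning idea in Python: prepend the document's timestamp to its text before feeding it to T5. The exact "year: ... text: ..." prefix string is an assumption for illustration; the paper's point is simply that the timestamp is modeled jointly with the text.

```python
# Minimal sketch of time-conditioned inputs for continued T5 pretraining.
# The "year: ... text: ..." prefix format is an assumption; the key idea is
# to prepend the document's timestamp so the LM can condition on time.
def make_temporal_input(text: str, year: int) -> str:
    return f"year: {year} text: {text}"

# T5-style masked span: the sentinel <extra_id_0> marks the span to predict.
print(make_temporal_input("LeBron James plays for the <extra_id_0>.", 2012))
# -> year: 2012 text: LeBron James plays for the <extra_id_0>.
```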

4. How did they evaluate it?

[Screenshot from the paper: Table 2]

Table 2 shows that performance improves in the order Uniform < Yearly < Temporal, suggesting the gain comes from cleanly separating time information.

[Screenshot from the paper: Figure 4]

Figure 4 (where the input time information is shifted into the future) shows that the Uniform and Temporal models behave sensibly on queries about future years.

[Screenshot from the paper: Figure 5]

Figure 5 shows that retraining on only the new (future) data (alpha = 1) causes the model to forget past information. However, when retraining on an equal mixture of old and new data (alpha = 0.5), the model adapts to the new data without forgetting.
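A rough sketch of the refresh setup, assuming alpha is the probability of drawing each training example from the new (future) slice rather than the old one; the sampling details here are my assumption, not the paper's exact recipe.

```python
import random

def mixed_batch(old_data, new_data, alpha=0.5, batch_size=32, rng=random):
    """Draw a refresh batch: each example comes from the new slice with
    probability alpha, otherwise from the old slice. alpha = 1 reproduces
    the "new data only" setting that Figure 5 shows leads to forgetting."""
    return [
        rng.choice(new_data) if rng.random() < alpha else rng.choice(old_data)
        for _ in range(batch_size)
    ]
```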

5. Is there a discussion?

6. Which paper should we read next?

a1da4 commented 2 years ago

#232 #219 Related Work