0. Paper
Temporal Attention for Language Models (Guy D. Rosin and Kira Radinsky, Findings of NAACL 2022)
My literature review (in Japanese) is here.
1. What is it?
They proposed Temporal Attention, a time-aware extension of self-attention for analyzing how word meanings change over time.
2. What is amazing compared to previous works?
Their attention mechanism achieves state-of-the-art performance on SemEval-2020 Task 1 (unsupervised lexical semantic change detection).
3. Where is the key to technologies and techniques?
Theoretically, each token in an input sequence could have its own time point. From this idea, they proposed Temporal Attention, which generates time-specific word vectors using time vectors $X_t$ and their weights $W_t$. The time matrix $T$ used in the attention computation is

$$T = X_t W_t$$
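Below is a minimal PyTorch sketch of how such a time matrix could be folded into scaled dot-product attention, assuming discrete time points mapped to learned embeddings (the $X_t$ above) and scores of the form $(QT^\top)(TK^\top)$. The class name, dimensions, `time_ids` argument, and scaling are my illustrative assumptions, not the paper's reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TemporalAttention(nn.Module):
    """Time-aware self-attention sketch: alongside Q, K, V, a time
    matrix T = X_t W_t modulates the attention scores (illustrative)."""

    def __init__(self, d_model: int, num_times: int):
        super().__init__()
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        # Time vectors X_t: one learned embedding per discrete time point.
        self.time_emb = nn.Embedding(num_times, d_model)
        # Weights W_t that turn time embeddings into the time matrix T.
        self.w_t = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x: torch.Tensor, time_ids: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); time_ids: (batch, seq_len) ints
        q, k, v = self.w_q(x), self.w_k(x), self.w_v(x)
        t = self.w_t(self.time_emb(time_ids))  # T = X_t W_t
        d_k = q.size(-1)
        # Assumed form of the time-modulated scores: (Q T^T)(T K^T),
        # with one sqrt(d_k) scaling factor per dot product.
        scores = (q @ t.transpose(-2, -1)) @ (t @ k.transpose(-2, -1)) / d_k
        return F.softmax(scores, dim=-1) @ v


# Illustrative usage: 2 sequences of 5 tokens, 10 possible time points.
layer = TemporalAttention(d_model=64, num_times=10)
x = torch.randn(2, 5, 64)
time_ids = torch.randint(0, 10, (2, 5))
out = layer(x, time_ids)  # -> shape (2, 5, 64)
```

The point of the sketch is that each token carries its own time ID, so the same sentence written at different times yields different attention patterns and hence time-specific word vectors.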
4. How did they evaluate it?
In their results table, Temporal Attention outperforms strong baselines such as SGNS + alignment and fine-tuned BERT on SemEval-2020 Task 1.
5. Is there a discussion?
Based on these results, they hypothesized that understanding time does not require extremely large models.
6. Which paper should I read next?