0. Paper
1. What is it?
They propose a metric to evaluate the vocabulary/structural/semantic drift between train and test data.
2. What is amazing compared to previous works?
They separate data drift into vocabulary, structure, and semantics. Their metric can evaluate drift at the level of individual examples.
3. Where is the key to technologies and techniques?
vocabulary drift: log-perplexity of a unigram language model![スクリーンショット 2023-06-02 15 10 06](https://github.com/a1da4/paper-survey/assets/45454055/89f6626e-6ac2-4818-bc2d-95079961ee09)
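A minimal sketch of the vocabulary-drift idea: fit a unigram language model on the train tokens and score a test example by its log-perplexity. The add-alpha smoothing and function name here are my assumptions, not details from the paper.

```python
import math
from collections import Counter

def unigram_log_perplexity(train_tokens, test_tokens, alpha=1.0):
    """Log-perplexity of an add-alpha smoothed unigram LM fit on train_tokens,
    scored on test_tokens. Higher values indicate more vocabulary drift."""
    counts = Counter(train_tokens)
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # +1 reserves mass for unseen tokens
    log_prob = 0.0
    for tok in test_tokens:
        p = (counts[tok] + alpha) / (total + alpha * vocab_size)
        log_prob += math.log(p)
    return -log_prob / len(test_tokens)
```

A test example full of unseen words scores higher than one drawn from the train vocabulary.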
structural drift: cross-entropy of a POS 5-gram model (using spaCy to annotate POS tags)![スクリーンショット 2023-06-02 15 10 54](https://github.com/a1da4/paper-survey/assets/45454055/0e93e411-927b-4ffe-a86c-701d96d6dccc)
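The structural-drift score can be sketched as the cross-entropy of a POS n-gram model. To keep the example self-contained it takes POS tag sequences directly (in practice these would come from spaCy's `token.pos_`); the add-alpha smoothing and start-padding are my assumptions.

```python
import math
from collections import Counter

def pos_ngram_cross_entropy(train_tags, test_tags, n=5, alpha=1.0):
    """Cross-entropy (bits per tag) of an add-alpha smoothed POS n-gram model.
    train_tags / test_tags: lists of POS tag sequences, one per sentence."""
    pad = ["<S>"] * (n - 1)
    counts, context_counts, tagset = Counter(), Counter(), set()
    for sent in train_tags:
        seq = pad + sent
        tagset.update(sent)
        for i in range(n - 1, len(seq)):
            gram = tuple(seq[i - n + 1 : i + 1])
            counts[gram] += 1
            context_counts[gram[:-1]] += 1
    v = len(tagset) + 1  # +1 for unseen tags
    log_prob, n_tokens = 0.0, 0
    for sent in test_tags:
        seq = pad + sent
        for i in range(n - 1, len(seq)):
            gram = tuple(seq[i - n + 1 : i + 1])
            p = (counts[gram] + alpha) / (context_counts[gram[:-1]] + alpha * v)
            log_prob += math.log2(p)
            n_tokens += 1
    return -log_prob / n_tokens
```

A test sentence whose POS pattern matches the train set gets a lower score than one with an unfamiliar tag sequence.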
semantic drift: the average of semantic change scores LSC(w) over all words w in the target example (one sentence from the test set),
where LSC(w) is the average pairwise cosine distance between the contextual vectors of w in the example and those in the train set
To calculate contextual word vectors, they used a pre-trained RoBERTa model.
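A sketch of the semantic-drift computation, assuming the contextual vectors (from pre-trained RoBERTa in the paper) have already been extracted into word-to-vectors mappings. The dict-based interface and function names are my assumptions for illustration.

```python
import numpy as np

def lsc(example_vecs, train_vecs):
    """LSC(w): average pairwise cosine distance between a word's contextual
    vectors in the target example and its vectors in the train set."""
    dists = []
    for u in example_vecs:
        for v in train_vecs:
            cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
            dists.append(1.0 - cos)
    return float(np.mean(dists))

def semantic_drift(example_vectors, train_vectors):
    """Semantic drift of one test example: mean LSC(w) over its words.
    Both arguments map word -> array of contextual vectors (n_occurrences x dim)."""
    scores = [lsc(example_vectors[w], train_vectors[w])
              for w in example_vectors if w in train_vectors]
    return float(np.mean(scores))
```

Identical vectors give a drift of 0; orthogonal vectors give a drift of 1.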
4. How did they evaluate it?
task: predict the performance of fine-tuned RoBERTa models on in-domain and out-of-domain classification tasks
Table 1 shows that the combination of their metrics (vocabulary, structural, and semantic drift) achieves the best performance.
5. Is there a discussion?
6. Which paper should I read next?