Open a1da4 opened 2 years ago
0. Paper
1. What is it?
They try to answer these research questions:
2. What is amazing compared to previous works?
3. What are the key technologies and techniques?
Training starts from the last checkpoint of the pre-trained BERT model.
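To make the idea of resuming from a pre-trained checkpoint concrete, here is a minimal sketch of one continued masked-language-model training step. It assumes Hugging Face `transformers` and PyTorch, which these notes do not confirm the authors used. The tiny randomly initialised model, the `continue_pretraining_step` helper, the `MASK_ID` value, and the random token batch are all hypothetical stand-ins so the example runs offline; in practice the checkpoint directory would hold real pre-trained BERT weights and the batch would come from tokenised historical text.

```python
import tempfile

import torch
from transformers import BertConfig, BertForMaskedLM

MASK_ID = 4  # hypothetical [MASK] token id for this toy vocabulary


def continue_pretraining_step(checkpoint_dir, input_ids, labels, lr=5e-5):
    """Load a masked-LM checkpoint and take one optimizer step on new data."""
    model = BertForMaskedLM.from_pretrained(checkpoint_dir)
    model.train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    # The model computes cross-entropy over positions whose label is not -100.
    loss = model(input_ids=input_ids, labels=labels).loss
    loss.backward()
    optimizer.step()
    return model, loss.item()


# Stand-in for the pre-trained checkpoint (assumption: tiny random weights).
tiny_config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=64,
)
checkpoint_dir = tempfile.mkdtemp()
BertForMaskedLM(tiny_config).save_pretrained(checkpoint_dir)

# Toy batch: random token ids with roughly 15% of positions masked.
torch.manual_seed(0)
original_ids = torch.randint(5, 100, (4, 16))
mask = torch.rand(original_ids.shape) < 0.15
mask[0, 0] = True  # guarantee at least one masked position
labels = original_ids.clone()
labels[~mask] = -100  # unmasked positions are ignored by the loss
masked_ids = original_ids.masked_fill(mask, MASK_ID)

model, loss = continue_pretraining_step(checkpoint_dir, masked_ids, labels)
print(f"MLM loss after loading the checkpoint: {loss:.3f}")
```

Loading with `from_pretrained` rather than constructing a fresh model is what makes this continued training: the optimizer starts from the saved weights instead of a random initialisation.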
4. How did they evaluate it?
HistBERT models outperform the original BERT model.
5. Is there a discussion?
6. Which paper should I read next?