0. Paper
@inproceedings{wallace-etal-2019-nlp,
title = "Do {NLP} Models Know Numbers? Probing Numeracy in Embeddings",
author = "Wallace, Eric and
Wang, Yizhong and
Li, Sujian and
Singh, Sameer and
Gardner, Matt",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
month = nov,
year = "2019",
address = "Hong Kong, China",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/D19-1534",
doi = "10.18653/v1/D19-1534",
pages = "5307--5315",
}
1. What is it?
They focus on the question, "Is treating numbers the same way as other tokens enough to capture numeracy?"
2. What is amazing compared to previous studies?
They probe token embedding methods on three synthetic tasks (a data-generation sketch follows the list):
list maximum: find the maximum value in a list of numbers.
number decoding: convert a number string into its value, e.g. "nine" -> "9".
addition: e.g. "1", "5" -> "6".
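To make the task formats concrete, here is a minimal sketch of how such synthetic examples could be generated; the [0, 99] range and the list length of 5 are illustrative choices, not necessarily the paper's exact settings.

```python
import random

def list_maximum_example(low=0, high=99, length=5):
    # One list-maximum example: number tokens -> index of the largest one.
    values = random.sample(range(low, high + 1), length)
    return [str(v) for v in values], values.index(max(values))

def decoding_example(low=0, high=99):
    # One decoding example: a number token -> its numeric value.
    v = random.randint(low, high)
    return str(v), float(v)

def addition_example(low=0, high=99):
    # One addition example: two number tokens -> their sum.
    a, b = random.randint(low, high), random.randint(low, high)
    return [str(a), str(b)], float(a + b)

print(list_maximum_example())  # e.g. (['12', '87', '3', '45', '60'], 1)
```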
3. What is the key technique or method?
They propose a probing model: a pretrained token embedder (kept fixed) feeds into a small task model that is trained on the synthetic tasks. The pretrained-embedder part is swappable, so different representations (e.g., GloVe, word2vec, ELMo, BERT, character-level CNNs) can be compared.
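As a rough illustration of that setup, here is a minimal PyTorch sketch of a list-maximum probe: a frozen embedding matrix stands in for the pretrained embedder, and only the BiLSTM and the weight-shared per-position scorer are trained. The hidden size and the `pretrained` tensor are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ListMaxProbe(nn.Module):
    # Probe for the list-maximum task: the pretrained embedder is frozen,
    # only the BiLSTM encoder and the per-position scorer are trained.
    def __init__(self, pretrained: torch.Tensor, hidden: int = 64):
        super().__init__()
        # Frozen pretrained token embeddings (vocab_size x emb_dim).
        self.embed = nn.Embedding.from_pretrained(pretrained, freeze=True)
        self.encoder = nn.LSTM(pretrained.size(1), hidden,
                               batch_first=True, bidirectional=True)
        # Weight-shared scorer applied to every list position.
        self.score = nn.Linear(2 * hidden, 1)

    def forward(self, token_ids):          # (batch, list_len)
        h, _ = self.encoder(self.embed(token_ids))
        return self.score(h).squeeze(-1)   # (batch, list_len) position logits

# Training would use cross-entropy over positions, e.g.:
# loss = nn.functional.cross_entropy(probe(token_ids), argmax_labels)
```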
4. How did they validate it?
They evaluate the probes on the three synthetic tasks above (list maximum, number decoding, addition), on numbers both inside and beyond the training range (interpolation vs. extrapolation).
As a result, pre-trained token representations naturally encode numeracy; the character-level CNN and ELMo perform particularly well.
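Here is a sketch of how the interpolation vs. extrapolation comparison could be run with the probe above; `make_batches` is a hypothetical data loader yielding (token_ids, argmax_labels) pairs, and the ranges are illustrative.

```python
import torch

def accuracy(probe, batches):
    # Fraction of lists where the probe picks the true argmax position.
    correct = total = 0
    with torch.no_grad():
        for token_ids, labels in batches:
            pred = probe(token_ids).argmax(dim=-1)
            correct += (pred == labels).sum().item()
            total += labels.numel()
    return correct / total

# Interpolation: test numbers drawn from the training range.
# Extrapolation: test numbers larger than anything seen in training.
# (`make_batches` is hypothetical; ranges are illustrative.)
# interp_acc = accuracy(probe, make_batches(low=0, high=99))
# extrap_acc = accuracy(probe, make_batches(low=0, high=999))
```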
5. Is there a discussion?
Embeddings may capture numeracy within the training range, but the probes fail to extrapolate to larger unseen numbers.
6. Which paper should be read next?
A similar paper on numeracy in embeddings was published at ACL 2019.