0. Paper
@inproceedings{wallace-etal-2019-nlp,
title = "Do {NLP} Models Know Numbers? Probing Numeracy in Embeddings",
author = "Wallace, Eric and
Wang, Yizhong and
Li, Sujian and
Singh, Sameer and
Gardner, Matt",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
month = nov,
year = "2019",
address = "Hong Kong, China",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/D19-1534",
doi = "10.18653/v1/D19-1534",
pages = "5307--5315",
}
1. What is it?
They focus on the question, "Is treating numbers the same way as other tokens enough to capture numeracy?"
2. What is amazing compared to previous studies?
They probe token embedding methods on three synthetic tasks (a data-generation sketch follows the list):
list maximum: find the maximum value in a list of numbers.
number decoding: convert a number string into its value, e.g. "nine" -> "9".
addition: e.g. "1", "5" -> "6".
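To make the task formats concrete, here is a minimal sketch of how such synthetic examples could be generated; the [0, 99] range and the list length of 5 are illustrative choices, not necessarily the paper's exact settings.

```python
import random

def list_maximum_example(low=0, high=99, length=5):
    # One list-maximum example: number tokens -> index of the largest one.
    values = random.sample(range(low, high + 1), length)
    return [str(v) for v in values], values.index(max(values))

def decoding_example(low=0, high=99):
    # One decoding example: a number token -> its numeric value.
    v = random.randint(low, high)
    return str(v), float(v)

def addition_example(low=0, high=99):
    # One addition example: two number tokens -> their sum.
    a, b = random.randint(low, high), random.randint(low, high)
    return [str(a), str(b)], float(a + b)

print(list_maximum_example())  # e.g. (['12', '87', '3', '45', '60'], 1)
```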
3. What is the key technique or method?
They propose a probing model: a pretrained token embedder (kept fixed) feeds into a small task model that is trained on the synthetic tasks. The pretrained-embedder part is swappable, so different representations (e.g., GloVe, word2vec, ELMo, BERT, character-level CNNs) can be compared.
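As a rough illustration of that setup, here is a minimal PyTorch sketch of a list-maximum probe: a frozen embedding matrix stands in for the pretrained embedder, and only the BiLSTM and the weight-shared per-position scorer are trained. The hidden size and the `pretrained` tensor are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ListMaxProbe(nn.Module):
    # Probe for the list-maximum task: the pretrained embedder is frozen,
    # only the BiLSTM encoder and the per-position scorer are trained.
    def __init__(self, pretrained: torch.Tensor, hidden: int = 64):
        super().__init__()
        # Frozen pretrained token embeddings (vocab_size x emb_dim).
        self.embed = nn.Embedding.from_pretrained(pretrained, freeze=True)
        self.encoder = nn.LSTM(pretrained.size(1), hidden,
                               batch_first=True, bidirectional=True)
        # Weight-shared scorer applied to every list position.
        self.score = nn.Linear(2 * hidden, 1)

    def forward(self, token_ids):          # (batch, list_len)
        h, _ = self.encoder(self.embed(token_ids))
        return self.score(h).squeeze(-1)   # (batch, list_len) position logits

# Training would use cross-entropy over positions, e.g.:
# loss = nn.functional.cross_entropy(probe(token_ids), argmax_labels)
```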
4. How did they validate it?
They evaluate the probes on the three synthetic tasks above (list maximum, number decoding, addition), on numbers both inside and beyond the training range (interpolation vs. extrapolation).
As a result, pre-trained token representations naturally encode numeracy; the character-level CNN and ELMo perform particularly well.
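Here is a sketch of how the interpolation vs. extrapolation comparison could be run with the probe above; `make_batches` is a hypothetical data loader yielding (token_ids, argmax_labels) pairs, and the ranges are illustrative.

```python
import torch

def accuracy(probe, batches):
    # Fraction of lists where the probe picks the true argmax position.
    correct = total = 0
    with torch.no_grad():
        for token_ids, labels in batches:
            pred = probe(token_ids).argmax(dim=-1)
            correct += (pred == labels).sum().item()
            total += labels.numel()
    return correct / total

# Interpolation: test numbers drawn from the training range.
# Extrapolation: test numbers larger than anything seen in training.
# (`make_batches` is hypothetical; ranges are illustrative.)
# interp_acc = accuracy(probe, make_batches(low=0, high=99))
# extrap_acc = accuracy(probe, make_batches(low=0, high=999))
```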
5. Is there a discussion?
Embeddings may capture numeracy within the training range, but the probes fail to extrapolate to larger unseen numbers.
6. Which paper should be read next?
A similar paper on numeracy in embeddings was published at ACL 2019.