This paper investigates how well token representations from pretrained language models (PLMs) capture lexical semantics.
Lexical tasks:
LSIM: word-pair similarity, scored by Spearman correlation between model similarities and human ratings;
WA: word analogy (a : b :: c : d, solved by vector arithmetic);
BLI: bilingual lexicon induction (mapping word pairs across languages);
CLIR: cross-lingual information retrieval;
RELP: lexical relation classification for word pairs, e.g., whether "state" is a subset of "country".
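As a toy illustration of the LSIM and WA protocols (my sketch, not the paper's code; the vectors and helper names are made up): LSIM correlates model cosine similarities with gold ratings via Spearman's rho, and WA picks the vocabulary word closest to b - a + c.

```python
import math

def cos(u, v):
    # cosine similarity between two dense vectors
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv)

def spearman(xs, ys):
    # Spearman's rho via rank correlation; assumes no ties for simplicity
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

def analogy(a, b, c, emb):
    # WA: answer a : b :: c : ? by nearest neighbor to b - a + c,
    # excluding the three query words themselves
    target = [bb - aa + cc for aa, bb, cc in zip(emb[a], emb[b], emb[c])]
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cos(target, emb[w]))

# toy 2-d "embeddings" (invented for illustration)
emb = {"man": [1.0, 0.0], "woman": [1.0, 1.0],
       "king": [2.0, 0.0], "queen": [2.0, 1.0], "apple": [0.0, 3.0]}
```

With these toy vectors, `analogy("man", "woman", "king", emb)` returns `"queen"`, since woman - man + king lands exactly on queen's vector.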
They compare the following setups:
monolingual vs. multilingual models;
encoding a word in isolation vs. averaging its contextual encodings across different contexts;
including special tokens in the encoding or not;
averaging over layers.
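A minimal sketch of the "average over contexts" and "layer average" strategies (toy lists stand in for actual PLM hidden states; the function names and the layer-selection convention are my assumptions):

```python
def mean_vec(vectors):
    # element-wise mean of a list of equal-length vectors
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def word_embedding(context_layer_vecs, num_layers=None):
    # context_layer_vecs: one entry per context in which the word occurs;
    # each entry is a list of per-layer hidden-state vectors for that occurrence.
    # Step 1: within each context, average the chosen bottom layers
    # (all layers if num_layers is None).
    # Step 2: average the resulting vectors across contexts.
    per_context = []
    for layers in context_layer_vecs:
        chosen = layers if num_layers is None else layers[:num_layers]
        per_context.append(mean_vec(chosen))
    return mean_vec(per_context)
```

Setting `num_layers` restricts the average to the lower layers, which is one way to realize the paper's finding that layer averaging helps on some tasks but not others.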
They find that monolingual models outperform multilingual ones; context matters; special tokens do not help; and layer averaging is effective, though not for all tasks.