a1da4 / paper-survey

Summary of machine learning papers

Reading: Specialising Word Vectors for Lexical Entailment #212

Open a1da4 opened 2 years ago

a1da4 commented 2 years ago

0. Paper

@inproceedings{vulic-mrksic-2018-specialising,
    title = "Specialising Word Vectors for Lexical Entailment",
    author = "Vuli{\'c}, Ivan and Mrk{\v{s}}i{\'c}, Nikola",
    booktitle = "Proceedings of the 2018 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)",
    month = jun,
    year = "2018",
    address = "New Orleans, Louisiana",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N18-1103",
    doi = "10.18653/v1/N18-1103",
    pages = "1134--1145",
    abstract = "We present LEAR (Lexical Entailment Attract-Repel), a novel post-processing method that transforms any input word vector space to emphasise the asymmetric relation of lexical entailment (LE), also known as the IS-A or hyponymy-hypernymy relation. By injecting external linguistic constraints (e.g., WordNet links) into the initial vector space, the LE specialisation procedure brings true hyponymy-hypernymy pairs closer together in the transformed Euclidean space. The proposed asymmetric distance measure adjusts the norms of word vectors to reflect the actual WordNet-style hierarchy of concepts. Simultaneously, a joint objective enforces semantic similarity using the symmetric cosine distance, yielding a vector space specialised for both lexical relations at once. LEAR specialisation achieves state-of-the-art performance in the tasks of hypernymy directionality, hypernymy detection, and graded lexical entailment, demonstrating the effectiveness and robustness of the proposed asymmetric specialisation model.",
}

1. What is it?

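They propose LEAR (Lexical Entailment Attract-Repel), a post-processing method that transforms any input word vector space to emphasise lexical entailment (the hyponymy-hypernymy relation).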

2. What is amazing compared to previous works?

3. What are the key technologies and techniques?

3.1 Model

They represent:

For training, they prepare mini-batches of positive examples x (in B) and corresponding negative examples t (in T).
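As a rough sketch of how positive and negative samples can interact in an Attract-Repel-style objective: a hinge term asks each positive pair to be more similar (in cosine) than either word is to its negative example, by some margin. The function names, the `margin` default, and the exact form below are assumptions for illustration, not copied from the paper.

```python
import math

def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(a * a for a in v))

def cos(u, v):
    """Cosine similarity between two vectors."""
    return sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))

def attract_term(x_l, x_r, t_l, t_r, margin=0.6):
    """Hinge cost for one positive pair (x_l, x_r) with negatives (t_l, t_r).

    Assumed form: the cost is zero once the positive pair is more similar
    than each word is to its negative example by at least `margin`.
    """
    pos = cos(x_l, x_r)
    left = max(0.0, margin + cos(x_l, t_l) - pos)
    right = max(0.0, margin + cos(x_r, t_r) - pos)
    return left + right

# Positive pair already aligned, negatives orthogonal: no cost.
print(attract_term([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]))
```

A pair that is already closer than its negatives by the margin contributes nothing, so updates concentrate on pairs that still violate the constraint.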

Total loss is defined:

3.2 Metrics

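Previous methods score word pairs with cosine similarity, which depends only on the angle between the vectors. This model also adjusts the vector norms to reflect the concept hierarchy, so the paper proposes a new asymmetric metric that considers both the angle and the norms.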

[Screenshot, 2021-09-29 3:13:59]
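A minimal sketch of a metric that uses both angle and norm, assuming the score is the cosine distance plus a normalised difference of the two norms. The function names and the choice of normalisation are assumptions for illustration; the paper's exact definition may differ.

```python
import math

def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(a * a for a in v))

def cosine_distance(x, y):
    """Symmetric part: 1 - cosine similarity (angle only)."""
    dot = sum(a * b for a, b in zip(x, y))
    return 1.0 - dot / (norm(x) * norm(y))

def norm_gap(x, y):
    """Asymmetric part: normalised norm difference, in (-1, 1).

    Negative when x has the smaller norm, i.e. when x would sit lower
    in the hierarchy if more general concepts get larger norms.
    """
    nx, ny = norm(x), norm(y)
    return (nx - ny) / (nx + ny)

def le_score(x, y):
    """Assumed combined score: lower means 'x IS-A y' is more plausible."""
    return cosine_distance(x, y) + norm_gap(x, y)

hyponym = [1.0, 0.0]    # toy vector with a small norm
hypernym = [3.0, 0.0]   # same direction, larger norm
print(le_score(hyponym, hypernym), le_score(hypernym, hyponym))
```

Swapping the arguments flips the sign of the norm term, so the score is asymmetric even when the two vectors point in the same direction. Cosine alone cannot make that distinction.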

4. How did they evaluate it?

Hypernymy tasks: hypernymy directionality, hypernymy detection, and graded lexical entailment.

5. Is there a discussion?

6. Which paper should I read next?

a1da4 commented 2 years ago


+meronymy fine-tune main vector space → each relation is represented in one subspace