@inproceedings{vulic-mrksic-2018-specialising,
title = "Specialising Word Vectors for Lexical Entailment",
author = "Vuli{\'c}, Ivan and
Mrk{\v{s}}i{\'c}, Nikola",
booktitle = "Proceedings of the 2018 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)",
month = jun,
year = "2018",
address = "New Orleans, Louisiana",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/N18-1103",
doi = "10.18653/v1/N18-1103",
pages = "1134--1145",
abstract = "We present LEAR (Lexical Entailment Attract-Repel), a novel post-processing method that transforms any input word vector space to emphasise the asymmetric relation of lexical entailment (LE), also known as the IS-A or hyponymy-hypernymy relation. By injecting external linguistic constraints (e.g., WordNet links) into the initial vector space, the LE specialisation procedure brings true hyponymy-hypernymy pairs closer together in the transformed Euclidean space. The proposed asymmetric distance measure adjusts the norms of word vectors to reflect the actual WordNet-style hierarchy of concepts. Simultaneously, a joint objective enforces semantic similarity using the symmetric cosine distance, yielding a vector space specialised for both lexical relations at once. LEAR specialisation achieves state-of-the-art performance in the tasks of hypernymy directionality, hypernymy detection, and graded lexical entailment, demonstrating the effectiveness and robustness of the proposed asymmetric specialisation model.",
}
1. What is it?
They propose LEAR (Lexical Entailment Attract-Repel), a new post-processing approach that specialises word vectors for hypernymy (lexical entailment).
2. What is amazing compared to previous works?
Previous specialisation work targets synonymy (#210) and antonymy (#211), which are symmetric relations.
Hypernymy is an asymmetric relation, and this model specialises the space for it jointly with similarity.
3. Where is the key to technologies and techniques?
3.1 Model
They represent:
symmetric relations (e.g., synonymy): by the angle between vectors
the asymmetric relation (hypernymy): by the vector norms
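Why the norm is available to encode hierarchy: cosine similarity depends only on the angle, so rescaling a vector leaves all cosine scores untouched. A minimal numpy check (the vectors here are made-up toy values, not from the paper):

```python
import numpy as np

def cos_sim(x, y):
    # symmetric: depends only on the angle between x and y
    return x @ y / (np.linalg.norm(x) * np.linalg.norm(y))

x = np.array([1.0, 2.0])
y = 3.0 * x  # same direction, larger norm (e.g. a more general concept)

# Rescaling leaves the cosine untouched, so the norm is "free"
# to encode the asymmetric hypernymy hierarchy on top of similarity.
assert np.isclose(cos_sim(x, y), 1.0)
assert np.linalg.norm(y) > np.linalg.norm(x)
```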
In training, they prepare positive/negative samples x (in B) and t (in T) for four kinds of constraints:
synonyms (attract)
antonyms (repel)
vector preservation (regularisation towards the original vectors)
hypernyms: the high-/low-level distance j is given
The total loss is defined as the sum of these terms.
The hypernym constraint combines two functions, Att() and LE(): word pairs in a high-level/low-level relationship should have high (cosine) similarity, while their norms are adjusted to reflect the direction of the hierarchy.
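The loss components above can be sketched as follows. This is a rough reconstruction, not the paper's exact objective: the margins, the λ weight, and the choice of the normalised norm difference for the LE term are my assumptions.

```python
import numpy as np

def cos(x, y):
    return x @ y / (np.linalg.norm(x) * np.linalg.norm(y))

def attract(xl, xr, tl, tr, margin=0.6):
    # synonym pair (xl, xr) should be closer than its negative samples (tl, tr)
    return (max(0.0, margin + cos(xl, tl) - cos(xl, xr))
            + max(0.0, margin + cos(xr, tr) - cos(xl, xr)))

def repel(xl, xr, tl, tr, margin=0.0):
    # antonym pair (xl, xr) should be further apart than its negatives
    return (max(0.0, margin + cos(xl, xr) - cos(xl, tl))
            + max(0.0, margin + cos(xl, xr) - cos(xr, tr)))

def preserve(x_new, x_orig, lam=1e-9):
    # regularisation: stay close to the original distributional vector
    return lam * float(np.sum((x_new - x_orig) ** 2))

def le_term(hypo, hyper):
    # asymmetric term: minimising it drives the hypernym towards
    # the larger norm (a normalised-difference variant of the LE distance)
    nx, ny = np.linalg.norm(hypo), np.linalg.norm(hyper)
    return (nx - ny) / (nx + ny)
```

With this shape, hypernym pairs are pulled together by an attract-style term on the angle while le_term orders their norms, matching the note's point that Att() and LE() act jointly.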
3.2 Metrics
Previous methods score word pairs with cosine similarity alone (angle), but this model considers both angle and norm.
The paper therefore proposes a new asymmetric metric that combines the two.
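A sketch of such a combined measure, assuming the norm part is the normalised norm difference (the exact functional form is in the paper; the example vectors are illustrative):

```python
import numpy as np

def le_distance(x, y):
    # combined asymmetric distance: cosine (angle) part + norm part
    cos_dist = 1.0 - x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
    norm_dist = ((np.linalg.norm(x) - np.linalg.norm(y))
                 / (np.linalg.norm(x) + np.linalg.norm(y)))
    return cos_dist + norm_dist

hypo = np.array([1.0, 1.0])    # e.g. "dog"
hyper = np.array([3.0, 3.0])   # e.g. "animal": same direction, larger norm

# The measure is asymmetric: the true entailment direction scores lower.
assert le_distance(hypo, hyper) < le_distance(hyper, hypo)
```

Unlike plain cosine distance, swapping the arguments changes the score, which is what makes directionality and graded LE predictions possible.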
4. How did they evaluate it?
Hypernymy tasks: directionality, detection, and graded lexical entailment
Nguyen2017: an ad-hoc approach (prior work compared against)
Their method achieves state-of-the-art performance.
5. Is there a discussion?
6. Which paper should read next?