a1da4 / paper-survey

Summary of machine learning papers

Reading: Learning Lexical Subspaces in a Distributional Vector Space #199

a1da4 opened this issue 3 years ago

a1da4 commented 3 years ago

0. Paper

@article{arora-etal-2020-learning,
    title = "Learning Lexical Subspaces in a Distributional Vector Space",
    author = "Arora, Kushal and Chakraborty, Aishik and Cheung, Jackie C. K.",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "8",
    year = "2020",
    url = "https://aclanthology.org/2020.tacl-1.21",
    doi = "10.1162/tacl_a_00316",
    pages = "311--329",
    abstract = "In this paper, we propose LexSub, a novel approach towards unifying lexical and distributional semantics. We inject knowledge about lexical-semantic relations into distributional word embeddings by defining subspaces of the distributional vector space in which a lexical relation should hold. Our framework can handle symmetric attract and repel relations (e.g., synonymy and antonymy, respectively), as well as asymmetric relations (e.g., hypernymy and meronomy). In a suite of intrinsic benchmarks, we show that our model outperforms previous approaches on relatedness tasks and on hypernymy classification and detection, while being competitive on word similarity tasks. It also outperforms previous systems on extrinsic classification tasks that benefit from exploiting lexical relational cues. We perform a series of analyses to understand the behaviors of our model. Code available at https://github.com/aishikchakraborty/LexSub.",
}

1. What is it?

They propose LexSub, a new approach for injecting lexical-semantic relations (synonymy, antonymy, hypernymy, and meronymy) into word embeddings.

2. What is amazing compared to previous works?

Their method can handle lexical-semantic relations (synonymy, antonymy, hypernymy, and meronymy) while preserving the original distributional vector space.

3. Where is the key to technologies and techniques?

(screenshot: model overview; each lexical relation is mapped into its own subspace of the distributional space)

They map each lexical relation into its own subspace (left side of the figure). To learn the projection matrices ($W_{syn}$, $W_{hyp}$, and $W_{mer}$), they define three loss functions.
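As a rough sketch of what the projections do (my notation, not taken from the paper): each relation $r$ has a learned projection matrix $W_r$, and the relation constraints are enforced on distances measured inside the projected subspace rather than in the original space.

```latex
% Hedged sketch: v_w is the distributional embedding of word w,
% W_r the learned projection matrix for relation r in {syn, hyp, mer}.
\[
  v^{(r)}_w = W_r \, v_w ,
  \qquad
  d_r(u, w) = \bigl\lVert W_r v_u - W_r v_w \bigr\rVert_2
\]
```

Because the losses only touch the projected copies of the vectors, the original embeddings can stay close to their distributional values, which is the property highlighted in point 2 above.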

They also define negative sampling losses.

Using these functions, they define relation-specific losses.
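One plausible shape for an attract-type relation loss with negative sampling (a sketch under my assumptions; $\gamma$ is a margin and $w'$ a sampled negative word, so this may not match the paper's exact formulation):

```latex
% Hedged sketch: hinge loss that pulls a synonym pair (u, w) together in
% the synonymy subspace while pushing a sampled negative w' at least a
% margin gamma further away. Antonymy uses the opposite (repel) direction;
% hypernymy and meronymy use asymmetric variants in their own subspaces.
\[
  \mathcal{L}_{syn}
  = \sum_{(u,\, w) \in \mathrm{SYN}}
    \max\!\bigl( 0,\; \gamma + d_{syn}(u, w) - d_{syn}(u, w') \bigr)
\]
```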

The total loss function is the sum of these losses. (screenshots: total loss equations)
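Assuming the notation above, the composition is plausibly a weighted sum; the $\lambda_r$ are my labels for the per-relation hyperparameters:

```latex
% Hedged sketch: total objective = distributional loss (keeps the original
% space intact) + weighted relation-specific losses.
\[
  \mathcal{L}
  = \mathcal{L}_{dist}
  + \sum_{r \in \{syn,\, ant,\, hyp,\, mer\}} \lambda_r \, \mathcal{L}_r
\]
```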

4. How did they evaluate it?

4.1 Intrinsic tasks

(screenshots: intrinsic evaluation results on word similarity, relatedness, and hypernymy tasks)

4.2 Extrinsic tasks

(screenshot: extrinsic classification task results)

5. Is there a discussion?

Table 5 shows that each subspace captures its own relation-specific information. (screenshot: Table 5)

6. Which paper should we read next?

a1da4 commented 3 years ago

Baselines