fani-lab / LADy

LADy 💃: A Benchmark Toolkit for Latent Aspect Detection Enriched with Backtranslation Augmentation

2020-ACL- CAt: Embarrassingly Simple Unsupervised Aspect Extraction #16

Closed farinamhz closed 1 year ago

farinamhz commented 1 year ago

This issue page summarizes the paper "Embarrassingly Simple Unsupervised Aspect Extraction," published at ACL 2020; the paper can be accessed using this link.

hosseinfani commented 1 year ago

@farinamhz where is the summary? you were supposed to finish it by now!

farinamhz commented 1 year ago

@hosseinfani Hi, I will finish it by tonight.

farinamhz commented 1 year ago

Main problem

This paper addresses detecting the aspects of a text, focusing on the restaurant domain. In sentiment analysis, an aspect is a dimension on which an entity is evaluated. Aspects come in two forms: 1. concrete, like "a laptop battery"; 2. subjective, like "the loudness of a motorcycle."

Existing work

Existing methods to detect aspects in reviews can be divided into two categories based on the level of human supervision of the method:

  1. Supervised: Most existing systems are supervised. Because aspects are domain-specific, these methods do not transfer well across domains; and since labeled training data is scarce for many domains and languages, supervised methods are unsuitable in many cases.

  2. Unsupervised: Besides the supervised methods, there is existing unsupervised work, including topic models and neural models with complex architectures and large numbers of parameters, even though a far simpler model can reach comparable accuracy in aspect extraction.

Inputs

A review sentence, represented as a sequence of word embeddings, together with a set of aspect terms.

Outputs

An aspect label for the sentence, assigned automatically from three candidate labels.

Example

Proposed Method

Unlike unsupervised deep neural networks, which must be fit to a corpus and require users to manually link discovered aspects to labels, this model links labels to aspects automatically and does not need to be fit to a specific corpus. The figure below is a schematic view of the Contrastive Attention model:

[Figure: schematic view of the Contrastive Attention (CAt) model]

Given a sentence S and a set of aspect terms A, both represented by word embeddings, the model works as follows. First, it computes an attention weight for each word based on its similarity to the aspect terms, where similarity is defined by an RBF kernel: a word's attention is the sum of its RBF responses to all aspect terms, divided by the total RBF responses of all words in the sentence. The resulting attention vector weighs the words of the sentence into a single summary vector. Finally, the sentence is assigned the label of the aspect closest to this summary, linking the summary to one of three labels. The two formulas are attached below:

[Equations: RBF kernel and attention weights]
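Reconstructed from the description above (the notation here is mine, not copied verbatim from the paper): with an RBF kernel over word vectors, the attention of word $x_i$ in a sentence of $n$ words is

```math
\mathrm{rbf}(x, y) = \exp\left(-\gamma \lVert x - y \rVert^2\right),
\qquad
\mathrm{att}(x_i) = \frac{\sum_{a \in A} \mathrm{rbf}(x_i, a)}{\sum_{j=1}^{n} \sum_{a \in A} \mathrm{rbf}(x_j, a)}
```

so the attention weights are non-negative and sum to 1 over the sentence.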

In other words, for each word they compute the RBF kernel against every aspect term and normalize by the RBF responses of all words, producing an attention distribution over the sentence (like a probability distribution). Multiplying this distribution by the word vectors of the sentence then yields a standard attention-weighted summary.
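The whole pipeline is small enough to sketch in a few lines of NumPy. This is a minimal illustration of the steps described above, not the official implementation; the `gamma` value and the cosine-based label assignment in `classify` are my assumptions.

```python
import numpy as np

def rbf(x, y, gamma=0.03):
    """RBF kernel between two word vectors (gamma is an assumed value)."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def contrastive_attention(word_vecs, aspect_vecs, gamma=0.03):
    """Attention weight of each word: its summed RBF response to all
    aspect vectors, normalized by the responses of every word."""
    resp = np.array([[rbf(w, a, gamma) for a in aspect_vecs]
                     for w in word_vecs])   # shape (n_words, n_aspects)
    scores = resp.sum(axis=1)               # total RBF response per word
    return scores / scores.sum()            # attention distribution, sums to 1

def classify(word_vecs, aspect_vecs, label_vecs, labels, gamma=0.03):
    """Summarize the sentence with contrastive attention, then assign the
    label whose vector is closest to the summary (cosine similarity here)."""
    att = contrastive_attention(word_vecs, aspect_vecs, gamma)
    summary = att @ word_vecs               # attention-weighted average
    sims = [summary @ l / (np.linalg.norm(summary) * np.linalg.norm(l))
            for l in label_vecs]
    return labels[int(np.argmax(sims))]
```

With pretrained embeddings, `word_vecs` would be the embeddings of the sentence tokens and `aspect_vecs` the embeddings of frequent candidate aspect terms; no training step is involved, which is the point of the method.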

Experimental Setup

Dataset

All datasets are annotated with one or more aspect labels per sentence.

  1. Test set (and Evaluation): Citysearch dataset
  2. Development sets: Restaurant subsets of the SemEval 2014 and SemEval 2015.

Metrics

They compare weighted macro averages of the evaluation metrics across the aspect labels.

Baselines

Results

The main contribution of this paper is an unsupervised model for detecting the aspects of reviews. Results show that this frequency-based method, combined with an attention mechanism built on RBF kernels and an automatic aspect-assignment step, detects aspects more accurately than baselines with more complex architectures, such as neural models and topic models. Three variants of the model are compared against the baselines: Mean, which simply averages the word vectors; Attention, which uses dot-product attention; and CAt, the complete model combining both.

Code

The official implementation of this paper is available in a GitHub repository.

Presentation

A presentation is available at this link.