fani-lab / LADy

LADy 💃: A Benchmark Toolkit for Latent Aspect Detection Enriched with Backtranslation Augmentation

2020-ACL- CAt: Embarrassingly Simple Unsupervised Aspect Extraction #16

Closed farinamhz closed 1 year ago

farinamhz commented 1 year ago

This issue page summarizes the paper "Embarrassingly Simple Unsupervised Aspect Extraction," published at ACL 2020; the paper can be accessed using this link.

hosseinfani commented 1 year ago

@farinamhz where is the summary? you were supposed to finish it by now!

farinamhz commented 1 year ago

@hosseinfani Hi, I will finish it by tonight.

farinamhz commented 1 year ago

Main problem

This paper addresses detecting the aspects of a text, focusing on the restaurant domain. In sentiment analysis, an aspect is a dimension on which an entity is evaluated. Aspects come in two forms: 1. concrete, like "a laptop battery"; 2. subjective, like "the loudness of a motorcycle."

Existing work

Existing methods to detect aspects in reviews can be divided into two categories based on the level of human supervision of the method:

  1. Supervised: Most existing systems are supervised. Because aspects are domain-specific, these methods do not transfer well across domains; and since labeled training data is scarce for many domains and languages, supervised methods are unsuitable in many cases.

  2. Unsupervised: Besides the supervised methods, there is existing unsupervised work, including topic models and neural models with complex architectures and large numbers of parameters, even though a far simpler model can reach comparable accuracy in aspect extraction.

Inputs

A review sentence, represented as a sequence of word embeddings, together with a set of aspect terms.

Outputs

An aspect label for the sentence, assigned automatically from three candidate labels.

Example

Proposed Method

Unlike unsupervised deep neural networks, which must be fit to a corpus and require users to manually link discovered aspects to labels, this model links labels to aspects automatically and does not need to be fit to a specific corpus. The figure below is a schematic view of the Contrastive Attention model:

[Figure: schematic view of the Contrastive Attention (CAt) model]

Given a sentence S and a set of aspect terms A, both represented by word embeddings, the model works as follows. First, it computes an attention weight for each word based on its similarity to the aspect terms, where similarity is defined by an RBF kernel: a word's attention is the sum of its RBF responses to all aspect terms, divided by the total RBF responses of all words in the sentence. The resulting attention vector weighs the words of the sentence into a single summary vector. Finally, the sentence is assigned the label of the aspect closest to this summary, linking the summary to one of three labels. The two formulas are attached below:

[Equations: RBF kernel and attention weights]
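Reconstructed from the description above (the notation here is mine, not copied verbatim from the paper): with an RBF kernel over word vectors, the attention of word $x_i$ in a sentence of $n$ words is

```math
\mathrm{rbf}(x, y) = \exp\left(-\gamma \lVert x - y \rVert^2\right),
\qquad
\mathrm{att}(x_i) = \frac{\sum_{a \in A} \mathrm{rbf}(x_i, a)}{\sum_{j=1}^{n} \sum_{a \in A} \mathrm{rbf}(x_j, a)}
```

so the attention weights are non-negative and sum to 1 over the sentence.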

In other words, for each word they compute the RBF kernel against every aspect term and normalize by the RBF responses of all words, producing an attention distribution over the sentence (like a probability distribution). Multiplying this distribution by the word vectors of the sentence then yields a standard attention-weighted summary.
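The whole pipeline is small enough to sketch in a few lines of NumPy. This is a minimal illustration of the steps described above, not the official implementation; the `gamma` value and the cosine-based label assignment in `classify` are my assumptions.

```python
import numpy as np

def rbf(x, y, gamma=0.03):
    """RBF kernel between two word vectors (gamma is an assumed value)."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def contrastive_attention(word_vecs, aspect_vecs, gamma=0.03):
    """Attention weight of each word: its summed RBF response to all
    aspect vectors, normalized by the responses of every word."""
    resp = np.array([[rbf(w, a, gamma) for a in aspect_vecs]
                     for w in word_vecs])   # shape (n_words, n_aspects)
    scores = resp.sum(axis=1)               # total RBF response per word
    return scores / scores.sum()            # attention distribution, sums to 1

def classify(word_vecs, aspect_vecs, label_vecs, labels, gamma=0.03):
    """Summarize the sentence with contrastive attention, then assign the
    label whose vector is closest to the summary (cosine similarity here)."""
    att = contrastive_attention(word_vecs, aspect_vecs, gamma)
    summary = att @ word_vecs               # attention-weighted average
    sims = [summary @ l / (np.linalg.norm(summary) * np.linalg.norm(l))
            for l in label_vecs]
    return labels[int(np.argmax(sims))]
```

With pretrained embeddings, `word_vecs` would be the embeddings of the sentence tokens and `aspect_vecs` the embeddings of frequent candidate aspect terms; no training step is involved, which is the point of the method.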

Experimental Setup

Dataset

All datasets are annotated with one or more aspect labels per sentence.

  1. Test set (and Evaluation): Citysearch dataset
  2. Development sets: Restaurant subsets of the SemEval 2014 and SemEval 2015.

Metrics

They compare weighted macro averages of the evaluation metrics across the aspect labels.

Baselines

Results

The main contribution of this paper is an unsupervised model for detecting the aspects of reviews. Results show that this frequency-based method, combined with an attention mechanism built on RBF kernels and an automatic aspect-assignment step, detects aspects more accurately than baselines with more complex architectures, such as neural models and topic models. Three variants of the model are compared against the baselines: Mean, which simply averages the word vectors; Attention, which uses dot-product attention; and CAt, the complete model combining both.

Code

The official implementation of this paper is available in a GitHub repository.

Presentation

A presentation is available at this link.