I read the paper and summarized it. However, I have some questions about the proposed method. After that, I will complete the method section of the summary.
@hosseinfani
@farinamhz we can talk about it tmr at lab
Latent Aspect Detection from Online Unsolicited Customer Reviews
Main problem
The paper aims to detect latent aspects in online unsolicited customer reviews, that is, aspects that are not mentioned explicitly. Aspects are the features of products and services about which customers give their opinions in a review. Latent aspects are not mentioned directly because of the social background of the author and readers.
Existing work
Existing methods to detect aspects in reviews can be divided into three categories based on the level of human supervision of the method:
Rule-based: these methods use association rule mining to match aspects with words. Disadvantage: they do not scale as the number of combinations in reviews increases.
Supervised: these methods apply supervised machine learning to a labeled dataset in which all aspects have been annotated by humans. Disadvantage: human annotation is time-consuming and costly, and it can introduce bias.
Unsupervised: these methods need no human supervision. Disadvantage: they still assume that aspects are explicitly mentioned in the review, so they miss the latent aspects.
Inputs
Outputs
Example
Proposed Method
The method assumes that reviews are generated by the following process:
We pick an aspect that has a high probability among all the aspect probabilities in the Dirichlet distribution
We pick related words with high probabilities from the Dirichlet distribution to make up the review
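The two steps above can be sketched as an LDA-style sampler. This is a minimal illustration, not the paper's implementation: the vocabulary, the Dirichlet parameters `alpha` and `beta`, the number of aspects, and the choice of one aspect per review are all assumptions made for the example.

```python
import numpy as np

def generate_review(aspect_word_probs, alpha, n_words, rng):
    """Sample one synthetic review: first an aspect, then its words."""
    n_aspects, vocab_size = aspect_word_probs.shape
    # Draw this review's aspect probabilities from a Dirichlet prior (assumed symmetric).
    aspect_probs = rng.dirichlet(alpha * np.ones(n_aspects))
    # Step 1: pick an aspect; high-probability aspects are picked more often.
    aspect = int(rng.choice(n_aspects, p=aspect_probs))
    # Step 2: pick the review's words from the chosen aspect's word distribution.
    word_ids = rng.choice(vocab_size, size=n_words, p=aspect_word_probs[aspect])
    return aspect, word_ids

rng = np.random.default_rng(0)
# Hypothetical toy vocabulary, for illustration only.
vocab = ["pizza", "crust", "waiter", "friendly", "price", "cheap"]
# One Dirichlet draw per aspect gives that aspect's distribution over the vocabulary.
beta = 0.5
aspect_word_probs = rng.dirichlet(beta * np.ones(len(vocab)), size=3)

aspect, word_ids = generate_review(aspect_word_probs, alpha=0.5, n_words=5, rng=rng)
review = [vocab[i] for i in word_ids]
print(aspect, review)
```

Detection runs this process in reverse: given only the observed words, the model infers which hidden aspect most likely produced them.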
Coherence score for finding the optimum number of aspects
Resnik similarity score to calculate the inter-word semantic similarities
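As a rough illustration of the Resnik score, the sketch below computes it on a hand-built taxonomy: the similarity of two words is the information content of their most informative common ancestor. The taxonomy, the counts, and the helper `mean_pairwise_similarity` are hypothetical; averaging pairwise similarities over an aspect's top words is one plausible way to obtain a coherence-like score, and it is an assumption here, not necessarily what the paper does.

```python
import math
from itertools import combinations

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "food": "entity",
    "service": "entity",
    "pizza": "food",
    "pasta": "food",
    "waiter": "service",
}
# Hypothetical corpus counts for the leaf concepts.
COUNTS = {"pizza": 6, "pasta": 2, "waiter": 2}

def ancestors(concept):
    """Return the concept followed by its ancestors up to the root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path

def concept_probability(concept):
    """Share of all counts that falls on the concept or any of its descendants."""
    total = sum(COUNTS.values())
    mass = sum(n for leaf, n in COUNTS.items() if concept in ancestors(leaf))
    return mass / total

def information_content(concept):
    """Rarer concepts carry more information: IC(c) = -log p(c)."""
    return -math.log(concept_probability(concept))

def resnik_similarity(a, b):
    """Information content of the most informative common ancestor of a and b."""
    common = set(ancestors(a)) & set(ancestors(b))
    return max(information_content(c) for c in common)

def mean_pairwise_similarity(words):
    """Average Resnik similarity over all word pairs (a coherence-like score)."""
    pairs = list(combinations(words, 2))
    return sum(resnik_similarity(a, b) for a, b in pairs) / len(pairs)

print(resnik_similarity("pizza", "pasta"))   # shares the ancestor "food"
print(resnik_similarity("pizza", "waiter"))  # shares only the root
```

With real data, the taxonomy and counts would come from a lexical resource such as WordNet together with a corpus, and such a score could be compared across candidate numbers of aspects.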
Experimental Setup
Dataset
Preprocessing: removing numbers, non-English words, stop words, emojis, and punctuation from reviews
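A minimal sketch of such a cleaning step is shown below. The stop-word list is a tiny illustrative sample, and treating any token that contains a non-ASCII letter as non-English is a crude assumption; the paper's actual pipeline may differ.

```python
import re

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "but", "of", "to", "it"}

def preprocess(review):
    """Return lowercase tokens without numbers, punctuation, emojis, non-ASCII words, or stop words."""
    tokens = []
    for raw in review.lower().split():
        # Skip tokens with non-ASCII letters (rough stand-in for non-English words).
        if any(ch.isalpha() and not ch.isascii() for ch in raw):
            continue
        # Keep only ASCII letters; this strips digits, punctuation, and emojis.
        word = re.sub(r"[^a-z]", "", raw)
        if word and word not in STOP_WORDS:
            tokens.append(word)
    return tokens

print(preprocess("The pizza was great!!! 😋 10/10 délicieux"))
```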
Metrics
Baselines
Results
The main contribution of this paper is an unsupervised model for detecting latent aspects in short, noisy, unsolicited customer reviews. The results show that modeling aspects as hidden variables leads to more accurate detection than baselines that detect only explicitly mentioned aspects. In addition, the proposed unsupervised method achieves a better MRR score than state-of-the-art supervised methods such as CMLA.
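For reference, MRR (mean reciprocal rank) averages, over all reviews, the reciprocal of the rank at which the correct aspect appears in the model's ranked predictions. The sketch below uses made-up rankings and gold labels purely for illustration; the function name and the data are not from the paper.

```python
def mean_reciprocal_rank(ranked_predictions, gold_aspects):
    """Average of 1/rank of the gold aspect; a missing gold aspect contributes 0."""
    total = 0.0
    for ranking, gold in zip(ranked_predictions, gold_aspects):
        if gold in ranking:
            # Ranks are 1-based, so the top prediction scores 1.0.
            total += 1.0 / (ranking.index(gold) + 1)
    return total / len(gold_aspects)

# Hypothetical predictions for three reviews, best guess first.
rankings = [["food", "service"], ["price", "food"], ["service"]]
gold = ["food", "food", "ambience"]
print(mean_reciprocal_rank(rankings, gold))
```

Here the gold aspect is ranked first, second, and not at all, so the individual scores are 1, 1/2, and 0.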
Code: https://github.com/MohammadForouhesh/latent-aspect-detection
Presentation: no presentation is available for this paper