LADy
*: A System for Latent Aspect Detection*
LADy is a Python-based framework to facilitate research in aspect detection, which involves extracting the aspects of products or services toward which customers target their opinions and sentiments in reviews. Aspects may be explicitly mentioned in reviews or be latent due to social background knowledge. With a special focus on latent aspect detection, LADy hosts various canonical aspect detection methods and benchmark datasets of unsolicited reviews from SemEval and Google reviews. LADy's object-oriented design allows for easy integration of new methods, datasets, and evaluation metrics. Notably, LADy features review augmentation via natural language backtranslation that can be seamlessly integrated into the training phase of the models to boost efficiency and improve efficacy during inference.
- [1. Setup](#1-setup)
- [2. Quickstart](#2-quickstart)
- [3. Structure](#3-structure)
- [4. Experiment](#4-experiment)
- [5. License](#5-license)
- [6. Acknowledgments](#6-acknowledgments)
- [7. Contribution](#7-contribution)
LADy has been developed on Python 3.8 and can be installed via conda, pip, or docker:
git clone --recursive https://github.com/fani-lab/LADy.git
cd LADy
conda env create -f environment.yml
conda activate lady
git clone --recursive https://github.com/fani-lab/LADy.git
cd LADy
pip install -r requirements.txt
docker run -it --name lady_container ghcr.io/fani-lab/lady:main
This command installs compatible versions of the following libraries:
./src/cmn: transformers, sentence_transformers, scipy, simalign, nltk
./src/aml: gensim, nltk, pandas, requests, bitermplus, contextualized_topic_models, fasttext
others: pytrec-eval-terrier, sklearn, matplotlib, seaborn, tqdm
The following aspect detection baselines will also be cloned as submodules:
bert-e2e-absa → ./src/bert-e2e-absa
hast → ./src/hast
Additionally, the following libraries should be installed:
Microsoft C++ Build Tools, as a requirement of biterm topic modeling in ./src/aml/btm.py.
python -m spacy download en_core_web_sm
python -m nltk.downloader stopwords
python -m nltk.downloader punkt
Further, we reuse octis as a submodule at ./src/octis for unsupervised neural aspect modeling using, e.g., neural LDA:
cd src/octis
python setup.py install
For quickstart purposes, a toy sample of reviews has been provided at ./data/raw/semeval/toy.2016SB5/ABSA16_Restaurants_Train_SB1_v2.xml.
You can run LADy by:
cd ./src
python main.py -naspects 5 -am rnd -data ../data/raw/semeval/toy.2016SB5/ABSA16_Restaurants_Train_SB1_v2.xml -output ../output/toy.2016SB5/
This run will produce an output folder at ../output/toy.2016SB5/ and a subfolder for the rnd (random) aspect modeling baseline.
The final evaluation results are aggregated in ../output/toy.2016SB5/agg.pred.eval.mean.csv.
LADy has two layers:
./src/cmn
The common layer (cmn) includes the abstract class definition for Review.
Important attributes of Review are:
self.aos: stores a list of (aspect, opinion, sentiment) triples for each sentence of a review,
self.augs: stores the translated (Review_) and back-translated (Review__) versions of the original review, along with the semantic similarity of the back-translated version with the original review, in a dictionary {'lang': (Review_, Review__, semantic score)}, and
self.parent: whether self is an original review or a translated or back-translated version.
This layer further includes SemEvalReview, which is a realization of the Review class for reviews of SemEval datasets.
Specifically, this class overrides the loading of SemEval's reviews into Review objects and stores them in a pickle file after preprocessing.
The pickle file is later used by models for training and testing purposes. Sample pickle files for a toy dataset can be found in ./output/toy.2016SB5/ABSA16_Restaurants_Train_SB1_v2.xml, where a filename of the form review.{list of languages}.pkl shows that the review objects also include back-translated versions in {list of languages}.
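The attributes described above can be pictured with a small sketch. This is not LADy's actual Review class: the name `ReviewSketch`, the field types, the use of token indices inside each triple, the language codes, and the choice to model `parent` as an optional reference are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# One (aspect, opinion, sentiment) triple. Storing aspects and opinions as
# token indices into a sentence is an assumption of this sketch.
AOS = Tuple[List[int], List[int], str]


@dataclass(eq=False)  # identity-based equality avoids recursing through parent/augs
class ReviewSketch:
    """Hypothetical stand-in for Review; not LADy's actual class."""
    sentences: List[List[str]]                 # tokenized sentences
    aos: List[List[AOS]]                       # one list of triples per sentence
    lang: str = "eng_Latn"                     # illustrative language code
    parent: Optional["ReviewSketch"] = None    # None marks an original review (assumption)
    # {'lang': (translated review, back-translated review, semantic score)}
    augs: Dict[str, Tuple["ReviewSketch", "ReviewSketch", float]] = field(default_factory=dict)

    def is_original(self) -> bool:
        return self.parent is None

    def aspect_terms(self) -> List[str]:
        """Collect the surface tokens of every aspect across all sentences."""
        terms = []
        for tokens, triples in zip(self.sentences, self.aos):
            for aspect_idx, _opinion_idx, _sentiment in triples:
                terms.append(" ".join(tokens[i] for i in aspect_idx))
        return terms


# Usage: attach a translated and a back-translated version to an original review.
original = ReviewSketch([["the", "sushi", "was", "fresh"]], [[([1], [3], "+1")]])
translated = ReviewSketch([["le", "sushi", "était", "frais"]], [[([1], [3], "+1")]],
                          lang="fra_Latn", parent=original)
back = ReviewSketch([["the", "sushi", "was", "cool"]], [[([1], [3], "+1")]], parent=original)
original.augs["fra_Latn"] = (translated, back, 0.93)  # 0.93 is a made-up similarity score
```

Keeping the translated and back-translated versions next to the original lets a training step decide per language whether to include an augmentation, for example by thresholding the stored similarity score.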
./src/aml
The aspect model layer (aml) includes the abstract class definition AbstractAspectModel for aspect modeling methods.
Important methods of AbstractAspectModel are:
self.train(reviews_train, reviews_valid, ..., output): trains the model on input training and validation samples and saves the model in output,
self.infer(review): infers (predicts) the aspects of a given review as an ordered list of self.naspect aspects with different probability scores, like [(0, 0.7), (1, 0.1), ...]. To view the actual aspect terms (tokens), self.get_aspect_words(aspect_id) can be used, which returns an ordered list of terms with probability scores like [('food', 0.4), ('sushi', 0.3), ...], and
self.load(path): loads a saved trained model.
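As a rough illustration of this interface, the sketch below outlines an abstract base class with the four methods named above and a toy subclass with hard-coded parameters. The class names `AspectModelSketch` and `ToyModel`, the exact signatures, and the scoring rule in `infer` are assumptions; the real AbstractAspectModel in ./src/aml/mdl.py may differ.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class AspectModelSketch(ABC):
    """Hypothetical outline mirroring the methods listed above."""

    def __init__(self, naspects: int):
        self.naspects = naspects

    @abstractmethod
    def train(self, reviews_train, reviews_valid, output: str) -> None: ...

    @abstractmethod
    def infer(self, review: List[str]) -> List[Tuple[int, float]]: ...

    @abstractmethod
    def get_aspect_words(self, aspect_id: int) -> List[Tuple[str, float]]: ...

    @abstractmethod
    def load(self, path: str) -> None: ...


class ToyModel(AspectModelSketch):
    """Toy realization with fixed 'learned' parameters, for illustration only."""

    def train(self, reviews_train, reviews_valid, output: str) -> None:
        # A real model would fit on the reviews and save itself under `output`.
        self.aspect_words: Dict[int, List[Tuple[str, float]]] = {
            0: [("food", 0.4), ("sushi", 0.3)],
            1: [("staff", 0.5), ("waiter", 0.2)],
        }

    def infer(self, review: List[str]) -> List[Tuple[int, float]]:
        # Score each aspect by the probability mass of its words found in the review.
        tokens = set(review)
        scores = [(a, sum(p for w, p in words if w in tokens))
                  for a, words in self.aspect_words.items()]
        total = sum(s for _, s in scores) or 1.0
        return sorted(((a, s / total) for a, s in scores), key=lambda x: x[1], reverse=True)

    def get_aspect_words(self, aspect_id: int) -> List[Tuple[str, float]]:
        return sorted(self.aspect_words[aspect_id], key=lambda x: x[1], reverse=True)

    def load(self, path: str) -> None:
        pass  # nothing to load in this toy example


model = ToyModel(naspects=2)
model.train([], [], output="unused")
ranked = model.infer(["the", "sushi", "was", "great"])
top_words = model.get_aspect_words(ranked[0][0])
```

Separating `infer` (aspect ids with scores) from `get_aspect_words` (terms per aspect id) keeps topic-model-style baselines and classifier-style baselines behind the same interface.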
This layer further includes realizations for different aspect modeling methods, such as:
Local LDA [Brody and Elhadad, NAACL2010] in ./src/aml/lda.py,
Biterm Topic Modeling [WWW2013] in ./src/aml/btm.py,
Contextual Topic Modeling [EACL2021] in ./src/aml/ctm.py,
BERT-E2E-ABSA [W-NUT@EMNLP2019] in ./src/bert-e2e-absa,
fastText [Joulin et al., EACL 2017] in ./src/aml/fast.py,
HAST [IJCAI2018] in ./src/hast, and
Random in ./src/aml/rnd.py, which returns a shuffled list of tokens as a prediction for the aspects of a review to provide a minimum baseline for comparison.
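The idea behind the random baseline can be sketched in a few lines. The function name `random_aspect_prediction` and its `seed` parameter are hypothetical and are not taken from ./src/aml/rnd.py; the sketch only shows what "a shuffled list of tokens as a prediction" could look like.

```python
import random
from typing import Iterable, List, Optional


def random_aspect_prediction(vocabulary: Iterable[str], seed: Optional[int] = None) -> List[str]:
    """Return the vocabulary in random order, used as a ranked aspect prediction."""
    rng = random.Random(seed)   # local generator, so global random state is untouched
    tokens = list(vocabulary)
    rng.shuffle(tokens)
    return tokens


vocab = ["food", "staff", "price", "ambience", "service"]
pred_a = random_aspect_prediction(vocab, seed=7)
pred_b = random_aspect_prediction(vocab, seed=7)
```

Because the ranking ignores the review content entirely, any method that scores below this baseline is doing worse than chance on the given vocabulary.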
Sample models trained on a toy dataset can be found at ./output/toy.2016SB5//{model name}.
[Figure: class diagram for aspect modeling hierarchy]
./src/main.py
LADy's driver code accepts the following args:
-naspects: the number of possible aspects for a review in a domain, e.g., -naspects 5; for instance, in the restaurant domain we may have 5 aspects including ['food', 'staff', ...]
-am: the aspect modeling (detection) method, e.g., -am lda, including rnd, lda, btm, ctm, nrl, bert, fast, hast, cat
-data: the raw review file, e.g., -data ../data/raw/semeval/toy.2016SB5/ABSA16_Restaurants_Train_SB1_v2.xml
-output: the folder to store the pipeline outputs, e.g., -output ../output/toy.2016SB5/ABSA16_Restaurants_Train_SB1_v2.xml, including preprocessed reviews, trained models, predictions, evaluations, ...
LADy reads the methods' hyperparameters and evaluation settings from ./src/params.py.
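For readers who want to see how such arguments are typically wired up, here is a minimal argparse sketch of the four options above. It is not a copy of ./src/main.py: the defaults, the `required` flags, and the help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Sketch of a LADy-style driver CLI; defaults and help texts are assumptions."""
    parser = argparse.ArgumentParser(description="Sketch of a LADy-style driver CLI")
    parser.add_argument("-naspects", type=int, default=5,
                        help="number of possible aspects in the domain")
    parser.add_argument("-am", type=str, required=True,
                        choices=["rnd", "lda", "btm", "ctm", "nrl", "bert", "fast", "hast", "cat"],
                        help="aspect modeling (detection) method")
    parser.add_argument("-data", type=str, required=True, help="path to the raw review file")
    parser.add_argument("-output", type=str, required=True, help="folder for pipeline outputs")
    return parser


# Parse the quickstart command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args([
    "-naspects", "5", "-am", "rnd",
    "-data", "../data/raw/semeval/toy.2016SB5/ABSA16_Restaurants_Train_SB1_v2.xml",
    "-output", "../output/toy.2016SB5/",
])
print(args.naspects, args.am, args.output)
```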
Here is the codebase folder structure:
├── src
| ├── cmn
| | ├── review.py -> class definition for review as object
| | ├── semeval.py -> overridden class for semeval reviews
| | ├── twitter.py -> overridden class for twitter reviews
| ├── aml
| | ├── mdl.py -> abstract aspect model to be overridden by baselines
| | ├── rnd.py -> random aspect model that randomly predicts aspects
| | ├── lda.py -> unsupervised aspect detection based on LDA
| | ├── btm.py -> unsupervised aspect detection based on biterm topic modeling
| | ├── ctm.py -> unsupervised aspect detection based on contextual topic modeling (neural)
| | ├── nrl.py -> unsupervised aspect detection based on neural topic modeling
| | ├── bert.py -> supervised aspect detection and sentiment analysis using pre-trained language model and neural end-to-end ABSA
| | ├── fast.py -> supervised aspect detection and sentiment analysis based on text classification (multinomial logistic regression)
| ├── params.py -> running settings of the pipeline
| ├── main.py -> main driver of the pipeline
-output {output}
LADy runs the pipeline for the ['prep', 'train', 'test', 'eval', 'agg'] steps and generates outputs in the given -output path:
['prep']: loads raw reviews and generates review objects in {output}/review.{list of languages}.pkl, like ./output/toy.2016SB5/
['train']: loads review objects and creates an instance of the aspect modeling (detection) method given in -am {am}. LADy splits reviews into train and test based on params.settings['train']['ratio'] in ./src/params.py. LADy further splits train into params.settings['train']['nfolds'] folds for cross-validation and model tuning during training. The result of this step is a collection of trained models for each fold in {output}/{naspect}.{languages used for back-translation}/{am}/, like ./output/toy.2016SB5/5.arb_Arab/lda
├── f{k}.model -> saved aspect model for k-th fold
├── f{k}.model.dict -> dictionary of tokens/words for k-th fold
['test']: predicts the aspects on the test set with params.settings["test"]["h_ratio"] * 100 % latent aspects, meaning that this percentage of the aspects will be hidden in the test reviews. The model saved in the previous step (train) is loaded and used for inference. The results of inference are pairs of golden truth aspects and inferred aspects sorted by their probability, saved for each fold in {output}/{naspect}/{am}/, like ./output/toy.2016SB5/5/lda
├── f{k}.model.pred.{h_ratio} -> pairs of golden truth and inferred aspects with (h_ratio * 100) % hidden aspects for k-th fold
['eval']: evaluates the inference results of the test step and saves the results for the different metrics in params.settings['eval']['metrics'] for the different k in params.settings["eval"]["topkstr"]. The result of this step is saved for each fold in {output}/{naspect}/{am}/, like ./output/toy.2016SB5/5/lda
├── f{k}.model.pred.{h_ratio} -> evaluation of inference for k-th fold with (h_ratio * 100) % hidden aspects
├── model.pred.{h_ratio}.csv -> mean of evaluation for all folds with (h_ratio * 100) % hidden aspects
['agg']: aggregates the results for all the aspect models, all the folds, and all the h_ratio values into a single file saved in {output}/, like ./output/toy.2016SB
├── agg.pred.eval.mean.csv -> aggregated file including all inferences on a specific dataset
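The train and test steps above can be sketched in plain Python. Everything here is illustrative: the values in `settings` are made up (the real ones live in ./src/params.py), the helper names are hypothetical, and the way aspects are hidden (dropping the aspect tokens from a review with probability h_ratio) is an assumption about what "hidden" means, not LADy's confirmed implementation.

```python
import random
from typing import List, Tuple

# Hypothetical values; the real settings are defined in ./src/params.py.
settings = {
    "train": {"ratio": 0.85, "nfolds": 5},
    "test": {"h_ratio": 0.5},
}


def split_train_test(n: int, ratio: float, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle review indices and cut them into train and test parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(n * ratio)
    return idx[:cut], idx[cut:]


def kfold(train_idx: List[int], nfolds: int) -> List[Tuple[List[int], List[int]]]:
    """Split the train part into nfolds (train, valid) pairs for cross-validation."""
    folds = []
    for k in range(nfolds):
        valid = train_idx[k::nfolds]          # every nfolds-th item, starting at offset k
        valid_set = set(valid)
        train = [i for i in train_idx if i not in valid_set]
        folds.append((train, valid))
    return folds


def hide_aspects(tokens: List[str], aspect_positions: List[int],
                 h_ratio: float, rng: random.Random) -> List[str]:
    """With probability h_ratio, drop the aspect tokens so the aspect becomes latent.

    Assumption: hiding means removing the aspect's surface form from the text.
    """
    if rng.random() < h_ratio:
        hidden = set(aspect_positions)
        return [t for i, t in enumerate(tokens) if i not in hidden]
    return list(tokens)


train_idx, test_idx = split_train_test(20, settings["train"]["ratio"])
folds = kfold(train_idx, settings["train"]["nfolds"])
rng = random.Random(0)
all_hidden = hide_aspects(["the", "sushi", "was", "fresh"], [1], 1.0, rng)
none_hidden = hide_aspects(["the", "sushi", "was", "fresh"], [1], 0.0, rng)
```

With h_ratio set to 0 the test reviews keep all explicit aspects, while 1 makes every aspect latent; intermediate values mix both cases.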
We conducted a series of experiments involving backtranslation using six different natural languages that belong to diverse language families. These experiments aimed to explore the effect of backtranslation augmentation across various aspect detection methods and domains, particularly in the context of restaurant and laptop reviews, where aspects may not necessarily be explicitly mentioned but are implicitly present with no surface form (latent). Our findings show a synergistic impact: backtranslation enhances the performance of aspect detection whether the aspect is 'explicit' or 'latent'.
LADy utilizes state-of-the-art SemEval datasets to augment the English datasets with backtranslation via different languages and evaluate latent aspect detection. Specifically, training sets from semeval-14 for restaurant and laptop reviews, as well as restaurant reviews from semeval-15 and semeval-16, are employed. Training sets from twitter for reviews on celebrities, products, and companies are also employed. Moreover, we have created a compact and simplified version of the original datasets, referred to as a toy dataset, for our experimental purposes.
dataset | file (.xml) |
---|---|
semeval-14-laptop | ./data/raw/semeval/SemEval-14/Laptop_Train_v2.xml |
semeval-14-restaurant | ./data/raw/semeval/SemEval-14/Semeval-14-Restaurants_Train.xml |
semeval-15-restaurant | ./data/raw/semeval/2015SB12/ABSA15_RestaurantsTrain/ABSA-15_Restaurants_Train_Final.xml |
semeval-16-restaurant | ./data/raw/semeval/2016SB5/ABSA16_Restaurants_Train_SB1_v2.xml |
twitter | ./data/raw/twitter/acl-14-short-data/train.raw |
toy | ./data/raw/semeval/toy.2016SB5/ABSA16_Restaurants_Train_SB1_v2.xml |
The reviews were divided into sentences, and our experiments were conducted on each sentence treated as an individual review, assuming that each sentence represents a single aspect. The statistics of the datasets can be seen in the table below.
dataset | #reviews | avg #aspects | exact match (chinese) | exact match (farsi) | exact match (arabic) | exact match (french) | exact match (german) | exact match (spanish) |
---|---|---|---|---|---|---|---|---|
semeval-14-laptop | 1,488 | 1.5846 | 0.1763 | 0.2178 | 0.2727 | 0.3309 | 0.3214 | 0.3702 |
semeval-14-restaurant | 2,023 | 1.8284 | 0.1831 | 0.2236 | 0.2929 | 0.3645 | 0.3724 | 0.4088 |
semeval-15-restaurant | 833 | 1.5354 | 0.2034 | 0.2312 | 0.3021 | 0.3587 | 0.3907 | 0.4128 |
semeval-16-restaurant | 1,234 | 1.5235 | 0.2023 | 0.2331 | 0.2991 | 0.3556 | 0.3834 | 0.4034 |
twitter | 6,248 | 1.0000 | 0.0812 | 0.1040 | 0.1593 | 0.1815 | 0.1962 | 0.2171 |
The average performance of 5-fold models with and without backtranslation is reported in our CIKM23 paper, also shown below:
The table below links to the directories that hold the remaining results of our experiments. These directories contain the outputs of diverse aspect detection models applied to different datasets and languages, with varying percentages of latent aspects.
dataset | review files (english, chinese, farsi, arabic, french, german, spanish, and all) and results' directory |
---|---|
semeval-14-laptop | ./output/Semeval-14/Laptop/ 45.5 GB |
semeval-14-restaurant | ./output/Semeval-14/Restaurants/ 58.6 GB |
semeval-15-restaurant | ./output/2015SB12/ 53.9 GB |
semeval-16-restaurant | ./output/2016SB5/ 55.2 GB |
twitter | ./output/twitter/ 285 GB |
toy | ./output/toy.2016SB5/ 2.37 GB |
Due to OOV (an aspect might appear in the test set without having been seen in the training set during model training), we may have metric@n not equal to 1 even as n → +inf.
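A small worked example makes the OOV effect concrete. The sketch below uses a plain-Python recall@k as a stand-in for the metrics LADy computes (it does not call pytrec_eval), and the vocabulary, gold aspects, and ranking are invented for illustration.

```python
from typing import List


def recall_at_k(gold: List[str], ranked: List[str], k: int) -> float:
    """Fraction of gold aspects found in the top-k of a ranked prediction."""
    gold_set = set(gold)
    if not gold_set:
        return 0.0
    return len(gold_set & set(ranked[:k])) / len(gold_set)


def mean_over_folds(per_fold_scores: List[float]) -> float:
    """Average a metric across folds, as done for the per-model mean files."""
    return sum(per_fold_scores) / len(per_fold_scores)


# The model can only rank tokens it saw during training.
ranked = ["staff", "food", "price"]
# 'ambience' is out of vocabulary: it occurs in the test review only.
gold = ["food", "ambience"]

r1 = recall_at_k(gold, ranked, 1)        # top-1 is 'staff': no gold aspect found
r3 = recall_at_k(gold, ranked, 3)        # 'food' found, 'ambience' cannot be ranked
r_inf = recall_at_k(gold, ranked, 10**6)  # growing k further does not help
avg = mean_over_folds([r1, r3, r_inf])
```

Since the ranked list never contains the unseen aspect, recall saturates at 0.5 here regardless of how large k becomes.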
©2024. This work is licensed under a CC BY-NC-SA 4.0 license.
In this work, we use LDA, bitermplus, OCTIS, pytrec_eval, SimAlign, DeCLUTR, No Language Left Behind (NLLB), HAST, BERT-E2E-ABSA, fastText, and other libraries and models. We extend our gratitude to the respective authors of these resources for their valuable contributions.
We strongly encourage and welcome pull requests from contributors. If you plan to make substantial modifications, we kindly request that you first open an issue to initiate a discussion. This will allow us to have a clear understanding of the modifications you intend to make and ensure a smooth collaboration process.