Paper
Generative Cross-Domain Data Augmentation for Aspect and Opinion Co-Extraction
Introduction
Fine-grained labeled data is scarce in some domains of opinion analysis, which restricts the development of supervised models in those domains. Recent work proposes models that transfer labeled data (knowledge) from one domain to other domains.
Main Problem
To alleviate this problem, unsupervised domain adaptation methods have been proposed to produce reviews that are more compatible with the target domain.
Illustrative Example
Given sentence (source domain): I like the spicy tuna roll
Output (target domain): lightweight and long battery life
Input
A sentence in one domain.
Output
A sentence in the other domain.
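To make the input and output format concrete, here is a minimal sketch of how a labeled sentence might be represented for aspect and opinion co-extraction. The BIO-style tag names (`B-ASP`, `I-ASP`, `B-OPN`, `I-OPN`, `O`), the helper `extract_spans`, and the token-level labels assigned to the two example sentences are illustrative assumptions, not the paper's annotation scheme.

```python
from typing import List, Tuple


def extract_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (kind, text) spans from a BIO-tagged sentence.

    Tag names are hypothetical: B-ASP/I-ASP mark aspect terms,
    B-OPN/I-OPN mark opinion terms, and O marks everything else.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[str, str]] = []
    current_kind = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        prefix, _, kind = tag.partition("-")
        # An I- tag only continues a span of the same kind.
        continues = prefix == "I" and kind == current_kind
        if not continues and current_kind is not None:
            # Close the span that was open before this token.
            spans.append((current_kind, " ".join(current_tokens)))
            current_kind, current_tokens = None, []
        if prefix == "B" or (prefix == "I" and not continues):
            # Start a new span (a stray I- tag is treated leniently as a start).
            current_kind, current_tokens = kind, [token]
        elif continues:
            current_tokens.append(token)

    if current_kind is not None:
        spans.append((current_kind, " ".join(current_tokens)))
    return spans


# Assumed labels for the source-domain example sentence.
source_tokens = ["I", "like", "the", "spicy", "tuna", "roll"]
source_tags = ["O", "B-OPN", "O", "B-ASP", "I-ASP", "I-ASP"]

# Assumed labels for the target-domain example sentence.
target_tokens = ["lightweight", "and", "long", "battery", "life"]
target_tags = ["B-OPN", "O", "B-OPN", "B-ASP", "I-ASP"]

print(extract_spans(source_tokens, source_tags))
# [('OPN', 'like'), ('ASP', 'spicy tuna roll')]
print(extract_spans(target_tokens, target_tags))
# [('OPN', 'lightweight'), ('OPN', 'long'), ('ASP', 'battery life')]
```

Under these assumed labels, the source sentence carries one opinion term and one aspect term, and the target sentence carries the same kind of labels with target-domain vocabulary. This is the property that makes generated target-domain sentences usable as training data for the extraction task.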
Related works and their gaps
Rule-based adaptation (Li et al., 2012; Ding et al., 2017): high-quality manual rules and opinion sets are hard to design.
Feature-based adaptation (Wang and Pan, 2018; Li et al., 2019; Pereg et al., 2020; Chen and Qian, 2021): the main task is trained on source labeled data, so it fails to capture important information in the target domain.
Data augmentation-based adaptation (Yu et al., 2021): the quality and diversity of the generated data are limited because the generated data still reflect the source-domain reviews.
Contribution of this paper
The authors propose a Generative Cross-Domain Data Augmentation framework for unsupervised domain adaptation.
The method shows promising results and generates more fluent and diverse reviews than previous approaches.
Proposed methods
Not included
Experiments
Model
Pre-trained sequence-to-sequence model BART
Datasets
Restaurant and Laptop datasets from SemEval 2014 and 2015 (Pontiki et al., 2014, 2015), and the Device dataset, which consists of digital device reviews collected by Hu and Liu (2004).
Gaps of this work
The proposed method only considers a single source domain. I believe it would be more effective to include multiple domains to develop more comprehensive patterns for data generation in the target domain. Additionally, the method is limited to the English language. Expanding it to other languages would increase its applicability and effectiveness.
Implementation
https://github.com/NUSTM/GCDDA