LADy 💃: A Benchmark Toolkit for Latent Aspect Detection Enriched with Backtranslation Augmentation

Summary of the SemEval 14, 15, and 16 Aspect-Based Sentiment Analysis tasks #81


Sepideh-Ahmadian commented 2 months ago

The following report summarizes and compares the three SemEval datasets used in LADy: 2014 (Task 4), 2015 (Task 12), and 2016 (Task 5). These tasks focus on aspect-based sentiment analysis (ABSA): beyond detecting sentiment, a system must also identify the entity and the attribute toward which the sentiment is directed. This report details the versions of these tasks and their respective characteristics.

SemEval 14 Task 4:

Datasets' domains: Laptop, Restaurant

Subtasks descriptions: (4 subtasks)

1. SB1: Aspect term extraction

2. SB2: Aspect term polarity

3. SB3: Aspect category detection

4. SB4: Aspect category polarity

Tasks summary

Data Collection

  1. Restaurant: 3,041 English sentences, a subset of the dataset of Ganu et al. (2009).
    • That dataset already included the aspect categories used in SB3 and an overall sentence polarity; the remaining annotations were added later. Additional restaurant reviews were collected and annotated from scratch for the test set.
  2. Laptop: 3,845 English sentences in total (3,045 train + 800 test).
    • They were tagged by human annotators for SB1 and SB2.

Data Annotation: The annotators used BRAT, a web-based annotation tool configured for the task. Using BRAT, they provided the aspect terms (SB1), aspect term polarities (SB2), aspect categories (SB3), and aspect category polarities (SB4).

  1. Stage 1: Aspect Terms and Polarity. Annotators tagged the explicit aspects mentioned in each sentence and determined their polarity. Example: "I hated their fajitas, but their salads were great" → {'fajitas': negative, 'salads': positive}.

  2. Stage 2: Aspect Category and Polarity. Annotators assigned categories from a predefined set to each sentence and determined their polarity. Example: "The restaurant was expensive, but the menu was great" → {PRICE: negative, FOOD: positive}.

Format: XML
Metrics: F1 and accuracy.
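
For orientation, here is a minimal sketch of reading this format with Python's `xml.etree.ElementTree`. The tag and attribute names (`sentence`, `aspectTerm`, `aspectCategory`) and the file name are assumptions based on the official SemEval-14 distribution, so verify them against the actual files:

```python
import xml.etree.ElementTree as ET

# Assumed SemEval-14 layout (verify against the actual distribution):
# <sentences><sentence id="..."><text>...</text>
#   <aspectTerms><aspectTerm term="..." polarity="..." from="..." to="..."/></aspectTerms>
#   <aspectCategories><aspectCategory category="..." polarity="..."/></aspectCategories>
# </sentence></sentences>
root = ET.parse("Restaurants_Train.xml").getroot()  # hypothetical file name

for sentence in root.iter("sentence"):
    text = sentence.findtext("text")
    # SB1/SB2: explicit aspect terms and their polarities
    terms = [(t.get("term"), t.get("polarity")) for t in sentence.iter("aspectTerm")]
    # SB3/SB4: coarse aspect categories and their polarities
    categories = [(c.get("category"), c.get("polarity")) for c in sentence.iter("aspectCategory")]
    print(text, terms, categories)
```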


SemEval 15 Task 12

Dataset Categories: Laptops, Restaurants, Hotels. In contrast to SemEval 14, aspect terms correspond to explicit mentions of entities or attributes, and the dataset consists of entire reviews rather than isolated sentences.

Subtasks descriptions:

  1. Subtask 1: Given a review about a laptop or a restaurant, fill the following slots:

    • Slot 1: Aspect Category: Identify every entity-attribute pair (E#A) toward which an opinion is expressed in the review (drawn from a predefined inventory of E#A pairs).
    • Slot 2: Opinion Target Expression (OTE): Find the linguistic expression that the E#A pair refers to; if none exists, mark it as "NULL".
    • Slot 3: Sentiment Polarity: Assign each E#A pair one of the following polarities: positive, negative, or neutral. Example: "The food was delicious but do not come here on an empty stomach." → {category: "FOOD#QUALITY", target: "food", from: "4", to: "8", polarity: "positive"} (see the offset sketch after this list).
  2. Subtask 2: Out-of-Domain ABSA

    • Participants tested their systems on an unseen domain (hotel reviews) for which no training data was provided.
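
A minimal sketch of how the from/to values in the example above are meant to be used, assuming they are character offsets into the raw sentence text and that implicit aspects carry the literal target "NULL":

```python
# Minimal sketch: from/to are character offsets into the raw sentence text;
# implicit aspects carry the literal target "NULL" (assumptions based on the
# Slot 2/3 description above).
text = "The food was delicious but do not come here on an empty stomach."
opinion = {"target": "food", "category": "FOOD#QUALITY",
           "polarity": "positive", "from": 4, "to": 8}

if opinion["target"] == "NULL":                  # implicit aspect: nothing to slice
    span = None
else:
    span = text[opinion["from"]:opinion["to"]]   # -> "food"

assert span == opinion["target"]
```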

Data Collection:

  1. Laptops

    • Train: [review texts: 277, sentences: 1,739],
    • Test: [review texts: 173, sentences: 760]
  2. Restaurants

    • Train: [review texts: 254, sentences: 1,315],
    • Test: [review texts: 96, sentences: 685]

Data Annotation Methods: Similar to SemEval 14, with the following differences:

  1. There are no OTE annotations in the laptop section, since laptop features are expressed through a limited number of expressions.
  2. The "conflict" tag has been removed.
  3. The "neutral" tag denotes a mildly positive, mildly negative, or genuinely neutral stance; it does not indicate objectivity.

Format: XML
Metrics: F1
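
As an illustration of the F1 metric, here is a small sketch that micro-averages precision and recall over per-sentence sets of gold vs. predicted annotations (e.g., E#A category labels). The official evaluation scripts may apply different matching rules; this only shows the general idea:

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over per-sentence sets of gold vs. predicted
    annotations (e.g., E#A category labels or OTE spans)."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))  # exact matches
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Hypothetical example: two sentences, gold vs. predicted E#A categories.
gold = [{"FOOD#QUALITY"}, {"SERVICE#GENERAL", "AMBIENCE#GENERAL"}]
pred = [{"FOOD#QUALITY"}, {"SERVICE#GENERAL"}]
print(micro_f1(gold, pred))  # 2 matches, 2 predicted, 3 gold -> 0.8
```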


SemEval 16 Task 5

Dataset Categories: Restaurants, laptops, hotels, mobile phones, digital cameras, museums, telecommunications. In the third year of the task, 19 training datasets and 20 testing datasets were provided across 8 languages (Arabic, Chinese, Dutch, English, French, Russian, Spanish, and Turkish) and 7 domains. Of these datasets, 25 are designed for sentence-level ABSA and 14 for text-level ABSA.

Subtasks Description

  1. Subtask 1: Sentence-Level ABSA. Given a sentence containing an opinion about a targeted entity, the goal is to identify opinion tuples with the same three slots as in SemEval 15: aspect category (E#A pair), opinion target expression (OTE), and sentiment polarity.

  2. Subtask 2: Text-Level ABSA. Given a review, the goal is to identify a set of {category, polarity} tuples that summarize the opinions expressed in the whole text (a simple aggregation sketch follows this list). Example: "The so-called laptop runs too slow and I hate it! Do not buy it! It is the worst laptop ever." → {cat: "laptop#general", pol: "negative"}, {cat: "laptop#operation_performance", pol: "negative"}

  3. Subtask 3: Out-of-Domain ABSA. Participants could test their models on domains for which no prior training data was available.
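
One naive way to derive the text-level tuples of Subtask 2 from sentence-level output is to take the union of the predicted categories over a review's sentences and give each category its majority polarity. The sketch below only illustrates that idea; it is not the official SemEval-16 aggregation method:

```python
from collections import Counter

def text_level(sentence_opinions):
    """Aggregate sentence-level (category, polarity) pairs into one
    text-level tuple per category, using the majority polarity.
    A naive baseline, not the official SemEval-16 method."""
    by_category = {}
    for category, polarity in sentence_opinions:
        by_category.setdefault(category, Counter())[polarity] += 1
    return {cat: counts.most_common(1)[0][0] for cat, counts in by_category.items()}

# Hypothetical sentence-level output for the laptop review in the example above.
opinions = [("laptop#general", "negative"),
            ("laptop#operation_performance", "negative"),
            ("laptop#general", "negative")]
print(text_level(opinions))
# {'laptop#general': 'negative', 'laptop#operation_performance': 'negative'}
```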

Dataset Overview

Data Annotation Methods: The annotated data in each language was prepared by native-speaker researchers in the field.