MElkamhawy / PlaceboAffect


PlaceboAffect

This repository hosts the work of our team, PlaceboAffect, for the LING 573 course at UW Seattle.

Project Overview

We have developed an affect recognition system for SemEval-2019 Task 5, which focuses on identifying hate speech targeting immigrants and women in English and Spanish tweets. The system performs binary classification using Word2Vec word embeddings and a Support Vector Machine (SVM) classifier. To improve performance, we have added lexical features such as n-grams and sentiment scores. For the Spanish tweets, we use a translation-based approach that is integrated into the existing pipeline.
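As a rough illustration of how embedding and n-gram features might be combined into one vector per tweet, here is a minimal sketch. It is not the project's actual code: the tiny embedding table stands in for a trained Word2Vec model, the whitespace tokenizer and the function names are hypothetical, and the sentiment scores and the SVM classifier are left out.

```python
from collections import Counter

# Toy 3-dimensional embedding table standing in for a trained Word2Vec model.
# The words and values are made up for illustration only.
TOY_EMBEDDINGS = {
    "go": [0.1, 0.3, -0.2],
    "home": [0.4, -0.1, 0.0],
    "welcome": [-0.3, 0.2, 0.5],
}
EMBEDDING_DIM = 3


def tokenize(text):
    """Lowercase and split on whitespace (a real system would use a tweet-aware tokenizer)."""
    return text.lower().split()


def mean_embedding(tokens):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    vectors = [TOY_EMBEDDINGS[t] for t in tokens if t in TOY_EMBEDDINGS]
    if not vectors:
        return [0.0] * EMBEDDING_DIM
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def ngram_counts(tokens, n, vocabulary):
    """Count how often each n-gram of a fixed vocabulary occurs in the token list."""
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return [float(grams[g]) for g in vocabulary]


def featurize(text, bigram_vocabulary):
    """Concatenate the mean embedding with bigram counts into one feature vector."""
    tokens = tokenize(text)
    return mean_embedding(tokens) + ngram_counts(tokens, 2, bigram_vocabulary)


# Hypothetical bigram vocabulary; in practice it would be built from the training data.
features = featurize("Go home", [("go", "home"), ("welcome", "home")])
print(len(features))  # 3 embedding dimensions + 2 bigram counts
```

A vector like this is what would be passed to the classifier; a fixed vocabulary keeps the feature dimensions identical across tweets.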

Folder Structure

The project follows a structured folder organization to store data, models, outputs, results, scripts, and source code. Here is an overview of the folder structure:

├── data
│   ├── dev
│   │   ├── en
│   │   ├── es
│   │   └── es2en
│   ├── test
│   │   ├── en
│   │   ├── es
│   │   └── es2en
│   └── train
│       ├── en
│       ├── es
│       └── es2en
├── doc
├── models
│   ├── D2
│   ├── D3
│   └── D4
│       ├── adaptation
│       └── primary
├── outputs
│   ├── D2
│   ├── D3
│   └── D4
│       ├── adaptation
│       │   ├── devtest
│       │   └── evaltest
│       └── primary
│           ├── devtest
│           └── evaltest
├── results
│   ├── D2
│   ├── D3
│   └── D4
│       ├── adaptation
│       │   ├── devtest
│       │   └── evaltest
│       └── primary
│           ├── devtest
│           └── evaltest
├── scripts
├── setup
└── src
    ├── configs
    ├── features
    └── modeling
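
The layout above can be addressed programmatically. The following sketch builds relative paths that match the tree. The helper name `artifact_dir` and its parameters are hypothetical and do not come from the project's source code.

```python
from pathlib import PurePosixPath

KINDS = {"models", "outputs", "results"}
TASKS = {"primary", "adaptation"}
SPLITS = {"devtest", "evaltest"}


def artifact_dir(kind, deliverable, task=None, split=None):
    """Build a relative directory path such as outputs/D4/primary/devtest."""
    if kind not in KINDS:
        raise ValueError(f"unknown artifact kind: {kind}")
    path = PurePosixPath(kind) / deliverable
    if task is None:
        if split is not None:
            raise ValueError("a split requires a task")
        return path
    if task not in TASKS:
        raise ValueError(f"unknown task: {task}")
    path = path / task
    if split is not None:
        # In the tree above, only outputs and results have devtest/evaltest subfolders.
        if kind == "models":
            raise ValueError("models are not split into devtest/evaltest")
        if split not in SPLITS:
            raise ValueError(f"unknown split: {split}")
        path = path / split
    return path


print(artifact_dir("outputs", "D4", "primary", "devtest"))  # outputs/D4/primary/devtest
print(artifact_dir("models", "D2"))  # models/D2
```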

Setup

To set up the project environment, follow the steps below:

  1. Navigate to the "setup" folder using the command line:

    cd setup

  2. Change the permission of the create_env.sh script to make it executable:

    chmod +x create_env.sh

  3. Run the create_env.sh script to create the conda environment:

    ./create_env.sh

  4. Activate the newly created environment:

    conda activate PlaceboAffect

Components

The project consists of the following components:

Scripts

Configs

The project includes an individual config file for each model, which lets you enable or disable the specific features that are relevant to that model. These config files can be found in the src/configs directory. By modifying the config file for a particular model, you can control the features and settings used during training and testing.
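
To illustrate the idea of per-model feature toggles, here is a minimal sketch that reads a config and lists the enabled features. The JSON format, the key names (`model`, `features`), and the function `enabled_features` are assumptions made for this example. The project's real config files may use a different format and different keys.

```python
import json

# Hypothetical config contents; the real files in src/configs may look different.
EXAMPLE_CONFIG = """
{
  "model": "svm",
  "features": {
    "word2vec": true,
    "ngrams": true,
    "sentiment": false
  }
}
"""


def enabled_features(config_text):
    """Return the sorted names of all features switched on in a JSON config string."""
    config = json.loads(config_text)
    return sorted(name for name, is_on in config.get("features", {}).items() if is_on)


print(enabled_features(EXAMPLE_CONFIG))  # ['ngrams', 'word2vec']
```

Under this assumed format, flipping `sentiment` to `true` would add sentiment scores to the feature set without any code change.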

Outputs

For each model, the system generates separate outputs for the primary task and the adaptation task, depending on which task is specified. Both the devtest and evaltest outputs are generated automatically for each task. For example, for the BOW Model (baseline), the specific files generated include:

For other models, please refer to the models, outputs, and results directories, which contain the model files, output files, and result files corresponding to each model.

License

Distributed under the Apache License 2.0. See LICENSE for more information.

Authors