hafizhassaan / Roman-Urdu-Toxic-Comments

Data and source code for Roman Urdu Toxic Comment Classification

Roman-Urdu-Toxic-Comments

This repository contains the dataset and experimental code for the work described in the paper "Roman Urdu Toxic Comment Classification".

Dataset (RUT Corpus)

The labeled Roman Urdu Toxic Comment Corpus can be accessed here

Word Embeddings

Pre-trained word embeddings used in the experiments can be found here

Requirement(s)

The code is implemented in Python 3.6 and requires Keras and TensorFlow.
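
Before running the experiments, it may help to confirm that the environment matches the requirement above. The following is a minimal sketch, not part of the repository: the helper names `missing_packages` and `python_matches` are hypothetical, and the import names `keras` and `tensorflow` are assumed to match the installed distributions. No specific package versions are checked, since none are stated here.

```python
import importlib.util
import sys

# Packages named in the requirement above; the import names are an assumption.
REQUIRED_PACKAGES = ("keras", "tensorflow")


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_matches(major=3, minor=6):
    """Return True if the running interpreter is Python `major`.`minor`."""
    return sys.version_info[:2] == (major, minor)


if __name__ == "__main__":
    if not python_matches():
        print("Note: the code was written for Python 3.6; running",
              "{}.{}".format(*sys.version_info[:2]))
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

The check only looks up whether each package can be found, without importing it, so it stays fast even when TensorFlow is installed.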

Citing Our Work

@article{Saeed2021_RUTox,
  title={Roman Urdu toxic comment classification},
  author={Saeed, Hafiz Hassaan and Ashraf, Muhammad Haseeb and Kamiran, Faisal and Karim, Asim and Calders, Toon},
  journal={Language Resources and Evaluation},
  pages={1--26},
  publisher={Springer},
  year={2021},
  month={Jan},
  day={29},
  issn={1574-0218},
  doi={10.1007/s10579-021-09530-y},
  url={https://doi.org/10.1007/s10579-021-09530-y}
}