EngSalem / TextClassification_Off_the_shelf

Text Classification off the shelf library

This is a simple text classification library built on Keras. It also includes some Arabic text normalization utilities.
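
As an illustration of what Arabic text normalization typically involves, here is a minimal sketch using only the Python standard library. It is not the library's own implementation: the function name `normalize_arabic` and the exact set of rules (stripping diacritics and tatweel, unifying alef variants, mapping alef maqsura to yeh and teh marbuta to heh) are assumptions chosen because they are common preprocessing steps for Arabic text.

```python
import re

# Assumed rule set, not taken from the library's source.
# Arabic diacritics: fathatan (U+064B) through sukun (U+0652), plus superscript alef (U+0670).
_DIACRITICS = re.compile("[\u064B-\u0652\u0670]")
# Alef with madda, alef with hamza above, alef with hamza below.
_ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625]")
_TATWEEL = "\u0640"  # elongation character (kashida)


def normalize_arabic(text: str) -> str:
    """Hypothetical helper: apply common Arabic orthographic normalizations."""
    text = _DIACRITICS.sub("", text)          # drop short-vowel marks
    text = text.replace(_TATWEEL, "")         # drop elongation characters
    text = _ALEF_VARIANTS.sub("\u0627", text)  # unify alef variants to bare alef
    text = text.replace("\u0649", "\u064A")   # alef maqsura -> yeh
    text = text.replace("\u0629", "\u0647")   # teh marbuta -> heh
    return text
```

Normalization of this kind reduces vocabulary sparsity before tokenization, at the cost of merging a few words that differ only in the normalized characters.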

Currently Implemented Models:

1- Word-level CNN based on: "Convolutional Neural Networks for Sentence Classification" url: http://www.aclweb.org/anthology/D14-1181

2- Word-level C-LSTM based on: "A C-LSTM Neural Network for Text Classification" url: https://arxiv.org/pdf/1511.08630.pdf

3- Recurrent Network and its variants (BiLSTM, LSTM, GRU, BiGRU, Attention-BiLSTM)

4- Models implemented but not currently supported in the options (Attention-LSTM, Attention-BiGRU).

5- Not yet tested: character-level CNN.
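
To make the first model in the list concrete, here is a minimal Keras sketch of a word-level CNN in the style of the paper cited in item 1: parallel convolutions over word embeddings with several window sizes, max-over-time pooling, dropout, and a softmax layer. It is not this library's code. The function name `build_word_cnn` and all parameter names are hypothetical; the defaults for window sizes (3, 4, 5), 100 filters per size, and dropout of 0.5 follow the paper, while the vocabulary size, sequence length, and embedding dimension are placeholders.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_word_cnn(vocab_size=20000, seq_len=100, embed_dim=128,
                   filter_sizes=(3, 4, 5), num_filters=100,
                   num_classes=2, dropout=0.5):
    """Hypothetical builder for a multi-window word-level CNN classifier."""
    # Input: a fixed-length sequence of integer word ids.
    tokens = keras.Input(shape=(seq_len,), dtype="int32")
    embedded = layers.Embedding(vocab_size, embed_dim)(tokens)

    # One convolution branch per window size, each followed by
    # max-over-time pooling so every filter yields a single feature.
    pooled = []
    for size in filter_sizes:
        conv = layers.Conv1D(num_filters, size, activation="relu")(embedded)
        pooled.append(layers.GlobalMaxPooling1D()(conv))

    features = layers.Concatenate()(pooled)
    features = layers.Dropout(dropout)(features)
    probs = layers.Dense(num_classes, activation="softmax")(features)

    model = keras.Model(inputs=tokens, outputs=probs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Calling `build_word_cnn(num_classes=4).summary()` prints the layer structure. Training would then use `model.fit` on padded integer sequences with integer class labels.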

Requirements

General Usage:

Options details

Note: the final model score is written to a file named name_of_model_score, which contains both the dev and test scores.

Example Project (Arabic Dialect Identification with Deep Models)

@inproceedings{mageedYouTweet2018,
  title={You Tweet What You Speak: A City-Level Dataset of Arabic Dialects},
  author={Abdul-Mageed, Muhammad and Alhuzali, Hassan and Elaraby, Mohamed},
  booktitle={LREC},
  pages={3653--3659},
  year={2018}
}