
CNN-for-Sentence-Classification-in-Chainer

An unofficial implementation of Yoon Kim's "Convolutional Neural Networks for Sentence Classification" in Chainer, with hyperparameter optimization via Optuna.

Abstract (from arXiv): We report on a series of experiments with convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks. We show that a simple CNN with little hyperparameter tuning and static vectors achieves excellent results on multiple benchmarks. Learning task-specific vectors through fine-tuning offers further gains in performance. We additionally propose a simple modification to the architecture to allow for the use of both task-specific and static vectors. The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, which include sentiment analysis and question classification.
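
The architecture is small: word embeddings, parallel convolutions with several filter widths, max-over-time pooling, dropout, and a linear classifier. Below is a minimal Chainer sketch of that architecture; the class name and hyperparameters are illustrative, not this repo's actual code.

import chainer
import chainer.functions as F
import chainer.links as L

class KimCNN(chainer.Chain):
    """Sketch of the Kim (2014) architecture: parallel convolutions over
    word embeddings, max-over-time pooling, dropout, linear classifier."""

    def __init__(self, n_vocab, n_classes, emb_dim=300,
                 filter_sizes=(3, 4, 5), n_filters=100):
        super(KimCNN, self).__init__()
        with self.init_scope():
            self.embed = L.EmbedID(n_vocab, emb_dim)
            self.convs = chainer.ChainList(*[
                L.Convolution2D(1, n_filters, ksize=(w, emb_dim))
                for w in filter_sizes])
            self.fc = L.Linear(n_filters * len(filter_sizes), n_classes)

    def __call__(self, xs):
        # xs: (batch, max_len) int32 word ids
        e = self.embed(xs)                        # (batch, len, emb_dim)
        e = F.expand_dims(e, axis=1)              # add channel: (batch, 1, len, emb_dim)
        hs = []
        for conv in self.convs:
            c = F.relu(conv(e))                   # (batch, n_filters, T, 1)
            hs.append(F.max(c, axis=2)[:, :, 0])  # max-over-time: (batch, n_filters)
        h = F.dropout(F.concat(hs, axis=1), ratio=0.5)
        return self.fc(h)                         # (batch, n_classes)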

Requirements

Chainer and Optuna (for hyperparameter optimization). The word2vec-based models (CNN_static, CNN_non_static, CNN_multi_ch) additionally need pre-trained word vectors; see data.embed() below.

Sample Texts for Classification

Sample texts come from the Cornell movie review dataset.

# data location
data/
    |_pos/
    |    |_cv000_01.txt
    |    |_cv000_02.txt
    |      :
    |_neg/
        |_cv000_01.txt
        |_cv000_02.txt
           :

Or download the data like this:

cd data
wget http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz -O imdb.tar.gz
tar -xf imdb.tar.gz

Sample Classification Demo

Load the data:

import data_builder

data = data_builder.load_imdb_data()

data.get_info()
Data Info imdb
------------------------------
Vocab: 18744
Sentences: 10662
------------------------------
x_train: (5331, 1, 53)
x_test: (5331, 1, 53)
y_train: (5331,)
y_test: (5331,)
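
x_train and x_test are padded integer word-id arrays with a singleton channel dimension, shaped (N, 1, max_len). A hypothetical sketch of how such arrays could be built (the repo's data_builder may differ):

import numpy as np

def to_padded_ids(sentences, vocab, max_len=53, pad_id=0):
    # sentences: list of tokenized sentences; vocab: word -> id mapping
    x = np.full((len(sentences), 1, max_len), pad_id, dtype=np.int32)
    for i, words in enumerate(sentences):
        ids = [vocab[w] for w in words[:max_len]]
        x[i, 0, :len(ids)] = ids  # left-aligned ids, pad_id fills the rest
    return x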

Then train:

import cnnsc

clf = cnnsc.sample_train(data, model_type="CNN_rand")

Results:

epoch       elapsed_time  main/loss   validation/main/loss  main/accuracy  validation/main/accuracy
1           129.698       0.701306    0.692502              0.518555       0.497461                  
2           256.704       0.659536    0.689695              0.648438       0.550195                  
3           382.597       0.614391    0.688409              0.708333       0.532812                  
4           513.592       0.542008    0.687492              0.810547       0.566992                  
5           638.055       0.427749    0.695113              0.917969       0.516992             
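
The log above is the standard Chainer Trainer report. A minimal sketch of a trainer setup that produces those columns (function and argument names are assumptions, not sample_train's actual internals):

import chainer.links as L
from chainer import iterators, optimizers, training
from chainer.training import extensions

def train_sketch(model, train, test, batchsize=64, n_epoch=5):
    clf = L.Classifier(model)  # adds softmax cross-entropy loss and accuracy
    opt = optimizers.Adam()
    opt.setup(clf)

    train_iter = iterators.SerialIterator(train, batchsize)
    test_iter = iterators.SerialIterator(test, batchsize,
                                         repeat=False, shuffle=False)

    updater = training.StandardUpdater(train_iter, opt)
    trainer = training.Trainer(updater, (n_epoch, 'epoch'))
    trainer.extend(extensions.Evaluator(test_iter, clf))
    trainer.extend(extensions.LogReport())
    trainer.extend(extensions.PrintReport(
        ['epoch', 'elapsed_time', 'main/loss', 'validation/main/loss',
         'main/accuracy', 'validation/main/accuracy']))
    trainer.run()
    return clf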

The other models need word2vec embeddings. Call embed() before training:

data.embed()

clf = cnnsc.sample_train(data, model_type="CNN_static")
# or
clf = cnnsc.sample_train(data, model_type="CNN_non_static")
# or
clf = cnnsc.sample_train(data, model_type="CNN_multi_ch")

[WIP] Usage

To use your own data:

import cnnsc, data_builder

data = data_builder.Data("DATANAME", "LIST-OF-FILEPATH", "LABELS").load()
dataset = data.get_chainer_dataset()

clf = cnnsc.train(dataset=dataset)
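
For example, with the pos/neg layout shown above (a hypothetical filling of the placeholders; Data is assumed to take a name, a list of file paths, and a parallel list of labels):

import glob
import cnnsc, data_builder

pos = sorted(glob.glob("data/pos/*.txt"))
neg = sorted(glob.glob("data/neg/*.txt"))

# label positive reviews 1 and negative reviews 0
data = data_builder.Data("movie-reviews", pos + neg,
                         [1] * len(pos) + [0] * len(neg)).load()
dataset = data.get_chainer_dataset()
clf = cnnsc.train(dataset=dataset)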