Add support for a convnet/pooling model

wpm / mycroft

Text classifier

MIT License

Add support for a convnet/pooling model #17

Closed: wpm closed this issue 7 years ago

wpm commented 7 years ago

Another kind of neural model for sequences.

wpm commented 7 years ago

See

Kim, Y. Convolutional neural networks for sentence classification. In EMNLP, 2014.
Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., and Kuksa, P. Natural language processing (almost) from scratch. In JMLR, 2011.

Section 3.1 of Kim's paper describes his hyperparameters. For regularization he uses dropout plus a constraint on the l₂ norm of the weight vectors.
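
The l₂ term mentioned above can be sketched as a rescaling step: after a gradient update, any weight vector whose l₂ norm exceeds a threshold is scaled back down to that threshold. This is a minimal sketch, not code from this repository. The function name `clip_l2_norm` is hypothetical, and the default threshold of 3 is an assumption taken from my reading of the paper's hyperparameter section.

```python
import math


def clip_l2_norm(weights, max_norm=3.0):
    """Rescale `weights` so its l2 norm is at most `max_norm` (assumed default)."""
    norm = math.sqrt(sum(w * w for w in weights))
    if norm <= max_norm:
        return list(weights)  # already within the constraint, leave unchanged
    scale = max_norm / norm
    return [w * scale for w in weights]


small = clip_l2_norm([1.0, 2.0])  # norm is about 2.24, so nothing changes
large = clip_l2_norm([0.0, 6.0])  # norm is 6, so the vector is halved
print(small, large)
```

Unlike a weight-decay penalty added to the loss, this constraint only acts on vectors that have grown past the threshold.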

wpm commented 7 years ago

From the Keras Slack channel:

I'm trying to reproduce an experiment (Yoon Kim, 2014, "Convolutional Neural Networks for Sentence Classification") that used a 1D convnet/max-pooling strategy over the words in a sentence. In that experiment the convolution filters are sliding windows over consecutive tokens, and the model was run with window sizes 3, 4, and 5.

How would I build an equivalent model with multiple window sizes in Keras? Do I have my input feed into three different Conv1D layers (or pairs of Conv1D and MaxPooling1D layers) with different kernel_size values and then concatenate the results into a single vector?

dref306

yup
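
To make the confirmed idea concrete, here is a dependency-free sketch of what the parallel branches compute: each window size gets its own filters, each filter is max-pooled over all positions, and the pooled values are concatenated into one vector. In Keras this would roughly correspond to one Conv1D layer followed by GlobalMaxPooling1D per window size, joined by a Concatenate layer. The function names, the toy input, and the all-ones filters below are illustrative assumptions, not code from this repository.

```python
def conv1d_valid(sequence, kernel):
    """Apply one filter to every window of len(kernel) consecutive token vectors."""
    k = len(kernel)
    activations = []
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        total = sum(
            w * x
            for w_vec, x_vec in zip(kernel, window)
            for w, x in zip(w_vec, x_vec)
        )
        activations.append(max(0.0, total))  # ReLU; bias omitted for brevity
    return activations


def multi_window_features(sequence, filters_by_size):
    """Max-pool each filter over time, then concatenate across window sizes."""
    features = []
    for size in sorted(filters_by_size):
        for kernel in filters_by_size[size]:
            features.append(max(conv1d_valid(sequence, kernel)))
    return features


# Toy input: 6 tokens with 2-dimensional embeddings (made-up values).
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [2.0, 0.0], [0.0, 1.0]]
# One all-ones filter per window size; a real model would learn many per size.
filters_by_size = {k: [[[1.0, 1.0]] * k] for k in (3, 4, 5)}

features = multi_window_features(sentence, filters_by_size)
print(features)  # [4.0, 5.0, 6.0]
```

Because every filter is reduced to a single number, the concatenated vector has a fixed length (number of window sizes times filters per size) regardless of sentence length, which is what lets a single dense layer classify it.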

wpm commented 7 years ago

Here is a model that isn't working right now:

Convolutional text sequence classifier: 2 labels, 100 filters, kernel size 3, pool factor 4, dropout rate 0.50
Text sequence embedder: core_web_sm, embedding matrix (20000, 300)

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (None, 22, 300)           6000000   
_________________________________________________________________
convolution (Conv1D)         (None, 20, 100)           90100     
_________________________________________________________________
pooling (MaxPooling1D)       (None, 5, 100)            0         
_________________________________________________________________
softmax (Dense)              (None, 5, 2)              202       
_________________________________________________________________
dropout (Dropout)            (None, 5, 2)              0         
=================================================================
Total params: 6,090,302.0
Trainable params: 90,302.0
Non-trainable params: 6,000,000.0
_________________________________________________________________
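
The shapes and parameter counts in the summary above can be checked by hand. The sketch below reproduces them from the stated configuration, assuming "valid" padding with stride 1 for the convolution and non-overlapping pooling windows, which is what the reported shapes imply. The helper names are hypothetical.

```python
def conv1d_output_length(input_length, kernel_size):
    # "valid" padding, stride 1 (assumed from the reported shapes)
    return input_length - kernel_size + 1


def max_pool1d_output_length(input_length, pool_size):
    # non-overlapping windows; any remainder is dropped
    return input_length // pool_size


vocab_size, embed_dim = 20000, 300
seq_len, filters, kernel_size, pool_size, labels = 22, 100, 3, 4, 2

conv_len = conv1d_output_length(seq_len, kernel_size)      # 20
pool_len = max_pool1d_output_length(conv_len, pool_size)   # 5

embedding_params = vocab_size * embed_dim                   # 6,000,000 (frozen)
conv_params = kernel_size * embed_dim * filters + filters   # 90,100
dense_params = filters * labels + labels                    # 202

trainable = conv_params + dense_params                      # 90,302
total = embedding_params + trainable                        # 6,090,302
print(conv_len, pool_len, trainable, total)
```

The arithmetic matches the summary, so the layers are wired as configured. One thing the shapes suggest, though the thread does not confirm it as the cause of the failure: the Dense layer receives a 3-D tensor and therefore emits (None, 5, 2), one label distribution per pooled position rather than one per sequence, and dropout is applied after the softmax rather than before it.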