wpm / mycroft

Text classifier
MIT License

Add support for a convnet/pooling model #17

Closed by wpm 7 years ago

wpm commented 7 years ago

Add a convolution/pooling network as another kind of neural model for sequences.

wpm commented 7 years ago

See

Section 3.1 of Kim's paper describes his hyperparameters. He also uses L2 regularization.

wpm commented 7 years ago

From the Keras Slack channel:

Me

I'm trying to reproduce an experiment in which someone used a 1D convnet/max-pooling strategy over words in a sentence (Yoon Kim, 2014, "Convolutional Neural Networks for Sentence Classification"). In that experiment, the convolution filters represent sliding windows over consecutive tokens. They ran a model with window sizes 3, 4, and 5.

How would I build an equivalent model with multiple window sizes in Keras? Do I have my input feed into three different Conv1D layers (or pairs of Conv1D and MaxPooling1D layers) with different kernel_size values and then concatenate the results into a single vector?

dref306

yup
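To make the confirmed layout concrete, here is a minimal plain-Python sketch of the computation: one convolution branch per window size, max pooling over the whole sequence in each branch, then concatenation into a single vector. In Keras this would roughly correspond to three Conv1D layers with different kernel_size values, each followed by GlobalMaxPooling1D, joined by Concatenate. The function names, the toy dimensions (8-dimensional embeddings, 4 filters per branch), and the random weights are all assumptions for illustration, not code from this repository.

```python
import random


def conv1d_valid(seq, filters, bias):
    """Slide each filter over `seq` without padding and apply ReLU.

    seq:     list of T token vectors, each of length D
    filters: list of F filters, each a k x D nested list
    returns: (T - k + 1) rows, each with F activations
    """
    k = len(filters[0])
    out = []
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        row = []
        for f, b in zip(filters, bias):
            s = b + sum(w * x
                        for w_row, x_row in zip(f, window)
                        for w, x in zip(w_row, x_row))
            row.append(max(0.0, s))  # ReLU
        out.append(row)
    return out


def max_over_time(feature_maps):
    """Keep the largest activation of each filter across all positions."""
    return [max(column) for column in zip(*feature_maps)]


def multi_window_features(seq, branches):
    """Run every (filters, bias) branch and concatenate the pooled outputs."""
    features = []
    for filters, bias in branches:
        features.extend(max_over_time(conv1d_valid(seq, filters, bias)))
    return features


def random_branch(rng, kernel_size, embed_dim, n_filters):
    """Hypothetical helper: random weights for one window size."""
    filters = [[[rng.uniform(-0.1, 0.1) for _ in range(embed_dim)]
                for _ in range(kernel_size)]
               for _ in range(n_filters)]
    return filters, [0.0] * n_filters


rng = random.Random(0)
embed_dim, n_filters = 8, 4  # toy sizes, not the 300 / 100 used in the issue
sentence = [[rng.uniform(-1, 1) for _ in range(embed_dim)] for _ in range(22)]
branches = [random_branch(rng, k, embed_dim, n_filters) for k in (3, 4, 5)]

features = multi_window_features(sentence, branches)
print(len(features))  # 12: 3 window sizes x 4 filters each
```

Because each branch is pooled over the full sequence, the concatenated vector has a fixed length (number of window sizes times filters per window) regardless of sentence length, which is what lets a single dense classifier sit on top.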

wpm commented 7 years ago

Here is a model that isn't working right now:

Convolutional text sequence classifier: 2 labels, 100 filters, kernel size 3, pool factor 4, dropout rate 0.50
Text sequence embedder: core_web_sm, embedding matrix (20000, 300)

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (None, 22, 300)           6000000   
_________________________________________________________________
convolution (Conv1D)         (None, 20, 100)           90100     
_________________________________________________________________
pooling (MaxPooling1D)       (None, 5, 100)            0         
_________________________________________________________________
softmax (Dense)              (None, 5, 2)              202       
_________________________________________________________________
dropout (Dropout)            (None, 5, 2)              0         
=================================================================
Total params: 6,090,302.0
Trainable params: 90,302.0
Non-trainable params: 6,000,000.0
_________________________________________________________________
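The shapes and parameter counts in this summary can be checked by hand. The short sketch below redoes the arithmetic, assuming no padding in the convolution and non-overlapping pooling windows (the Keras defaults); the helper names are made up for illustration. Reading the result, one possible reason the model misbehaves (an inference from the printed shapes, not something stated in this thread) is that the Dense layer is applied at each of the 5 pooled positions, so the output is 5 x 2 per sentence rather than a single 2-way prediction. Kim's paper pools over the whole sequence (max-over-time), which would remove the sequence axis before the classifier.

```python
def conv1d_out_len(seq_len, kernel_size):
    """Output length of a 1D convolution with no padding and stride 1."""
    return seq_len - kernel_size + 1


def pool_out_len(seq_len, pool_size):
    """Output length of max pooling with non-overlapping windows."""
    return seq_len // pool_size


def conv1d_params(kernel_size, in_dim, filters):
    """Weights plus one bias per filter."""
    return kernel_size * in_dim * filters + filters


def dense_params(in_dim, units):
    """Weights plus one bias per unit."""
    return in_dim * units + units


# Values taken from the summary in this comment.
vocab, embed_dim, seq_len = 20000, 300, 22
filters, kernel_size, pool_size, labels = 100, 3, 4, 2

embedding_p = vocab * embed_dim                          # 6,000,000 (frozen)
conv_len = conv1d_out_len(seq_len, kernel_size)          # 20
pool_len = pool_out_len(conv_len, pool_size)             # 5
conv_p = conv1d_params(kernel_size, embed_dim, filters)  # 90,100
dense_p = dense_params(filters, labels)                  # 202

trainable = conv_p + dense_p                             # 90,302
total = embedding_p + trainable                          # 6,090,302

# Dense acts on the last axis only, so the sequence axis survives:
dense_output_shape = (pool_len, labels)                  # (5, 2) per sentence
print(conv_len, pool_len, trainable, total, dense_output_shape)
```

If the pooling step covered all 20 convolution positions instead of windows of 4, the classifier input would be a single 100-dimensional vector and the output a single pair of label scores.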