keras-team / keras


Multi-channel architecture in keras #1308

Status: Closed (sebastianruder closed this issue 8 years ago)

sebastianruder commented 8 years ago

I am trying to implement the multi-channel CNN architecture that uses two word embedding channels proposed in [1].

From the paper: "In the multi-channel architecture [...], each filter is applied to both channels and the results are added to calculate (the feature generated by each filter for a window)."

As I understand it, we slide each filter over the windows of each channel. For each channel, this produces an output vector of length n - h + 1, where n is the sequence length and h is the window size. Is this correct?
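To make the shapes concrete, here is a minimal pure-Python sketch of that reading: one filter slides over each channel with "valid" borders, and the per-channel results are added position by position. The function names and the toy numbers are illustrative assumptions, not taken from the paper or from Keras, and the bias term and non-linearity are left out.

```python
def conv1d_valid(channel, filt):
    """Slide filt (h rows of k weights) over channel (n rows of k values).

    Returns one value per window, so the result has n - h + 1 entries.
    """
    n, h = len(channel), len(filt)
    out = []
    for i in range(n - h + 1):
        window = channel[i:i + h]
        out.append(sum(w * x
                       for f_row, x_row in zip(filt, window)
                       for w, x in zip(f_row, x_row)))
    return out


def multichannel_feature_map(channels, filt):
    """Apply the same filter to every channel and add the results position-wise."""
    per_channel = [conv1d_valid(ch, filt) for ch in channels]
    return [sum(vals) for vals in zip(*per_channel)]


# Toy data (hypothetical): n = 5 words, k = 2 embedding dimensions, h = 3.
static_channel = [[1, 0], [0, 1], [1, 1], [2, 0], [0, 2]]
tuned_channel = [[0, 1], [1, 0], [1, 1], [0, 2], [2, 0]]
filt = [[1, 0], [0, 1], [1, 1]]

feature_map = multichannel_feature_map([static_channel, tuned_channel], filt)
print(len(feature_map))  # n - h + 1 = 3
print(feature_map)       # [6, 7, 8]
```

Adding the channels' outputs keeps the feature map at length n - h + 1 per filter, regardless of how many channels there are.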

My question is how to implement this in Keras so that the same convolution (the same filter weights) is applied to the windows at each position in every channel.

My intuitive approach was to use a Graph network that takes the channels stored in inputs and applies a convolution to them:

graph.add_node(Convolution1D(nb_filter=nb_filters, filter_length=7, border_mode="valid",
                             activation="relu", subsample_length=1),
               name='conv', inputs=inputs, merge_mode='concat')

However, this doesn't seem to match my understanding of the architecture above, where the values are added after the convolution rather than merged before it.
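The difference can be checked numerically. The sketch below assumes that merge_mode='concat' joins the channels along the embedding (last) axis; under that assumption, convolving the concatenated input is the same as applying a separate filter to each channel and adding the results, and it only coincides with the paper's shared-filter version when both halves of the filter are tied to the same weights. All names and numbers are hypothetical, and bias and activation are omitted.

```python
def conv1d_valid(channel, filt):
    """Valid 1-D convolution of an (n x k) channel with an (h x k) filter."""
    n, h = len(channel), len(filt)
    return [sum(w * x
                for f_row, x_row in zip(filt, channel[i:i + h])
                for w, x in zip(f_row, x_row))
            for i in range(n - h + 1)]


def concat_last_axis(a, b):
    """Concatenate two row-aligned matrices along the feature axis."""
    return [row_a + row_b for row_a, row_b in zip(a, b)]


def add(u, v):
    """Element-wise sum of two equally long vectors."""
    return [x + y for x, y in zip(u, v)]


chan_a = [[1, 0], [0, 1], [1, 1], [2, 0], [0, 2]]
chan_b = [[0, 1], [1, 0], [1, 1], [0, 2], [2, 0]]
w_a = [[1, 0], [0, 1], [1, 1]]
w_b = [[2, 1], [0, 0], [1, 0]]

# Concatenate, then convolve with a filter spanning both channels ...
concat_out = conv1d_valid(concat_last_axis(chan_a, chan_b),
                          concat_last_axis(w_a, w_b))
# ... equals convolving each channel with its own filter and adding.
separate_out = add(conv1d_valid(chan_a, w_a), conv1d_valid(chan_b, w_b))
assert concat_out == separate_out

# The shared-filter architecture is the special case w_b == w_a.
shared_out = add(conv1d_valid(chan_a, w_a), conv1d_valid(chan_b, w_a))
tied_concat_out = conv1d_valid(concat_last_axis(chan_a, chan_b),
                               concat_last_axis(w_a, w_a))
assert shared_out == tied_concat_out
```

So concatenating before the convolution learns independent weights per channel, whereas the architecture described above reuses one set of weights for both.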

Can you help me? Thanks a lot in advance!

[1] Kim, Y. (2014). Convolutional Neural Networks for Sentence Classification. Retrieved from http://arxiv.org/abs/1408.5882

devroy73 commented 8 years ago

Hi Sebastian, have you had any luck solving this? Regards and thanks.

sebastianruder commented 8 years ago

Hey @devroy73, thanks for checking in. I ended up using a separate Merge layer instead, which seemed to work.

devroy73 commented 8 years ago

Hi Sebastian, it would be great if you could share the snippet.

burgersmoke commented 7 years ago

As @sebastianruder mentions, I believe that it can be done in a separate Merge layer. Here's a post where someone tried to do something similar by using one channel for word2vec embeddings and another for GloVe embeddings. The relevant code ("snippet") is in the article but I've not tried this yet. https://medium.com/@dsouza.amanda/multi-channel-cnn-for-text-699713aa98a7
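For anyone landing here later, below is a minimal sketch of the "add after a shared convolution" idea in the current Keras functional API. It is not the snippet from the thread or from the linked article, and it does not use the old Graph API or Merge layer. The sizes are made-up values, the inputs are assumed to be already-embedded sequences, and the bias is dropped so that reusing the layer does not add it twice.

```python
import keras
from keras import layers

# Hypothetical sizes: sequence length n, embedding size k, filters, window h.
seq_len, emb_dim, nb_filters, window = 50, 8, 4, 7

# One input per embedding channel (e.g. static and fine-tuned vectors).
static_in = layers.Input(shape=(seq_len, emb_dim), name="static_channel")
tuned_in = layers.Input(shape=(seq_len, emb_dim), name="tuned_channel")

# A single Conv1D instance, so both channels share the same filter weights.
shared_conv = layers.Conv1D(nb_filters, window, padding="valid", use_bias=False)

# Add the per-channel feature maps, then apply the non-linearity.
summed = layers.Add()([shared_conv(static_in), shared_conv(tuned_in)])
features = layers.Activation("relu")(summed)

# Max-over-time pooling gives one value per filter.
pooled = layers.GlobalMaxPooling1D()(features)

model = keras.Model([static_in, tuned_in], pooled)
model.summary()
```

Calling the same layer object on both inputs is what ties the weights; creating two Conv1D layers would instead learn one filter bank per channel.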