localminimum / QANet

A Tensorflow implementation of QANet for machine reading comprehension
MIT License

Is it better to use a conv layer rather than a dense layer in the highway network or the encoder block's feed-forward network? #20

Closed hackiey closed 6 years ago

hackiey commented 6 years ago

The authors didn't mention using a conv layer in the paper. Thanks for any reply!

localminimum commented 6 years ago

Hi @hackiey, thanks for your question. The "conv" function used here takes a "kernel_size" argument, which defaults to 1 if not specified. With a kernel_size of 1 this is a 1x1 convolution, which is equivalent to a fully connected layer applied to each position with shared weights. I use the conv function instead of a dense layer only because it is more robust to inputs of different ranks. If this sounds confusing, check out here.

The only time the "conv" function actually performs a narrow convolution is here, where the kernel_size is 5.
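
The equivalence described in this reply can be checked with a small sketch. It is written in plain Python to stay dependency-free and is not the repository's code: `dense`, `conv1d`, and all the values are hypothetical names and data chosen for illustration, and the convolution here uses no padding, which may differ from what the repository does.

```python
def dense(x, w, b):
    """Fully connected layer applied to every position of x.
    x: [length][in_dim], w: [in_dim][out_dim], b: [out_dim]."""
    return [
        [sum(row[i] * w[i][j] for i in range(len(w))) + b[j]
         for j in range(len(b))]
        for row in x
    ]


def conv1d(x, kernel, b):
    """1D convolution with stride 1 and no padding (illustrative assumption).
    x: [length][in_dim], kernel: [k][in_dim][out_dim], b: [out_dim]."""
    k = len(kernel)
    out = []
    for t in range(len(x) - k + 1):
        out.append([
            sum(x[t + d][i] * kernel[d][i][j]
                for d in range(k)
                for i in range(len(kernel[d]))) + b[j]
            for j in range(len(b))
        ])
    return out


# Toy input: 6 positions, 2 features each (made-up values).
x = [[float(t), float(t) + 0.5] for t in range(6)]
w = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]  # in_dim=2 -> out_dim=3
b = [0.1, 0.2, 0.3]

# kernel_size = 1: the kernel is a single weight matrix, so each output
# position depends only on the same input position, exactly like dense.
dense_out = dense(x, w, b)
conv_k1_out = conv1d(x, [w], b)
print(dense_out == conv_k1_out)

# kernel_size = 5: each output position now mixes 5 neighbouring inputs,
# so without padding the sequence shrinks from 6 to 2 positions.
conv_k5_out = conv1d(x, [w] * 5, b)
print(len(conv_k5_out))
```

With a kernel size of 1 the two layers compute the same thing, so choosing one over the other is a matter of convenience. Only a kernel wider than 1 makes the layer look at neighbouring positions.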