ArdalanM / nlp-benchmarks


Input layer missing activation function #3

Closed: opringle closed this issue 5 years ago

opringle commented 6 years ago

Should there not be an activation function after the first `3, Temp Conv, 64` layer? The paper does not specify, but in my own implementation I assumed every convolutional layer should be followed by batch normalization + ReLU.
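
For concreteness, here is a minimal sketch of the pattern described above, assuming PyTorch; the helper name is hypothetical and not part of this repository's code:

```python
import torch.nn as nn

def conv_bn_relu(in_channels, out_channels, kernel_size=3):
    # Hypothetical helper: every temporal convolution is followed by
    # batch normalization and ReLU, as assumed in the question.
    return nn.Sequential(
        nn.Conv1d(in_channels, out_channels, kernel_size, padding=kernel_size // 2),
        nn.BatchNorm1d(out_channels),
        nn.ReLU(inplace=True),
    )
```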

ArdalanM commented 6 years ago

Hi @opringle,

Thanks for your feedback.

The activation functions (e.g. ReLU) are mentioned fairly explicitly throughout the paper. AFAIK, the first `3, Temp Conv, 64` layer acts as an n-gram generator and is followed by the convolutional blocks.

Thus the first layers of the network are as follows:

- character embedding (lookup table),
- `3, Temp Conv, 64` (no activation),
- first convolutional block (Temp Conv, BatchNorm, ReLU).

Seeing back-to-back convolutions at the early stage of the network did not surprise me that much, as I thought it would help craft more features (convolutions) before selecting them (ReLU, pooling, etc.).
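
As a rough illustration of that ordering, here is a minimal sketch assuming PyTorch and the layer sizes from the VDCNN paper (embedding dim 16, first conv width 64); the class name and alphabet size are hypothetical and not taken from this repository:

```python
import torch.nn as nn

class VDCNNStem(nn.Module):
    """First layers only: lookup table -> 3, Temp Conv, 64 (no activation) -> first conv block."""

    def __init__(self, vocab_size=69, embed_dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)       # character lookup table
        self.first_conv = nn.Conv1d(embed_dim, 64,
                                    kernel_size=3, padding=1)  # "3, Temp Conv, 64", no BN/ReLU here
        self.block = nn.Sequential(                            # first convolutional block
            nn.Conv1d(64, 64, 3, padding=1),
            nn.BatchNorm1d(64),
            nn.ReLU(inplace=True),
            nn.Conv1d(64, 64, 3, padding=1),
            nn.BatchNorm1d(64),
            nn.ReLU(inplace=True),
        )

    def forward(self, tokens):
        x = self.embed(tokens).transpose(1, 2)  # (batch, embed_dim, seq_len)
        x = self.first_conv(x)                   # back-to-back convolutions:
        return self.block(x)                     # features are crafted before ReLU/pooling selects them
```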

Did you manage to replicate the results with your implementation of VDCNN?

Cheers!