Open shamanez opened 7 years ago
This is indeed a missing feature and I'll try to add the TF dynamic padding feature. It should definitely help performance and speed up training. However, I don't think it will increase accuracy. The inputs are reversed, so the padding doesn't affect the encoder.
What do you mean by reversed inputs? Is that about the input sequence? I've read in a paper that by keeping fixed-size inputs we can lose some information on long sequences. If our original input is only a two-word sequence, here we have to add 8 padding symbols. Doesn't that highlight some unwanted information? Could the padding feed bad information into the model?
That may be true if you add the padding to the right of your original sequence (input - padding - decoder), but I don't see why it would be the case when the padding is added to the left (padding - input - decoder).
So it's like there will be no information passed through the padding when we arrange it as (In-Pd-De), because the 0 symbols don't pass anything? Yeah, then it's correct.
Will it increase accuracy, since padding for too long can reduce information?