daniel-kukiela / nmt-chatbot

NMT Chatbot

Result of the chatbot depending on layers and nodes #94

Closed RavinSG closed 5 years ago

RavinSG commented 6 years ago

Hello, I'm currently training the chatbot on about 2M pairs. I was wondering whether anyone has compared the bot's output against how wide or deep the network is. I went through just about all the posts related to this repo, and almost all of them used only 2 layers. Is that because people have concluded that 2 layers is optimal, or have they simply not experimented with it yet?

daniel-kukiela commented 5 years ago

It seems like more layers don't help much, but I didn't do a comparison. It's more about fitting the model into VRAM during training: with more layers you'll have to use fewer tokens (a smaller dictionary) or a lower batch size.
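
As a rough illustration of that trade-off, here is a minimal sketch of two configurations one might compare. The key names (`num_layers`, `num_units`, `batch_size`, `vocab_size`) are assumed from the TensorFlow NMT-style hyperparameters this project builds on; check them against your own `setup/settings.py` before using.

```python
# Hedged sketch: trading depth against dictionary size and batch size so the
# model still fits in GPU VRAM. Key names are assumptions, not confirmed
# settings of this repo.

# Baseline: the commonly used 2-layer configuration.
hparams_2_layer = {
    'num_layers': 2,       # encoder/decoder depth
    'num_units': 512,      # width of each layer
    'batch_size': 128,
    'vocab_size': 15000,   # token dictionary size (assumed key name)
}

# Deeper model: to stay within the same VRAM budget, shrink the dictionary
# and/or the batch size to compensate for the extra layers.
hparams_4_layer = {
    'num_layers': 4,
    'num_units': 512,
    'batch_size': 64,      # halved batch
    'vocab_size': 10000,   # smaller dictionary
}
```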