maxbrenner-ai / GO-Bot-DRL

Goal-Oriented Chatbot trained with Deep Reinforcement Learning
MIT License

The rewards and the avg success rate don't improve with 4k epochs #4

Closed by stchau4work 4 years ago

stchau4work commented 4 years ago

The configuration parameters are shown below:

[Screenshot of the configuration parameters, 2020-05-24 at 6:45:10 PM]

I modified the code to add TensorBoard support (v2.1.0) and trained on Colab with a GPU:

https://github.com/stchau4work/GO-Bot-DRL/commit/7a4aaf761902632ff84fce6725f5aa33b551e821

However, the chart shows that the agent never obtains positive rewards, and the average success rate stays at zero throughout training.

[Screenshot of the training chart, 2020-05-24 at 6:43:14 PM]

Could you please review this and let me know if I am missing something?

stchau4work commented 4 years ago

What initial epsilon value (epsilon init) do you suggest using?
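
For context on the question, here is a minimal sketch of how an initial epsilon usually enters epsilon-greedy action selection in a DQN-style agent. The function names (`epsilon_greedy`, `linear_epsilon`), the parameter names, and the linear annealing schedule are all assumptions made for illustration; the issue does not say how this repository uses its epsilon setting, so this is not the repository's code.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action index with probability epsilon, else the greedy one.

    Hypothetical helper for illustration; not taken from the repository.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Greedy choice: index of the largest Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def linear_epsilon(step, eps_init, eps_min, decay_steps):
    """Linearly anneal epsilon from eps_init to eps_min over decay_steps.

    The annealing schedule is an assumption; an agent may also keep
    epsilon fixed at its initial value.
    """
    frac = min(step / decay_steps, 1.0)
    return eps_init + frac * (eps_min - eps_init)
```

With `epsilon = 0.0` the agent never takes a random action, so exploration has to come from somewhere else (for example, a warm-up phase). With a larger initial value, early episodes are dominated by random actions.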