Open luomuqinghan opened 6 years ago
Thanks for sharing this.
For the ease-of-answering reward in the RL setup, is the reward computed by the RL model itself rather than by a separate model?
Why not feed the action into a separate pretrained model to obtain the response, and then measure that model's likelihood of producing a dull response?
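For concreteness, here is a toy sketch of the kind of reward I mean: score an action by the (negative, length-normalized) log-likelihood of a fixed set of dull responses, in the style of the ease-of-answering reward r1 = -(1/|S|) Σ_{s∈S} (1/N_s) log p(s | a). The `token_log_prob` function below is a placeholder returning a constant probability; a real version would query a frozen pretrained seq2seq model instead.

```python
import math

# A small set S of dull responses (examples in the spirit of Li et al. 2016).
DULL_RESPONSES = [
    ["i", "don't", "know"],
    ["i", "have", "no", "idea"],
]

def token_log_prob(token, action, prefix):
    """Toy stand-in for log p(token | action, prefix).

    In a real setup this would be the per-token log-probability from a
    separate, frozen pretrained model conditioned on the action.
    Here every token is simply assigned probability 0.1."""
    return math.log(0.1)

def ease_of_answering_reward(action):
    """r1 = -(1/|S|) * sum over s in S of (1/N_s) * log p(s | action).

    Higher reward means the action makes dull responses less likely."""
    total = 0.0
    for s in DULL_RESPONSES:
        # Length-normalized log-likelihood of the dull response s.
        log_p = sum(token_log_prob(tok, action, s[:i])
                    for i, tok in enumerate(s))
        total += log_p / len(s)
    return -total / len(DULL_RESPONSES)

reward = ease_of_answering_reward(["how", "are", "you", "?"])
```

With the constant toy probability, every dull response scores log(0.1) per token, so the reward collapses to -log(0.1); the interesting behavior only appears once a real conditional model replaces the placeholder.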