pochih / RL-Chatbot

🤖 Deep Reinforcement Learning Chatbot
MIT License
418 stars 140 forks

Question about ease of answering #18

Open luomuqinghan opened 6 years ago

luomuqinghan commented 6 years ago

Thanks for sharing.

In the RL setup for ease of answering, is the reward computed by the RL model itself rather than by a separate model?

Why not feed the action into a separate pretrained model and use the likelihood that model assigns to a dull response as the measure?
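
To make the suggestion concrete, here is a minimal sketch of scoring an action with a separate, frozen model. It assumes the reward is the negative length-normalized log-likelihood of a set of dull responses given the action, which may not match what this repo does. The names `ease_of_answering_reward`, `scorer`, and `make_constant_scorer` are hypothetical, and the toy scorer only stands in for a real pretrained seq2seq model.

```python
import math
from typing import Callable, List, Sequence

# Hypothetical interface: a frozen, pretrained model that returns one
# log-probability per target token, conditioned on the action tokens.
Scorer = Callable[[Sequence[str], Sequence[str]], List[float]]


def ease_of_answering_reward(
    action: Sequence[str],
    dull_responses: Sequence[Sequence[str]],
    scorer: Scorer,
) -> float:
    """Negative mean of length-normalized log p(dull | action).

    Assumption: the scorer is a separate model, not the policy being
    trained, so its parameters stay fixed during RL.
    """
    total = 0.0
    for dull in dull_responses:
        token_log_probs = scorer(action, dull)
        # Normalize by response length so long dull replies are not
        # penalized merely for having more tokens.
        total += sum(token_log_probs) / len(dull)
    return -total / len(dull_responses)


def make_constant_scorer(token_prob: float) -> Scorer:
    """Toy stand-in: assigns the same probability to every target token."""
    def scorer(action: Sequence[str], target: Sequence[str]) -> List[float]:
        return [math.log(token_prob)] * len(target)
    return scorer


dull_set = [["i", "don't", "know"], ["i", "have", "no", "idea"]]
action = ["what", "do", "you", "mean", "?"]

# An action that makes dull replies likely gets a lower reward.
reward_likely_dull = ease_of_answering_reward(
    action, dull_set, make_constant_scorer(0.9)
)
reward_unlikely_dull = ease_of_answering_reward(
    action, dull_set, make_constant_scorer(0.5)
)
```

In this sketch the reward drops as the frozen model finds dull responses more likely, so the policy would be pushed toward actions that are easier to answer with something specific.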