pochih / RL-Chatbot

🤖 Deep Reinforcement Learning Chatbot

How to train reversed model for RL model #10

Open bobbercheng opened 6 years ago

bobbercheng commented 6 years ago

You mentioned: "When training with policy gradient (pg), you may need a reversed model. The reversed model is also trained by cornell movie-dialogs dataset, but with source and target reversed." Apart from downloading the pre-trained reversed model, could you please tell me how to train it?

Thank you a lot.

pochih commented 6 years ago

For instance, suppose a sample in the training set is ('How is the weather?', 'It's sunny.'). To train a reversed model, you reverse the sample into ('It's sunny.', 'How is the weather?'). That means the model learns to predict the former sentence from the latter sentence.
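
A minimal sketch of that swap in Python, assuming the training data is already loaded as a list of (source, target) string pairs. `reverse_pairs` is a hypothetical helper for illustration, not a function from this repository, and the actual data-loading and training code may look different.

```python
from typing import List, Tuple

# A dialogue sample: (source sentence, target sentence).
Pair = Tuple[str, str]


def reverse_pairs(pairs: List[Pair]) -> List[Pair]:
    """Swap source and target in every pair (hypothetical helper)."""
    return [(target, source) for source, target in pairs]


# Forward data: predict the reply from the preceding sentence.
forward = [("How is the weather?", "It's sunny.")]

# Reversed data: predict the preceding sentence from the reply.
backward = reverse_pairs(forward)
print(backward)
```

The reversed pairs can then be fed to the same training procedure used for the forward model, with no other change to the data.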