facebookresearch / ParlAI

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
https://parl.ai
MIT License
10.49k stars, 2.1k forks

Any plan for providing reinforcement learning? #244

Closed lifelongeek closed 7 years ago

lifelongeek commented 7 years ago

As far as I know, ParlAI does not currently provide a reinforcement learning dialog example.

Is anyone planning to contribute one? If so, could you share some info such as

jaseweston commented 7 years ago

There are no RL code examples right now, but ParlAI is set up to support RL. For example, we included a reward field in the action/observation message. I think these datasets in ParlAI use it (though the reward is fixed, so it is closer to imitation learning): DBLL-bAbI and DBLL-Movie.
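To make the reward field concrete, here is a minimal sketch of what such messages might look like. The field names (`text`, `labels`, `reward`, `episode_done`) follow ParlAI's message convention; the dialogue content and the `total_reward` helper are invented for illustration and are not taken from the DBLL datasets.

```python
# Hypothetical observation messages, written as plain dicts.
# Field names follow ParlAI's message convention; the values are invented.
episode = [
    {
        "text": "Mary went to the kitchen. Where is Mary?",
        "labels": ["kitchen"],
        "reward": 0,
        "episode_done": False,
    },
    {
        "text": "Yes, that's right!",
        "reward": 1,
        "episode_done": True,
    },
]


def total_reward(messages):
    """Sum the reward over one episode, treating a missing field as 0."""
    return sum(msg.get("reward", 0) for msg in messages)


print(total_reward(episode))
```

Because the reward is just another key in the message, an agent that ignores it still works unchanged, while an RL-style agent can read it when it observes the next message.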

lifelongeek commented 7 years ago

Thanks. Could you provide more details on DBLL-bAbI and DBLL-Movie?

reference document (probably the paper "Dialog-based Language Learning"?)

example script (probably: python examples/memnn_luatorch_cpu/full_task_train.py -t dbll_movie -bs 32)

jaseweston commented 7 years ago

On Thu, Aug 3, 2017 at 12:56 AM, Geonmin Kim wrote:

> Thanks. Could you provide more details on DBLL-bAbI and DBLL-Movie?
>
> reference document (probably the paper "Dialog-based Language Learning"?)

Yes. In general, you can find dataset details here: https://github.com/facebookresearch/ParlAI/blob/master/parlai/tasks/task_list.py

> example script (probably: python examples/memnn_luatorch_cpu/full_task_train.py -t dbll_movie -bs 32)

Yes, you could potentially use any model, for example:

python examples/train_model.py -m seq2seq -t dbll_babi -bs 8 -mf /tmp/model_s2s

However, we don't yet have any models that use rewards or forward prediction. Please add one if you like! We are always adding more, so expect more soon.
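As a rough idea of what a reward-using model could look like, here is a toy sketch. It is not ParlAI code and not the method from the DBLL paper: `RewardBanditAgent` is a hypothetical class that only mirrors the `observe()`/`act()` shape of a ParlAI agent, and it assumes the reward for an answer arrives in the next observed message. It keeps a running mean reward per (question, answer) pair and picks answers epsilon-greedily from `label_candidates`.

```python
import random
from collections import defaultdict


class RewardBanditAgent:
    """Toy agent that picks among label candidates and learns from rewards.

    Hypothetical sketch: it mirrors the observe()/act() shape of a ParlAI
    agent but does not import or subclass anything from ParlAI.
    """

    def __init__(self, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = defaultdict(float)  # mean reward per (question, answer)
        self.count = defaultdict(int)    # number of rewards seen per pair
        self.observation = None
        self.last_choice = None          # (question, answer) awaiting a reward

    def observe(self, observation):
        # Assumption: the reward in this message refers to the previous action.
        reward = observation.get("reward")
        if reward is not None and self.last_choice is not None:
            key = self.last_choice
            self.count[key] += 1
            self.value[key] += (reward - self.value[key]) / self.count[key]
            self.last_choice = None
        self.observation = observation
        return observation

    def act(self):
        question = self.observation.get("text", "")
        candidates = list(self.observation.get("label_candidates", []))
        if not candidates:
            return {"text": ""}
        if self.rng.random() < self.epsilon:
            # Explore: try a random candidate.
            answer = self.rng.choice(candidates)
        else:
            # Exploit: pick the candidate with the best mean reward so far.
            answer = max(candidates, key=lambda c: self.value[(question, c)])
        self.last_choice = (question, answer)
        return {"text": answer}


# Usage with an invented question and a hand-written reward signal.
agent = RewardBanditAgent(epsilon=0.0)
question = {"text": "Where is Mary?", "label_candidates": ["hallway", "kitchen"]}
for _ in range(3):
    agent.observe(question)
    reply = agent.act()
    feedback = 1 if reply["text"] == "kitchen" else -1
    agent.observe({"text": "", "reward": feedback})
    print(reply["text"], feedback)
```

A real model would generalize across questions rather than memorize them; the point here is only where the reward enters the observe/act loop.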
