Closed rzhao1 closed 4 years ago
rulebased: rule parser + rule policy + retrieval generator
hybrid: rule parser + learned policy + retrieval generator
cmd: human-controlled bot used in a command line interface
fb-neural: obsolete
pt-neural: seq2seq
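One way to keep these agent types straight is to view each as a parser/policy/generator triple. The sketch below is illustrative only: `AgentSpec`, `AGENT_TYPES`, and `differing_components` are hypothetical names, not part of the cocoa codebase, and only the rulebased and hybrid entries have all three components spelled out in the list above, so the others carry just a note.

```python
from typing import NamedTuple, Optional


class AgentSpec(NamedTuple):
    """Hypothetical summary of how an agent type is composed."""
    parser: Optional[str]
    policy: Optional[str]
    generator: Optional[str]
    note: str = ""


# Entries mirror the list in this thread; None means "not specified there".
AGENT_TYPES = {
    "rulebased": AgentSpec("rule", "rule", "retrieval"),
    "hybrid": AgentSpec("rule", "learned", "retrieval"),
    "cmd": AgentSpec(None, None, None, "human-controlled, command line interface"),
    "fb-neural": AgentSpec(None, None, None, "obsolete"),
    "pt-neural": AgentSpec(None, None, None, "seq2seq"),
}


def differing_components(a: str, b: str) -> list:
    """Return the pipeline stages in which two agent types differ."""
    spec_a, spec_b = AGENT_TYPES[a], AGENT_TYPES[b]
    stages = ("parser", "policy", "generator")
    return [s for s in stages if getattr(spec_a, s) != getattr(spec_b, s)]


print(differing_components("rulebased", "hybrid"))  # ['policy']
```

Read this way, hybrid is rulebased with only the policy swapped for a learned one.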
Thank you for the clarification! It is really helpful! Basically, I want to use the hybrid model to generate simulated dialogues. However, when I run the following command, I get the error "AttributeError: 'HybridSystem' object has no attribute 'env'".
PYTHONPATH=. python reinforce.py --schema-path data/bookhatball-schema.json \
  --scenarios-path data/train-scenarios.json \
  --valid-scenarios-path data/val-scenarios.json \
  --agent-checkpoints checkpoint/lf2lf/model_best.pt checkpoint/lf2lf/model_best.pt \
  --model-path checkpoint/lf2lf-margin \
  --optim adagrad --learning-rate 0.001 \
  --agents hybrid pt-neural \
  --report-every 500 --max-turns 20 --num-dialogues 5000 \
  --sample --temperature 0.5 --max-length 20 --reward margin \
  -- templates templates.pkl \
  --policy model.pkl
I noticed that the HybridSystem class does not seem to have a function for loading a trained policy?
The parameters are loaded through the manager object of HybridSystem, which is the learned policy. Also, I don't think the command would run, because you cannot back-propagate through hybrid.
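To illustrate the idea of a policy living inside a manager object, here is a toy sketch. Everything in it is an assumption for illustration: `ToyManager`, `ToyHybridSystem`, the act names, and the templates are hypothetical and do not reflect cocoa's actual classes or file formats. It also shows one plausible reason gradients cannot flow through such an agent: the manager makes a discrete choice of dialogue act, and the generator is a plain template lookup.

```python
import pickle


class ToyManager:
    """Hypothetical stand-in for a learned policy over dialogue acts."""

    def __init__(self, weights):
        self.weights = weights  # maps dialogue act -> score

    @classmethod
    def from_bytes(cls, blob):
        # The trained parameters enter the system here, via the manager.
        return cls(pickle.loads(blob))

    def choose_act(self):
        # Discrete argmax over acts: there is no gradient through this step.
        return max(self.weights, key=self.weights.get)


class ToyHybridSystem:
    """Hypothetical hybrid agent: learned manager + template generator."""

    def __init__(self, manager, templates):
        self.manager = manager      # the learned policy lives here
        self.templates = templates  # dialogue act -> utterance template

    def respond(self):
        act = self.manager.choose_act()
        return self.templates[act]  # retrieval, not neural generation


# Simulate a saved policy and load it through the manager.
saved_policy = pickle.dumps({"offer": 0.7, "reject": 0.3})
system = ToyHybridSystem(
    ToyManager.from_bytes(saved_policy),
    {"offer": "I can offer you the book.", "reject": "No deal."},
)
print(system.respond())  # I can offer you the book.
```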
Thank you! If I want to create an agent that uses a neural dialogue model as the manager and rule-based templates as the generator (a hybrid system) to talk with a human user, which command should I use? Do the default commands in the README output only dialogue acts instead of utterances?
One more question: what is the difference between lf2lf and lflm? Thank you!
You can use https://github.com/stanfordnlp/cocoa/blob/master/scripts/generate_dataset.py to generate bot-bot/human chat by setting one agent to be hybrid and the other to be cmd (human).
lflm means learning an LM as the action predictor instead of a seq2seq model; this is obsolete though.
Thank you! What does "LF" stand for?
logical form
Could you please clarify the meaning of each model in the code for the paper "Decoupling Strategy and Generation in Negotiation Dialogues"?
Thank you very much!