facebookresearch / end-to-end-negotiator

Deal or No Deal? End-to-End Learning for Negotiation Dialogues

Questions about arguments in reinforce script #31

Open Ulitochka opened 5 years ago

Ulitochka commented 5 years ago

Hello.

I have a few questions and would be grateful if you could answer them.

Could you please tell me what the --smart_bob argument does and what it affects? parser.add_argument('--smart_bob', action='store_true', default=False, help='make Bob smart again')

Section 6.1 of "Deal or No Deal? End-to-End Learning for Negotiation Dialogues" says: "During reinforcement learning, we use a learning rate of 0.1, clip gradients above 1.0, and use a discount factor of γ=0.95." But reinforce.py has: parser.add_argument('--gamma', type=float, default=0.99, help='discount factor'). Does this difference matter for training?

Also, reinforce.py has: parser.add_argument('--clip', type=float, default=0.1, help='gradient clip'), whereas the paper says to clip gradients above 1.0.

The same goes for the RL learning rate and gradient clip. The defaults in the script are:
parser.add_argument('--rl_lr', type=float, default=0.002, help='RL learning rate')
parser.add_argument('--rl_clip', type=float, default=2.0, help='RL gradient clip')
But the code snippet in the README uses:
--rl_lr 0.00001 \
--rl_clip 0.0001 \
Does this difference matter for training?

How long does it take to run the reinforce script with the arguments from the README snippet?

During training, many dialogues appear in which one of the agents repeats a single word many times. Is that OK?
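To make the gamma and clip questions concrete, here is a minimal sketch in plain Python. It is not the repository's code. It assumes that the final reward is discounted backwards over the tokens of a dialogue (the last step gets weight 1) and that clipping rescales the gradient by its overall norm. The function names `discount_weights` and `clip_by_norm` are hypothetical and chosen only for illustration.

```python
import math


def discount_weights(num_steps, gamma):
    """Weight applied to the final reward at each step (assumed scheme).

    Step t (0-indexed) out of num_steps gets gamma ** (num_steps - 1 - t),
    so the last step has weight 1 and earlier steps are discounted more.
    """
    return [gamma ** (num_steps - 1 - t) for t in range(num_steps)]


def clip_by_norm(grads, max_norm):
    """Rescale a flat list of gradient values so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


if __name__ == "__main__":
    # Compare how much credit the first of 20 steps receives under each gamma.
    for gamma in (0.95, 0.99):
        first = discount_weights(20, gamma)[0]
        print(f"gamma={gamma}: weight on first step = {first:.3f}")

    # Compare the clip thresholds mentioned above on one example gradient.
    example_grad = [3.0, 4.0]  # L2 norm is 5
    for max_norm in (2.0, 1.0, 0.1, 0.0001):
        clipped = clip_by_norm(example_grad, max_norm)
        print(f"clip={max_norm}: clipped gradient = {clipped}")
```

Under these assumptions, over a 20-step dialogue the first step receives a weight of about 0.38 with gamma 0.95 and about 0.83 with gamma 0.99, so the larger value passes noticeably more of the final reward back to early tokens. A smaller clip threshold shrinks every update whose gradient norm exceeds it, so it interacts with the learning rate in setting the effective step size.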

oliu-io commented 5 years ago

Were you able to address the problem and train successfully?

ayderdm commented 3 years ago

I have the same questions.