DeepX-inc / machina

Control section: Deep Reinforcement Learning framework
MIT License
278 stars, 45 forks

[algos/qtopt] Make iterator called only once #211

Closed. iory closed this 5 years ago

iory commented 5 years ago

In the current qtopt implementation, iterator(traj.random_batch(batch_size, epoch)) is evaluated twice. This PR changes it so that it is evaluated only once.
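To illustrate the general problem, here is a minimal sketch. It is not machina's code: `ToyTraj`, `train_twice`, and `train_once` are hypothetical names, and the "loss" is just a sum. It only shows that writing the batch expression in two places samples twice, while binding it to a name samples once.

```python
import random


class ToyTraj:
    """Hypothetical stand-in for a trajectory buffer (not machina's Traj)."""

    def __init__(self, data, seed=0):
        self.data = list(data)
        self.rng = random.Random(seed)
        self.num_calls = 0  # counts how often random_batch is called

    def random_batch(self, batch_size, epoch):
        # Each call builds a fresh generator, so each call draws new batches.
        self.num_calls += 1
        return (self.rng.sample(self.data, batch_size) for _ in range(epoch))


def train_twice(traj, batch_size, epoch):
    # Pattern to avoid: the batch expression is written (and evaluated) twice.
    n_steps = len(list(traj.random_batch(batch_size, epoch)))
    losses = [sum(batch) for batch in traj.random_batch(batch_size, epoch)]
    return n_steps, losses


def train_once(traj, batch_size, epoch):
    # Bind the iterator to a name and consume it exactly once.
    batches = traj.random_batch(batch_size, epoch)
    losses = [sum(batch) for batch in batches]
    return len(losses), losses
```

With this toy class, `train_twice` leaves `num_calls` at 2 and `train_once` leaves it at 1, which is the behavior change this PR is after.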

takerfume commented 5 years ago

@iory Thank you for fixing it! @rarilurelo This is fine unless we use QT-Opt for grasping as in the original paper, where the exploration policy is switched once grad_step reaches a certain threshold. In machina, however, there is no need for grad_step.
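For reference, a switch of that kind could look roughly like the sketch below. This is only an assumption about the shape of such logic: `select_exploration_policy`, `switch_step`, and the two policy arguments are hypothetical and are not taken from machina or from the paper's code.

```python
def select_exploration_policy(grad_step, switch_step, initial_policy, later_policy):
    """Return the exploration policy to use at the given gradient step.

    Hypothetical helper: uses initial_policy until grad_step reaches
    switch_step, then later_policy from that step onward.
    """
    if grad_step < switch_step:
        return initial_policy
    return later_policy
```

A caller would need to track grad_step across training iterations for this to work, which is the counter that machina does not need.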

rarilurelo commented 5 years ago

Thank you!