rr-learning / CausalWorld

CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning
https://sites.google.com/view/causal-world/home
MIT License

computation of PD gains and save-state #81

Closed martius-lab closed 4 years ago

martius-lab commented 4 years ago

'_latest_full_state' is used to compute the PD feedback, but it is updated in several different places in the code, which might cause trouble.

For a more consistent data flow, it might be better to update the state only once, higher up the call stack.

I suggest updating it once at the beginning of do_simulation (as a copy of the current state). See also the other issue about the end-effector position.
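
A minimal sketch of the single-update idea, not the actual CausalWorld code. Only the names `_latest_full_state` and `do_simulation` come from this issue; `ToyRobot`, `_read_state_from_simulator`, `_pd_torques`, the gains, and the toy dynamics are hypothetical stand-ins. The state is snapshotted once, as a copy, at the top of `do_simulation`, and the PD computation only reads that snapshot.

```python
import copy

import numpy as np


class ToyRobot:
    """Hypothetical stand-in for the robot class; not the CausalWorld API."""

    def __init__(self, kp=2.0, kd=0.1):
        self.kp = kp  # proportional gain (assumed value)
        self.kd = kd  # derivative gain (assumed value)
        # Toy "simulator" state: joint positions and velocities.
        self._sim_state = {
            "positions": np.zeros(3),
            "velocities": np.zeros(3),
        }
        self._latest_full_state = None
        self.state_updates = 0  # counts how often the snapshot is refreshed

    def _read_state_from_simulator(self):
        # Hypothetical accessor; a real robot would query the physics engine.
        return self._sim_state

    def _pd_torques(self, desired_positions):
        # Reads the snapshot only; it never refreshes it.
        state = self._latest_full_state
        error = desired_positions - state["positions"]
        return self.kp * error - self.kd * state["velocities"]

    def do_simulation(self, desired_positions, dt=0.01):
        # Single update point: take a copy so that later in-place changes
        # to the simulator state cannot alter what the controller sees.
        self._latest_full_state = copy.deepcopy(
            self._read_state_from_simulator()
        )
        self.state_updates += 1

        torques = self._pd_torques(desired_positions)

        # Toy dynamics (unit mass, explicit Euler), updated in place.
        self._sim_state["velocities"] += dt * torques
        self._sim_state["positions"] += dt * self._sim_state["velocities"]
        return torques
```

Because the snapshot is a deep copy, the in-place simulator update at the end of the step leaves `_latest_full_state` untouched until the next call to `do_simulation`.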

Also, '_latest_full_state' should be saved in 'save_state' and restored in 'restore_state' of 'task'. Otherwise, the PD controller is left in an inconsistent state after a restoration.
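
A rough sketch of what that could look like, under stated assumptions: `ToyTask`, its `sim_state` attribute, `step`, and the dictionary layout of the saved state are hypothetical and do not reflect how CausalWorld actually stores state. The point is only that the controller's snapshot travels together with the simulator state.

```python
import copy

import numpy as np


class ToyTask:
    """Hypothetical stand-in for the task; not the CausalWorld API."""

    def __init__(self):
        self.sim_state = {
            "positions": np.zeros(3),
            "velocities": np.zeros(3),
        }
        # Snapshot used by the PD controller.
        self._latest_full_state = copy.deepcopy(self.sim_state)

    def step(self, delta):
        # Toy step: shift the positions and refresh the snapshot.
        self.sim_state["positions"] += delta
        self._latest_full_state = copy.deepcopy(self.sim_state)

    def save_state(self):
        # Store the controller snapshot alongside the simulator state.
        return copy.deepcopy({
            "sim_state": self.sim_state,
            "latest_full_state": self._latest_full_state,
        })

    def restore_state(self, saved):
        # Copy again so the saved state can be restored more than once.
        saved = copy.deepcopy(saved)
        self.sim_state = saved["sim_state"]
        self._latest_full_state = saved["latest_full_state"]


task = ToyTask()
task.step(1.0)
checkpoint = task.save_state()
task.step(5.0)
task.restore_state(checkpoint)
# After restoring, the simulator state and the controller snapshot agree.
assert np.allclose(task.sim_state["positions"],
                   task._latest_full_state["positions"])
```

Without the `latest_full_state` entry, the first PD computation after a restore would still use the snapshot from before the restore.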