Open Minusadd opened 1 year ago
We have example scripts, but we found that the trigonometric functions return slightly different numbers on different OSes / platforms, and these differences cascade into larger errors if you try to play back the user-recorded actions. That said, the data could still work for behavioural cloning, since we return the action taken and the reward at each time step. If you feel adventurous, I can share the scripts!
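To illustrate the cascading effect, here is a toy sketch (not the simulator's actual dynamics). It assumes a hypothetical second platform whose `sin` differs by about 1e-15, and iterates a simple chaotic map built on that function. The names `rollout`, `sin_platform_a` and `sin_platform_b` are made up for this example.

```python
import math


def rollout(sin_fn, x0=0.3, steps=200):
    """Iterate the toy map x <- sin(pi * x) using the given sine function."""
    xs = [x0]
    for _ in range(steps):
        xs.append(sin_fn(math.pi * xs[-1]))
    return xs


def sin_platform_a(v):
    # Reference platform: the local libm result.
    return math.sin(v)


def sin_platform_b(v):
    # Hypothetical other platform: same result plus a tiny offset (an assumption,
    # standing in for a last-bits difference between math libraries).
    return math.sin(v) + 1e-15


traj_a = rollout(sin_platform_a)
traj_b = rollout(sin_platform_b)

# Per-step gap between the two "platforms" for the same starting state.
gaps = [abs(a - b) for a, b in zip(traj_a, traj_b)]
first_gap = gaps[1]
max_gap = max(gaps)

print(f"gap after 1 step: {first_gap:.3e}")
print(f"largest gap over {len(gaps) - 1} steps: {max_gap:.3e}")
```

Behavioural cloning avoids this problem because it trains on the stored (observation, action) pairs directly and never needs to re-simulate the recorded trajectory.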
Hello,
Are there any instructions for using the human rollout data? For example, how do I load the data to run learning-from-demonstrations training? Thanks!