Question
Hello everyone,
I'm running some experiments with this dataset and have hit a strange issue when trying to reproduce results. If I roll out the dataset's sequence of actions in a "walker2d-expert-v2" environment, starting from the same initial state, I do not observe the dataset's qpos and qvel. I think this script demonstrates the problem by showing the differences between the expected observations and the recorded ones:
Running this script gives me the following values:
I would expect all "diff" values to be zero if I am using the dataset properly.
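A minimal, simulator-agnostic sketch of this kind of replay check is shown below. It is not the script from the experiments above: `rollout_diffs`, `toy_step`, and the toy data are hypothetical names introduced for illustration, and `toy_step` is an assumed deterministic stand-in for the real simulator. With the actual environment, `step_fn` would presumably wrap the calls that step the simulator and read back qpos and qvel, and `recorded_states` would need to be aligned so that entry `t` is the state reached after applying action `t`.

```python
from typing import Callable, List, Sequence

State = List[float]


def rollout_diffs(
    step_fn: Callable[[State, Sequence[float]], State],
    initial_state: Sequence[float],
    actions: Sequence[Sequence[float]],
    recorded_states: Sequence[Sequence[float]],
) -> List[float]:
    """Replay `actions` from `initial_state` and return, for each step,
    the largest absolute difference between the replayed state and the
    recorded state reached after that same action."""
    state = list(initial_state)
    diffs = []
    for action, recorded in zip(actions, recorded_states):
        state = step_fn(state, action)
        diffs.append(max(abs(s - r) for s, r in zip(state, recorded)))
    return diffs


def toy_step(state: State, action: Sequence[float]) -> State:
    # Hypothetical deterministic dynamics standing in for the simulator.
    return [s + 0.5 * a for s, a in zip(state, action)]


initial = [0.0, 1.0]
actions = [[1.0, -1.0], [0.5, 0.5], [0.0, 2.0]]

# Build a "recorded" trajectory with the same dynamics, as a dataset would.
recorded = []
current = list(initial)
for a in actions:
    current = toy_step(current, a)
    recorded.append(current)

diffs = rollout_diffs(toy_step, initial, actions, recorded)
print(diffs)  # prints [0.0, 0.0, 0.0]
```

When the dynamics are fully deterministic and the starting state is restored completely, every entry is zero, which is the behaviour expected above. Non-zero entries point to either a misaligned index between actions and recorded states or to simulator state that is not captured by the restored values.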
Any advice is welcome!