m-rph closed this issue 4 years ago
What do you mean by "if I try to retrace the trajectory"? Could you provide more detailed steps?
I can give you code actually
from mlagents.trainers import demo_loader

def replay(path, **other_params):
    brain_params, brain_infos, _ = demo_loader.load_demonstration(str(path))
    # some initialization and setting up (creates env)
    env.reset()
    # starting from 1 because that entry holds the previous_action
    for binfo in brain_infos[1:]:
        # process_info extracts the vector of the previous_action
        _, _, _, info = process_info(binfo)
        _, _, _, newinfo = env.step(info.previous_action)
My goal is to take in a demonstration and replay the same steps it contains, so that I end up at the same location as the demo's final step.
In this case you are assuming the environment is always fixed. If it is, taking the exact same action at every step will retrace the demonstration.
However, the Obstacle Tower environment is not fixed, so you won't be able to retrace the demo unless you fix the seed that controls the generation of the environment (and possibly other things that might vary, for example physics in Unity).
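To illustrate the point about seeds, here is a minimal sketch using a toy environment. ToyEnv and its replay helper are hypothetical stand-ins written for this example; they are not part of ML-Agents or Obstacle Tower. The sketch only shows that replaying a fixed action list reproduces a trajectory when every source of randomness is derived from the same seed.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for a procedurally generated environment."""

    def __init__(self, seed):
        self.seed = seed
        self.reset()

    def reset(self):
        # All randomness is re-derived from the seed on every reset.
        self.rng = random.Random(self.seed)
        self.position = 0
        return self.position

    def step(self, action):
        # The outcome depends on the action and on seeded randomness.
        self.position += action + self.rng.choice([-1, 0, 1])
        return self.position


def replay(env, actions):
    """Reset the environment and apply the actions in order."""
    env.reset()
    return [env.step(a) for a in actions]


actions = [1, 1, 0, 2, 1, 0, 1, 2]
recorded = replay(ToyEnv(seed=7), actions)
same_seed = replay(ToyEnv(seed=7), actions)    # identical to recorded
other_seed = replay(ToyEnv(seed=8), actions)   # will generally differ
```

In a real Unity environment the seed is only one input. Anything not driven by it (for example frame timing or physics step ordering) could still make a replay drift.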
Yes, I have it fixed to the same seed as the one from the demo. That is why I posted this as a bug.
Maybe there is something else you need to keep fixed.
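One way to narrow down what else varies is to compare the observations stored in the demo with the ones seen during replay and find the first step where they differ. The helper below is a sketch under assumptions: first_divergence is a hypothetical name, and it assumes each observation has already been extracted into a flat list of floats (how to do that depends on the ML-Agents version in use).

```python
import math


def first_divergence(recorded, replayed, tol=1e-6):
    """Return the index of the first step whose observations differ, or None."""
    for i, (rec, rep) in enumerate(zip(recorded, replayed)):
        if len(rec) != len(rep) or any(
            not math.isclose(a, b, abs_tol=tol) for a, b in zip(rec, rep)
        ):
            return i
    # One trajectory ended early: report the first missing step.
    if len(recorded) != len(replayed):
        return min(len(recorded), len(replayed))
    return None


# Example with made-up observation vectors.
demo_obs = [[0.0, 1.0], [0.5, 1.0], [1.0, 1.0]]
replay_obs = [[0.0, 1.0], [0.5, 1.2], [1.1, 1.4]]
step = first_divergence(demo_obs, replay_obs)
```

If the divergence is already at step 0, the initial state differs (seed or reset parameters). If it shows up later, the cause is more likely something that accumulates over time, such as physics.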
This issue has been automatically marked as stale because it has not had activity in the last 14 days. It will be closed in the next 14 days if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had activity in the last 28 days. If this issue is still valid, please ping a maintainer. Thank you for your contributions.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Describe the bug
I am loading demonstrations from Python using the mlagents.trainers.demo_loader module. If I try to retrace the trajectory, i.e. reset and select the same actions as in the demo, the agent fails to follow the demonstration.
To Reproduce
Steps to reproduce the behavior:
Environment (please complete the following information):
NOTE: We are unable to help reproduce bugs with custom environments. Please attempt to reproduce your issue with one of the example environments, or provide a minimal patch to one of the environments needed to reproduce the issue.