avisingh599 / reward-learning-rl

[RSS 2019] End-to-End Robotic Reinforcement Learning without Reward Engineering
https://sites.google.com/view/reward-learning-rl/

where is {SAC_CHECKPOINT_DIR} path #2

Closed leeivan1007 closed 5 years ago

leeivan1007 commented 5 years ago

Hi~~ Some things confuse me. I can run softlearning run_example_local examples.development, and after ten minutes a new folder appears: /home/ivan/ray_results/multiworld/mujoco/Image48SawyerDoorPullHookEnv-v0/2019-05-09T11-56-23-2019-05-09T11-56-23. Inside the trial folder (/dc7a2587-algorithm\=VICERAQ-seed\=7463_2019-05-09_11-56-239zwaq6rv/) there are: log_syncadkyfm7w.log, params.pkl, result.json, params.json, and progress.csv.

Then I run: python examples/development/simulate_policy.py ~/ray_results/multiworld/mujoco/Image48SawyerDoorPullHookEnv-v0/2019-05-09T11-56-23-2019-05-09T11-56-23/dc7a2587-algorithm\=VICERAQ-seed\=7463_2019-05-09_11-56-239zwaq6rv/ --max-path-length=1000 --num-rollouts=1 --render-mode=human, and it outputs: FileNotFoundError: [Errno 2] No such file or directory: '/home/ivan/ray_results/multiworld/mujoco/Image48SawyerDoorPullHookEnv-v0/2019-05-09T11-56-23-2019-05-09T11-56-23/params.json'

I copied params.json up one directory, but then it outputs: FileNotFoundError: [Errno 2] No such file or directory: '/home/ivan/ray_results/multiworld/mujoco/Image48SawyerDoorPullHookEnv-v0/2019-05-09T11-56-23-2019-05-09T11-56-23/dc7a2587-algorithm=VICERAQ-seed=7463_2019-05-09_11-56-239zwaq6rv/checkpoint.pkl'

I think I missed something or skipped some steps. Thanks!

avisingh599 commented 5 years ago

Hmm, so the simulate_policy.py script is not yet supported in this code base (I carried it over from the softlearning repository, but there are currently some rendering issues in getting it to work). I do have a temporary solution for saving videos that I can share with you if you like, but I am planning to implement video logging soon, which will save the evaluation videos to disk, allowing you to see what the policy is doing. I will keep this issue open until I have implemented this.

avisingh599 commented 5 years ago

Coming to your specific error: I think the simulate_policy.py script requires both checkpoint.pkl and params.json. You don't currently have a checkpoint.pkl in your folder because the experiment has not finished running yet. If you want to store more checkpoints, you should run with the flag --checkpoint_frequency 1, and then pass the path of the folder containing the checkpoint to simulate_policy.py.

leeivan1007 commented 5 years ago

Thank you, I added --checkpoint_frequency 1 and got a checkpoint_1 folder. Then I ran python examples/development/simulate_policy.py ..., but it raised RuntimeError: Window rendering not supported

what is

a temporary solution for saving videos

Does it mean adding --video-save-frequency?

I tried adding it and ran it yesterday. It raised RuntimeError: Window rendering not supported and ended with ray.tune.error.TuneError: ('Trials did not complete', [98d12e48-algorithm=VICERAQ-seed=2489, fe8b9e23-algorithm=VICERAQ-seed=3102, 7fc09f39-algorithm=VICERAQ-seed=1549, 66873c00-algorithm=VICERAQ-seed=3372, 9fc14227-algorithm=VICERAQ-seed=8789]). I think I misunderstood it.

avisingh599 commented 5 years ago

I meant that the temporary solution I have for saving videos is not in the repository yet, but I will add it soon after cleaning it up a bit.

leeivan1007 commented 5 years ago

ok!

avisingh599 commented 5 years ago

I have now added some code that allows you to log videos while you train. Try running your code with the additional flag --video-save-frequency=1, and you should see a videos folder in your experiment logs. Hope this resolves your issue of not being able to save videos! Feel free to reopen if it does not.

leeivan1007 commented 5 years ago

It works! Thank you!

weijiafeng commented 5 years ago

@avisingh599 Just want to check: is the video-logging code contained in the v0.1 tag of the repository? Thanks!

avisingh599 commented 5 years ago

It's not in the v0.1 tag, it's in master.