Open keiohta opened 5 years ago
Test code
# Generate trajectories
$ python examples/run_sac.py --env-name HalfCheetah-v2 --save-test-path --test-interval 50000 --gpu -1
$ ls results
20191220T185529.974847_SAC_
$ python examples/run_airl_sac.py --env-name HalfCheetah-v2 --test-interval 10000 --gpu -1 --expert-path-dir results/20191220T185529.974847_SAC_
Hi @keiohta, when I run
$ python ~/tf2rl-master/examples/run_gaifo_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir ~/GAIL/results/20200619T013740.036943_SAC_ --gpu -1 --dir-suffix GAIfO
I get:
run_gaifo_ddpg.py: error: unrecognized arguments: --gpu -1
Can you help me? Thank you!
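For readers unfamiliar with this message: `unrecognized arguments` is what Python's `argparse` reports when a flag is passed that the script's parser never registered. The following is a minimal sketch of that behaviour, not the tf2rl code itself; `build_parser` and its `with_gpu` switch are hypothetical names used only for illustration.

```python
import argparse


def build_parser(with_gpu: bool) -> argparse.ArgumentParser:
    # Hypothetical stand-in for an example script's parser; the real
    # tf2rl parser is not reproduced here.
    parser = argparse.ArgumentParser(prog="run_gaifo_ddpg.py")
    parser.add_argument("--env-name", type=str, default="Pendulum-v0")
    parser.add_argument("--expert-path-dir", type=str, default=None)
    if with_gpu:
        parser.add_argument("--gpu", type=int, default=0)
    return parser


argv = ["--env-name=HalfCheetah-v2", "--gpu", "-1"]

# parse_known_args() returns the leftover arguments instead of exiting,
# which makes the cause of "unrecognized arguments" visible.
_, unknown = build_parser(with_gpu=False).parse_known_args(argv)
print("leftover arguments:", unknown)

# Once the option is registered, the same command line parses cleanly.
args = build_parser(with_gpu=True).parse_args(argv)
print("gpu:", args.gpu)
```

With `parse_args()` instead of `parse_known_args()`, the first parser would exit with an error like the one quoted above.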
@haoyu-x Hi! Thanks for reporting the bug. I fixed the error in this commit, so can you try again on the latest master branch? https://github.com/keiohta/tf2rl/commit/ab675d0e8f7061910e8f44d00daf72c69c72db6a
Should I still use the same command suggested in issue 67? https://github.com/keiohta/tf2rl/issues/67
When I run
$ python ~/tf2rl-master/examples/run_gail_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir ~/GAIL/results/20200619T013740.036943_SAC_ --gpu -1 --dir-suffix GAIL
I get the same error.
Yeah, did you update the code?
Yes, I updated it. Can you run GAIL and GAIfO on your computer?
At least I resolved the --gpu error. Let me check whether the full code runs.
Is there any other method to run gail and gaifo instead of the command line?
I confirmed the script runs on my machine. Can you provide me with the full error message?
$ python examples/run_sac.py --env-name=HalfCheetah-v2 --save-test-path --test-interval=50000 --max-steps 300000
$ ls results
20200627T221712.423081_SAC_
$ find results/20200627T221712.423081_SAC_/ -name '*.pkl'
results/20200627T221712.423081_SAC_/step_00050000_epi_02_return_02744.1677.pkl
results/20200627T221712.423081_SAC_/step_00050000_epi_04_return_02701.9388.pkl
results/20200627T221712.423081_SAC_/step_00050000_epi_00_return_03121.5797.pkl
results/20200627T221712.423081_SAC_/step_00050000_epi_01_return_02784.6256.pkl
results/20200627T221712.423081_SAC_/step_00050000_epi_03_return_02752.4279.pkl
$ python examples/run_gail_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir results/20200627T221712.423081_SAC_/ --gpu -1
...
22:23:48.107 [INFO] (irl_trainer.py:74) Total Epi: 19 Steps: 19000 Episode Steps: 1000 Return: 1174.4017 FPS: 118.79
22:23:56.162 [INFO] (irl_trainer.py:74) Total Epi: 20 Steps: 20000 Episode Steps: 1000 Return: 1889.9691 FPS: 124.15
22:23:57.861 [INFO] (irl_trainer.py:118) Evaluation Total Steps: 20000 Average Reward 2278.0820 over 5 episodes
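The trajectory files listed by `find` above appear to encode the training step, the test episode index, and the episode return in their names. Below is a minimal sketch for ranking such files by return before passing a directory to `--expert-path-dir`. The pattern is inferred only from the file names shown in this thread and may not cover every case (for example, how negative returns are written is an assumption); `parse_traj_name` and `list_trajectories` are hypothetical helpers, not part of tf2rl.

```python
import re
from pathlib import Path

# Pattern inferred from names such as
# step_00050000_epi_02_return_02744.1677.pkl (an assumption, not a tf2rl spec).
TRAJ_PATTERN = re.compile(
    r"step_(?P<step>\d+)_epi_(?P<epi>\d+)_return_(?P<ret>-?\d+(?:\.\d+)?)\.pkl$"
)


def parse_traj_name(path):
    """Return (step, episode, episode_return) from a file name, or None."""
    match = TRAJ_PATTERN.search(Path(path).name)
    if match is None:
        return None
    return int(match["step"]), int(match["epi"]), float(match["ret"])


def list_trajectories(result_dir):
    """Collect (path, step, episode, return) tuples, highest return first."""
    parsed = []
    for path in Path(result_dir).rglob("*.pkl"):
        info = parse_traj_name(path)
        if info is not None:
            parsed.append((path, *info))
    return sorted(parsed, key=lambda item: item[3], reverse=True)


example = "results/20200627T221712.423081_SAC_/step_00050000_epi_02_return_02744.1677.pkl"
print(parse_traj_name(example))
```

Calling `list_trajectories("results")` on a directory without matching files simply returns an empty list.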
[image: Screenshot from 2020-06-27 21-38-20.png]
Oh, I assumed you installed tf2rl in developer mode... I have not published my change to PyPI yet, so I will do that now.
Sure. Please let me know what I should do after your change. Thanks a lot!
Now you can get the latest code through PyPI. Can you try the following?
# Update tf2rl
$ pip install -U tf2rl
# Make sure the version is 0.1.14
$ pip list | grep tf2rl
# Run your script
$ python ~/tf2rl-master/examples/run_gaifo_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir ~/GAIL/results/20200619T013740.036943_SAC_ --gpu -1 --dir-suffix GAIfO
By the way, your path ~/tf2rl-master suggests that you did not install tf2rl using git clone but just downloaded the zip file, didn't you?
Anyway, the command above shows the installed version, so please let me know if you still encounter the same problem.
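The same version check can also be done from Python, which helps when several environments are installed and it is unclear which one the script runs in. This is a minimal sketch using the standard-library `importlib.metadata` (Python 3.8+); `installed_version` is a hypothetical helper, and the 0.1.14 target is the version named in the comment above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("tf2rl")
if version is None:
    print("tf2rl is not installed in this environment")
else:
    # Compare by eye against the release that contains the fix (0.1.14).
    print(f"tf2rl {version}")
```

Run it with the same `python` executable that launches the example script, since `pip` and `python` can point at different environments.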
That problem is fixed, but I'm encountering another issue. :(
[image: Screenshot from 2020-06-27 21-54-05.png]
I cannot see your screenshot. Can you copy the message or retry uploading the picture?
Sure.
21:56:03.468 [INFO] (irl_trainer.py:74) Total Epi: 7 Steps: 7000 Episode Steps: 1000 Return: -327.7823 FPS: 4416.74
21:56:03.713 [INFO] (irl_trainer.py:74) Total Epi: 8 Steps: 8000 Episode Steps: 1000 Return: -262.8208 FPS: 4088.41
21:56:03.955 [INFO] (irl_trainer.py:74) Total Epi: 9 Steps: 9000 Episode Steps: 1000 Return: -325.9061 FPS: 4149.77
21:56:04.268 [INFO] (irl_trainer.py:74) Total Epi: 10 Steps: 10000 Episode Steps: 1000 Return: -278.5830 FPS: 4176.82
Traceback (most recent call last):
  File "/home/haoyux/tf2rl-master/examples/run_gaifo_ddpg.py", line 43, in
I guess you collected the expert transitions in a different environment (Pendulum-v0, perhaps? Its state dimension is 3). Are you sure the expert data were collected on HalfCheetah-v2?
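A mismatch like this can be caught before training starts by comparing the observation dimension of the expert data with that of the environment. The following is a minimal sketch under the assumption that the expert observations are available as a sequence of per-step vectors; `check_expert_obs_dim` is a hypothetical helper, not a tf2rl function.

```python
def check_expert_obs_dim(expert_obses, env_obs_dim):
    """Raise ValueError if any expert observation has the wrong length."""
    dims = {len(obs) for obs in expert_obses}
    if dims != {env_obs_dim}:
        raise ValueError(
            f"expert observations have dimension(s) {sorted(dims)}, "
            f"but the environment expects {env_obs_dim}"
        )


# HalfCheetah-v2 observations are 17-dimensional; Pendulum-v0 ones are 3-dimensional.
halfcheetah_like = [[0.0] * 17 for _ in range(5)]
pendulum_like = [[0.0] * 3 for _ in range(5)]

check_expert_obs_dim(halfcheetah_like, 17)  # matching data passes silently

try:
    check_expert_obs_dim(pendulum_like, 17)
except ValueError as err:
    print(err)
```

In a real script the expected dimension would come from `env.observation_space.shape` rather than a hard-coded number.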
Oh! I made a stupid mistake. Thank you Kei, everything is fine now!
One last question: how can I make a TensorBoard figure like yours from the command line? [image: Screenshot from 2020-06-27 22-28-13.png]
It's great your script runs successfully! I cannot see your picture again... I just do:
$ tensorboard --logdir results
Does this answer your question?
I mean, how can I visualize the training process using TensorBoard? The figure I am referring to is in https://github.com/keiohta/tf2rl/issues/67
You can add a suffix to the results directory name with the --dir-suffix option. #67 uses it as follows:
$ python examples/run_gail_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir results/20191213T203858.508559_SAC_ --gpu -1 --dir-suffix GAIL
$ python examples/run_gaifo_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir results/20191213T203858.508559_SAC_ --gpu -1 --dir-suffix GAIfO
$ python examples/run_vail_ddpg.py --env-name=HalfCheetah-v2 --expert-path-dir results/20191213T203858.508559_SAC_ --gpu -1 --dir-suffix VAIL
yes! thank you!
My pleasure! Please don't hesitate to open an issue if you run into any difficulty or have a question. I'll close this issue. Thanks for the report!
OMG, this issue is not related to your question. So, I have to reopen this one. It would be better to open a new issue if it is not related to the original one ;)
thank you again!
Hi Kei,
I'm using tf2rl's GAIfO on robosuite (https://github.com/gal-leibovich/robosuite), but I get an error: mujoco_py.builder.MujocoException: Unknown warning type Time = 1.3900.Check for NaN in simulation. I found that my policy network outputs the action [nan nan nan nan nan nan nan nan] after several episodes of training. It happens on robosuite all the time, but works well on gym. I'm wondering if you can offer me some help. Thank you!
Hi, @haoyu-x
Could you open a new issue?
This is the issue where developers track and discuss the AIRL implementation.
As I see it, your problem is not related to the main topic of this issue.
Thanks @yamada-github-account , @haoyu-x and yes, I also think it would be better to open a new issue regarding this.
@keiohta I can't seem to find the run-airl-****.py files anywhere. Is this a commit issue? Am I missing something?
Hi @Aadit-Ambadkar, we haven't fully tested AIRL yet, but you can try it on a different branch: https://github.com/keiohta/tf2rl/blob/airl/examples/run_airl_sac.py
Learning Robust Rewards with Adversarial Inverse Reinforcement Learning