rlbayes / rllabplusplus


Error running any algorithm #3

Open ViktorM opened 7 years ago

ViktorM commented 7 years ago

With every algorithm I try to run, I get the same error:

rllabplusplus\sandbox\rocky\tf\launchers>python algo_gym_stub.py --algo_name=qprop --env=CartPole-v0 n_parallel=2
2017-06-06 19:09:57.032793 Pacific Daylight Time | Warning: skipping Gym environment monitoring since snapshot_dir not configured.
[2017-06-06 19:09:57,033] Making new env: CartPole-v0
2017-06-06 19:09:57.039383 Pacific Daylight Time | observation space: Box(4,)
2017-06-06 19:09:57.040060 Pacific Daylight Time | action space: Discrete(2)
Creating algo=qprop with n_itr=2000, max_path_length=200...
python D:\Stanford\Deep_RL\rllabplusplus\scripts/run_experiment_lite.py  --log_dir 'D:\Stanford\Deep_RL\rllabplusplus/data/local/default/CartPole-v0-5000--al-qprop--gl-0-97--qeo-ones--qhn-relu--qhs-100--qlr-0-001--qur-1-0--rps-1000000--sr-1-0--ss-0-01--s-1'  --args_data 'gANjcmxsYWIubWlzYy5pbnN0cnVtZW50ClN0dWJNZXRob2RDYWxsCnEAKYFxAX1xAihYBgAAAF9fYXJnc3EDKGNybGxhYi5taXNjLmluc3RydW1lbnQKU3R1Yk9iamVjdApxBCmBcQV9cQYoWAQAAABhcmdzcQcpWAsAAABwcm94eV9jbGFzc3EIY3NhbmRib3gucm9ja3kudGYuYWxnb3MudHJwbwpUUlBPCnEJWAYAAABrd2FyZ3NxCn1xCyhYEAAAAHFmX3VwZGF0ZXNfcmF0aW9xDEc/8AAAAAAAAFgNAAAAcXByb3BfbWluX2l0cnENSwBYCgAAAGdhZV9sYW1iZGFxDkc/7wo9cKPXClgOAAAAc2FtcGxlX2JhY2t1cHNxD0sAWAMAAABlbnZxEGgEKYFxEX1xEihoByloCGNzYW5kYm94LnJvY2t5LnRmLmVudnMuYmFzZQpUZkVudgpxE2gKfXEUWAsAAAB3cmFwcGVkX2VudnEVaAQpgXEWfXEXKGgHKWgIY3JsbGFiLmVudnMubm9ybWFsaXplZF9lbnYKTm9ybWFsaXplZEVudgpxGGgKfXEZKGgQaAQpgXEafXEbKGgHKWgIY3JsbGFiLmVudnMuZ3ltX2VudgpHeW1FbnYKcRxoCn1xHShYDAAAAHJlY29yZF92aWRlb3EeiVgIAAAAZW52X25hbWVxH1gLAAAAQ2FydFBvbGUtdjBxIFgKAAAAcmVjb3JkX2xvZ3EhiXV1YlgNAAAAbm9ybWFsaXplX29ic3EiiXV1YnN1YlgQAAAAcmVwbGF5X3Bvb2xfc2l6ZXEjSkBCDwBYDAAAAHNjYWxlX3Jld2FyZHEkRz/wAAAAAAAAWAYAAABwb2xpY3lxJWgEKYFxJn1xJyhoByloCGNzYW5kYm94LnJvY2t5LnRmLnBvbGljaWVzLmdhdXNzaWFuX21scF9wb2xpY3kKR2F1c3NpYW5NTFBQb2xpY3kKcShoCn1xKShYCAAAAGVudl9zcGVjcSpjcmxsYWIubWlzYy5pbnN0cnVtZW50ClN0dWJBdHRyCnErKYFxLH1xLShYBAAAAF9vYmpxLmgRWAoAAABfYXR0cl9uYW1lcS9YBAAAAHNwZWNxMHViWAQAAABuYW1lcTFoJVgMAAAAaGlkZGVuX3NpemVzcTJLZEsySxmHcTNYEwAAAGhpZGRlbl9ub25saW5lYXJpdHlxNGN0ZW5zb3JmbG93LnB5dGhvbi5vcHMubWF0aF9vcHMKdGFuaApxNXV1YlgIAAAAZGlzY291bnRxNkc/764UeuFHrlgJAAAAc3RlcF9zaXplcTdHP4R64UeuFHtYCAAAAGJhc2VsaW5lcThoBCmBcTl9cTooaAcpaAhjcmxsYWIuYmFzZWxpbmVzLmxpbmVhcl9mZWF0dXJlX2Jhc2VsaW5lCkxpbmVhckZlYXR1cmVCYXNlbGluZQpxO2gKfXE8aCpoKymBcT19cT4oaC5oEWgvaDB1YnN1YlgFAAAAbl9pdHJxP03QB1gLAAAAcWZfYmFzZWxpbmVxQGgEKYFxQX1xQihoByloCGNzYW5kYm94LnJvY2t5LnRmLmJhc2VsaW5lcy5xX2Jhc2VsaW5lClFmdW5jdGlvbkJhc2VsaW5lCnFDaAp9cUQoaCpoKymBcUV9cUYoaC5oEWgvaDB1YlgCAAAAcWZxR2gEKYFxSH1xSShoByloCGNzYW5kYm94LnJvY2t5LnRmLnFfZnVuY3Rpb25zLmNvbnRpbnVvdXNfbWxwX3FfZnVuY3Rpb24KQ29udGludW91c01MUFFGdW5jdGlvbgpxSmgKfXFLKGgqaCspgXFMfXFNKGguaBFoL2gwdWJoMktkS2SGcU5oNGN0ZW5zb3JmbG93LnB5dGhvbi5vcHMuZ2VuX25uX29wcwpyZWx1CnFPdXViaCVoJnV1YlgQAAAAcXByb3BfZXRhX29wdGlvbnFQWAQAAABvbmVzcVFYDQAAAHFmX2JhdGNoX3NpemVxUktAWBAAAABxZl9sZWFybmluZ19yYXRlcVNHP1BiTdLxqfxYDwAAAG1heF9wYXRoX2xlbmd0aHFUS8hoR2hIWA0AAABtaW5fcG9vbF9zaXplcVVN6ANYEAAAAHJlcGxhY2VtZW50X3Byb2JxVkc/8AAAAAAAAFgVAAAAcXByb3BfdXNlX3FmX2Jhc2VsaW5lcVeJWAoAAABiYXRjaF9zaXplcVhNiBN1dWJYBQAAAHRyYWlucVkpfXFadHFbWAgAAABfX2t3YXJnc3FcfXFddWIu'  --seed '1'  --snapshot_mode 'last_best'  --n_parallel '1'  --use_cloudpickle 'False'  --exp_name 'CartPole-v0-5000--al-qprop--gl-0-97--qeo-ones--qhn-relu--qhs-100--qlr-0-001--qur-1-0--rps-1000000--sr-1-0--ss-0-01'
usage: run_experiment_lite.py [-h] [--n_parallel N_PARALLEL]
                              [--exp_name EXP_NAME] [--log_dir LOG_DIR]
                              [--snapshot_mode SNAPSHOT_MODE]
                              [--snapshot_gap SNAPSHOT_GAP]
                              [--tabular_log_file TABULAR_LOG_FILE]
                              [--text_log_file TEXT_LOG_FILE]
                              [--params_log_file PARAMS_LOG_FILE]
                              [--variant_log_file VARIANT_LOG_FILE]
                              [--resume_from RESUME_FROM] [--plot PLOT]
                              [--log_tabular_only LOG_TABULAR_ONLY]
                              [--seed SEED] [--args_data ARGS_DATA]
                              [--variant_data VARIANT_DATA]
                              [--use_cloudpickle USE_CLOUDPICKLE]
run_experiment_lite.py: error: argument --seed: invalid int value: "'1'"

I would be thankful for any advice on how this can be fixed and how I can run the DDPG and Q-Prop experiments.
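
Update: looking at the generated command above, every value is wrapped in single quotes (e.g. --seed '1'), and cmd.exe on Windows does not strip single quotes the way bash does, so argparse receives the literal string '1' and int() rejects it. A minimal local workaround sketch, assuming one edits the argument parsing in scripts/run_experiment_lite.py (the unquoted_int helper below is hypothetical, not part of the repo):

import argparse

def unquoted_int(value):
    # Strip quote characters that cmd.exe passes through literally,
    # so both 1 and '1' parse to the integer 1.
    return int(value.strip("'\""))

parser = argparse.ArgumentParser()
parser.add_argument('--seed', type=unquoted_int, default=None)
print(parser.parse_args(["--seed", "'1'"]).seed)  # -> 1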

rlbayes commented 7 years ago

Hi, I haven't seen this error before. I have only tested the code on Mac and Ubuntu. Could you provide more information?

I also plan to push an updated codebase in a week or two, which accounts for recent changes in openai/rllab and gym and includes refactoring and new code. If it is not urgent, you could try again then. Thank you for reporting.


ViktorM commented 7 years ago

I was testing on Windows 10 in a forked branch, https://github.com/ViktorM/rllabplusplus, where I have updated the code to the new gym environment version and TensorFlow 1.1; after testing, I planned to make a pull request against the trunk, and then test and compare Q-Prop vs. DDPG vs. TRPO vs. recurrent TRPO on my RL locomotion tasks.

The upgrade to TF 1.1 was successful: trpo_cartpole.py, trpo_cartpole_recurrent.py, and vpg.py in sandbox/rocky/tf are working and training successfully, but with a slight modification similar to the rllab code: algo.train() is called directly at the end instead of run_experiment_lite() (see the sketch below).
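
For reference, a minimal sketch of that modification, assuming the rllab-style setup (CategoricalMLPPolicy is used here because CartPole-v0 has a discrete action space; the constructor arguments follow the rllab examples and may differ from the actual launcher settings):

from rllab.baselines.linear_feature_baseline import LinearFeatureBaseline
from rllab.envs.gym_env import GymEnv
from rllab.envs.normalized_env import normalize
from sandbox.rocky.tf.algos.trpo import TRPO
from sandbox.rocky.tf.envs.base import TfEnv
from sandbox.rocky.tf.policies.categorical_mlp_policy import CategoricalMLPPolicy

# Build the environment, policy, and baseline directly.
env = TfEnv(normalize(GymEnv("CartPole-v0")))
policy = CategoricalMLPPolicy(name="policy", env_spec=env.spec, hidden_sizes=(32, 32))
baseline = LinearFeatureBaseline(env_spec=env.spec)
algo = TRPO(env=env, policy=policy, baseline=baseline,
            batch_size=4000, max_path_length=200, n_itr=100, discount=0.99)

# Call train() directly instead of wrapping the stubbed call in run_experiment_lite().
algo.train()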

But any training script that calls run_experiment_lite(), whether TF or Theano, ends up with the same error I posted above. So if it doesn't take a lot of time, I would be thankful for any hints on how it can be fixed, so I can make a local change and start experimenting with Q-Prop a bit earlier than in two weeks. Otherwise I'll wait for your updates; they sound very cool, and my changes (upgrading to the latest TF version and the new gym) are only a small subset of them :)

Could you also make a quick test on Windows 10? Installation was pretty straightforward for me using the dependencies listed in environment.yml, with some minor changes; for example, there was no 64-bit package for pybox2d, so I had to take it from here: http://www.lfd.uci.edu/~gohlke/pythonlibs/#pybox2d

ViktorM commented 7 years ago

I suppose the new code will be related to this paper: https://arxiv.org/abs/1706.00387? :)

rlbayes commented 7 years ago

Hi, thank you for clarifying the setup. Unfortunately, I don't have a Windows 10 machine to test on myself right now, but that could explain the error. I assume you tested openai/rllab and that it worked? If so, the next code I plan to push, which is synced with openai/rllab, should solve the problem.

Yes, it will include code for https://arxiv.org/abs/1706.00387.


ViktorM commented 7 years ago

rllab has exactly the same problem. I opened an issue there too and have received a couple of ideas to try.

ViktorM commented 7 years ago

Hi,

Do you have any updates on when the new code will be released? I'm looking forward to testing it on some locomotion problems and robotic control tasks.

rlbayes commented 7 years ago

Hi,

Sorry, we decided to clean up the code more before the release. The plan is to release it in a week or two.

Shane


ViktorM commented 7 years ago

Thanks Shane!

JackieTseng commented 6 years ago

Hi,

Is there any plan to release the Q-Prop source code? I'm also looking forward to testing it with my DDPG. It has been a long time since the publication of the paper :)

THX.

rlbayes commented 6 years ago

Hi,

Q-Prop code is already part of rllabplusplus. The algorithm variants available to run are listed here: https://github.com/shaneshixiang/rllabplusplus/blob/master/sandbox/rocky/tf/launchers/launcher_utils.py (from line 135).
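
For example, the launcher invocation from the original report above already selects the Q-Prop variant via --algo_name (other variant names listed in launcher_utils.py are presumably selected the same way):

python algo_gym_stub.py --algo_name=qprop --env=CartPole-v0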

I may improve the documentation at a future date.

I recall that I merged with the most recent rllab in the new commit, and there may be some problems; please let me know if there are.

Thanks!

Shane
