openai / spinningup

An educational resource to help anyone learn deep reinforcement learning.
https://spinningup.openai.com/
MIT License

PyTorch ppo and vpg crash on SpaceInvaders-v0 (OSX Catalina) #337

Open RamNathaniel opened 3 years ago

RamNathaniel commented 3 years ago

python -m spinup.run vpg --hid "[32,32]" --env SpaceInvaders-v0 --exp_name si-vpg-1 --gamma 0.999

Using default backend (pytorch) for vpg.

================================================================================
ExperimentGrid [si-vpg-1] runs over parameters:

gamma [gam]

0.999

env_name [env]

SpaceInvaders-v0

ac_kwargs:hidden_sizes [ac-hid]

[32, 32]

Variants, counting seeds: 1
Variants, not counting seeds: 1

================================================================================

Preparing to run the following experiments...

si-vpg-1

...

Saving config:

{ "ac_kwargs": {}, "actor_critic": "MLPActorCritic", "env_fn": "<function call_experiment..thunk_plus.. at 0x7ffb6d4362f0>", "epochs": 50, "exp_name": "si-vpg-1", "gamma": 0.999, "lam": 0.97, "logger": { "<spinup.utils.logx.EpochLogger object at 0x7ffb6d54bcc0>": { "epoch_dict": {}, "exp_name": "si-vpg-1", "first_row": true, "log_current_row": {}, "log_headers": [], "output_dir": "/Users/ramnathaniel/git/spinningup/spinningup/data/si-vpg-1/si-vpg-1_s0", "output_file": { "<_io.TextIOWrapper name='/Users/ramnathaniel/git/spinningup/spinningup/data/si-vpg-1/si-vpg-1_s0/progress.txt' mode='w' encoding='UTF-8'>": { "mode": "w" } } } }, "logger_kwargs": { "exp_name": "si-vpg-1", "output_dir": "/Users/ramnathaniel/git/spinningup/spinningup/data/si-vpg-1/si-vpg-1_s0" }, "max_ep_len": 1000, "pi_lr": 0.0003, "save_freq": 10, "seed": 0, "steps_per_epoch": 4000, "train_v_iters": 80, "vf_lr": 0.001 }

Number of parameters: pi: 18054, v: 17729

Traceback (most recent call last):
  File "/Users/ramnathaniel/git/spinningup/spinningup/spinup/utils/run_entrypoint.py", line 11, in <module>
    thunk()
  File "/Users/ramnathaniel/git/spinningup/spinningup/spinup/utils/run_utils.py", line 162, in thunk_plus
    thunk(**kwargs)
  File "/Users/ramnathaniel/git/spinningup/spinningup/spinup/algos/pytorch/vpg/vpg.py", line 274, in vpg
    a, v, logp = ac.step(torch.as_tensor(o, dtype=torch.float32))
  File "/Users/ramnathaniel/git/spinningup/spinningup/spinup/algos/pytorch/vpg/core.py", line 128, in step
    pi = self.pi._distribution(obs)
  File "/Users/ramnathaniel/git/spinningup/spinningup/spinup/algos/pytorch/vpg/core.py", line 73, in _distribution
    logits = self.logits_net(obs)
  File "/Users/ramnathaniel/miniconda3/envs/spinningup/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/ramnathaniel/miniconda3/envs/spinningup/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/Users/ramnathaniel/miniconda3/envs/spinningup/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/ramnathaniel/miniconda3/envs/spinningup/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/Users/ramnathaniel/miniconda3/envs/spinningup/lib/python3.6/site-packages/torch/nn/functional.py", line 1372, in linear
    output = input.matmul(weight.t())
RuntimeError: size mismatch, m1: [33600 x 3], m2: [210 x 64] at ../aten/src/TH/generic/THTensorMath.cpp:197

================================================================================

There appears to have been an error in your experiment.

Check the traceback above to see what actually went wrong. The traceback below, included for completeness (but probably not useful for diagnosing the error), shows the stack leading up to the experiment launch.

================================================================================

Traceback (most recent call last):
  File "/Users/ramnathaniel/miniconda3/envs/spinningup/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/ramnathaniel/miniconda3/envs/spinningup/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/ramnathaniel/git/spinningup/spinningup/spinup/run.py", line 248, in <module>
    parse_and_execute_grid_search(cmd, args)
  File "/Users/ramnathaniel/git/spinningup/spinningup/spinup/run.py", line 180, in parse_and_execute_grid_search
    eg.run(algo, **run_kwargs)
  File "/Users/ramnathaniel/git/spinningup/spinningup/spinup/utils/run_utils.py", line 546, in run
    data_dir=data_dir, datestamp=datestamp, **var)
  File "/Users/ramnathaniel/git/spinningup/spinningup/spinup/utils/run_utils.py", line 171, in call_experiment
    subprocess.check_call(cmd, env=os.environ)
  File "/Users/ramnathaniel/miniconda3/envs/spinningup/lib/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/Users/ramnathaniel/miniconda3/envs/spinningup/bin/python', '/Users/ramnathaniel/git/spinningup/spinningup/spinup/utils/run_entrypoint.py', 
'eJytUz1v1DAYdi7pNVyhVFRqRdWN5SrUXBnYShlOokIndWiBDVm+2JeEOHZI7CsdkJBKe1fJYqlh4T+x8ysYWXmd65c6gUSiJK/fLz/Pk9efgq9vAtRcZjXmUtMyi3POohu2NYt4lHGOR1rEKpPCnttuisx9XJCc4TpnsxC4wXsPD3XGVSawOioZ+EzYl5S9cosTu2+7A9Tc/mC5/4J601berrZoiyKOzmBFvYfoDE3QxBu1qE+DzyH45raRizxDypv6x56HpsEIPF/AougA2e6eCZkYY0EKZgcovXO5SQs26Uy9d2iKjj1I3DuxJnCoAcqG2e29rllV9ypSCKJSIjLGe0mmenWZCZGJRJe3TfholXGo0QI3VlQeAcVtToohJTt28L2PbLpg/OSosKd2QwFl8zQmIB/7ULIqK5hQ0TaX4Kp3IpVqkeOS6/rad9XLtB2pkbDKpssmLMoMj2SVu7bpkulc19rBt/4iaoWeux94bT/0oTg/JFVSWzMvdIHjUlsz15TYCQBMl0+twzbwP9quWcC4JHFOEoaxNXdnTKOGH5DDjbAusnIzEl1p0OTAiDQ5/01VPQF8Dt18wuUQpIFFunYLgxNFSQnBdO3c1iakbEQ0V7XdMwHNYgVFZhGmudYVw2PCNavtW9t1fecSUhTE7j7/9fvHz5XJY/htSwegA3spxoQCh83xFsxLzRiFqTL3uEwSVuELXV2HjtSq1ArTrPpn4pQo0quzzXGZbD65MnANW4YwKrNhNuFlwGo98Mz6BXnCE+lEUrKK0wgS3GON797nlpl2IamGo5tumKBplK4bn8oYVFnFN842rvVwlupUMQtECKmIO+SOoAnfa8JnQB79xQxb08kPr3+AVvvRH8IToUQ=']' returned non-zero exit status 1.

Alberto-Hache commented 2 years ago

Hi @RamNathaniel, your experiment crashes because the environment you chose returns images as observations ("an array of shape (210, 160, 3)", according to Gym's documentation: https://gym.openai.com/envs/SpaceInvaders-v0/), and Spinning Up currently supports only flat vector inputs.
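To see where the numbers in the error come from, here is a small arithmetic sketch in plain Python (no PyTorch needed). It assumes the actor-critic sizes its first layer from the first axis of the observation shape, and it infers the hidden sizes (64, 64) and the 6 actions from the log rather than from the command line. The helper `mlp_param_count` is made up for illustration.

```python
def mlp_param_count(sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))

obs_shape = (210, 160, 3)  # image observation from SpaceInvaders-v0
obs_dim = obs_shape[0]     # assumption: input size is read from shape[0]
hidden = [64, 64]          # inferred from "m2: [210 x 64]" in the error
n_actions = 6              # inferred from the logged parameter counts

# A linear layer multiplies over the last axis, so the image is treated as a
# (210 * 160) x 3 matrix, while the transposed weight is obs_dim x 64.
m1 = (obs_shape[0] * obs_shape[1], obs_shape[2])
m2 = (obs_dim, hidden[0])

pi_params = mlp_param_count([obs_dim] + hidden + [n_actions])
v_params = mlp_param_count([obs_dim] + hidden + [1])

print(m1, m2, m1[1] == m2[0])  # (33600, 3) (210, 64) False
print(pi_params, v_params)     # 18054 17729
```

The inner dimensions (3 and 210) do not match, which is the reported size mismatch, and the two counts agree with the "Number of parameters" line in the log.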

Your alternatives are:

a) Use the alternative environment "SpaceInvaders-ram-v0", in which "the observation is the RAM of the Atari machine, consisting of (only!) 128 bytes". https://gym.openai.com/envs/SpaceInvaders-ram-v0/

b) Replace the default "MLPActorCritic" with a model that has convolutional (CNN) layers in front of the policy and value networks. Here are a couple of references from people who implemented this for PPO and SAC: https://github.com/mahyaret/spinningup https://github.com/ac-93/soft-actor-critic
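For option b), a rough sketch of a convolutional feature extractor in PyTorch might look like the following. It is not taken from either linked repository or from Spinning Up itself: the class name `CNNFeatureExtractor`, the layer sizes, and `feature_dim` are all illustrative assumptions. The idea is that its flat output could stand in for the raw observation as the input to MLP policy and value heads.

```python
import torch
import torch.nn as nn

class CNNFeatureExtractor(nn.Module):
    """Hypothetical helper: maps (H, W, C) image frames to flat feature vectors."""

    def __init__(self, obs_shape=(210, 160, 3), feature_dim=64):
        super().__init__()
        h, w, c = obs_shape
        self.conv = nn.Sequential(
            nn.Conv2d(c, 16, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        # Run a dummy frame through the conv stack to find the flattened size.
        with torch.no_grad():
            n_flat = self.conv(torch.zeros(1, c, h, w)).shape[1]
        self.fc = nn.Sequential(nn.Linear(n_flat, feature_dim), nn.ReLU())

    def forward(self, obs):
        # Accept a single (H, W, C) frame or a batch (N, H, W, C).
        single = obs.dim() == 3
        if single:
            obs = obs.unsqueeze(0)
        # Conv2d expects channels first; scale pixel values to [0, 1].
        x = obs.permute(0, 3, 1, 2).float() / 255.0
        feats = self.fc(self.conv(x))
        return feats.squeeze(0) if single else feats

extractor = CNNFeatureExtractor()
frame = torch.zeros(210, 160, 3)  # stand-in for one SpaceInvaders-v0 frame
features = extractor(frame)       # flat vector of length feature_dim
```

A real Atari setup would usually also downsample, grayscale, and stack frames before the network. This sketch leaves that out to keep the shape handling visible.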

Good luck!