Farama-Foundation / Gymnasium

An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
https://gymnasium.farama.org
MIT License

[Bug Report] Still hitting "too many values to unpack (expected 2)" for env.reset() with several recent versions (0.26.x) #453

Closed guillemrbaiges closed 1 year ago

guillemrbaiges commented 1 year ago

Describe the bug

I am hitting this error in all my attempts, after upgrading from gym 0.21 (I had a hard dependency on stable-baselines3) to any of the following versions (installing through Conda): gym 0.26.1, gymnasium 0.26.3, and gymnasium 0.27.1.

Code example

Not sure what code to share here, and I feel code won't be very useful. Let me know if there's any code that could help you debug this. Attaching the error stack trace:

$ make run-train
python src/agent.py --data-file INC_253_1.png --num-rooms 4 \
    --reward-fn new_heuristic --test false --timesteps 100000  --step 10 --steps-per-episode 256
Traceback (most recent call last):
  File "/Users/guillem.rossello/Desktop/UNI/Q2/IntroToResearch/RL-FloorPlan/V11/src/agent.py", line 52, in <module>
    run()
  File "/Users/guillem.rossello/opt/anaconda3/envs/intro2research/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/Users/guillem.rossello/opt/anaconda3/envs/intro2research/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/Users/guillem.rossello/opt/anaconda3/envs/intro2research/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/guillem.rossello/opt/anaconda3/envs/intro2research/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/Users/guillem.rossello/Desktop/UNI/Q2/IntroToResearch/RL-FloorPlan/V11/src/agent.py", line 48, in run
    new_training(config=config, timesteps=timesteps, fname_model=fname_model)
  File "/Users/guillem.rossello/Desktop/UNI/Q2/IntroToResearch/RL-FloorPlan/V11/src/training.py", line 16, in training
    trainer = TrainerRL(config)
  File "/Users/guillem.rossello/Desktop/UNI/Q2/IntroToResearch/RL-FloorPlan/V11/src/trainer_rl.py", line 50, in __init__
    state = self.env.reset()
  File "/Users/guillem.rossello/opt/anaconda3/envs/intro2research/lib/python3.9/site-packages/gymnasium/core.py", line 457, in reset
    obs, info = self.env.reset(**kwargs)
ValueError: too many values to unpack (expected 2)
make: *** [run-train] Error 1

System info

Gymnasium is installed with Conda, and the project is executed within this Conda environment. I'm on macOS Monterey, using Python 3.9.13. The logs below are from my last attempt, but as mentioned earlier I've hit this same error with gym 0.26.1, gymnasium 0.26.3, and gymnasium 0.27.1.

$ conda list gym
# packages in environment at /Users/guillem.rossello/opt/anaconda3/envs/intro2research:
#
# Name                    Version                   Build  Channel
gymnasium                 0.26.3           py39hecd8cb5_0
gymnasium-notices         0.0.1            py39hecd8cb5_0

$ pip show gym
WARNING: Package(s) not found: gym

Additional context

Is this a Conda-related issue? I was using Conda because the project I joined was originally using stable-baselines3, but I really want to migrate to Poetry because honestly I see no advantages to using Conda (and a lot of engineering pain). Is Poetry a good fit for Gymnasium?

pseudo-rnd-thoughts commented 1 year ago

It looks like the relevant call is

  File "/Users/guillem.rossello/Desktop/UNI/Q2/IntroToResearch/RL-FloorPlan/V11/src/trainer_rl.py", line 50, in __init__
    state = self.env.reset()

where I suspect the problem is the return type: state receives the whole return value of reset() instead of it being unpacked as state, info.
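One way the first traceback could arise can be sketched without Gymnasium itself. This is only an illustration under an assumption: the wrapped environment still follows the pre-0.26 API and returns a bare observation from reset(). The classes OldStyleEnv and NewStyleWrapper are hypothetical stand-ins, not part of gym or Gymnasium; the wrapper simply copies the two-target unpacking shown at gymnasium/core.py line 457 in the traceback.

```python
class OldStyleEnv:
    """Hypothetical stand-in for an environment written against gym 0.21."""

    def reset(self):
        # Old API: return only the observation (here, four floats).
        return [0.0, 0.0, 0.0, 0.0]


class NewStyleWrapper:
    """Hypothetical stand-in for a wrapper that expects reset() -> (obs, info)."""

    def __init__(self, env):
        self.env = env

    def reset(self, **kwargs):
        # Same shape as the failing line in the traceback: the four-element
        # observation cannot be unpacked into two names.
        obs, info = self.env.reset(**kwargs)
        return obs, info


def try_reset(env):
    """Return the ValueError message raised by reset(), or None on success."""
    try:
        env.reset()
    except ValueError as exc:
        return str(exc)
    return None


message = try_reset(NewStyleWrapper(OldStyleEnv()))
print(message)
```

If this is what is happening, changing the call site between state = self.env.reset() and state, info = self.env.reset() would make no difference, because the exception is raised inside the wrapper before anything is returned to the caller.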

As for Poetry, this has been investigated here https://github.com/Farama-Foundation/Gymnasium/pull/86

guillemrbaiges commented 1 year ago

Hi, thanks for the quick reply! I'm not sure I understand what you mean. I tried both state = self.env.reset() and state, info = self.env.reset(), although I suspect that is not what you meant.

guillemrbaiges commented 1 year ago

I migrated the project to Poetry. This is the project's dependency tree:

click 8.1.3 Composable command line interface toolkit
└── colorama *
glob2 0.7 Version of the glob module that can capture patterns and supports recursive wildcards
gymnasium 0.28.1 A standard API for reinforcement learning and a diverse set of reference environments (formerly Gym).
├── cloudpickle >=1.2.0
├── farama-notifications >=0.0.1
├── importlib-metadata >=4.8.0
│   └── zipp >=0.5 
├── jax-jumpy >=1.0.0
│   └── numpy >=1.18.0 
├── numpy >=1.21.0
└── typing-extensions >=4.3.0
matplotlib 3.7.1 Python plotting package
├── contourpy >=1.0.1
│   └── numpy >=1.16 
├── cycler >=0.10
├── fonttools >=4.22.0
├── importlib-resources >=3.2.0
│   └── zipp >=3.1.0 
├── kiwisolver >=1.0.1
├── numpy >=1.20
├── packaging >=20.0
├── pillow >=6.2.0
├── pyparsing >=2.3.1
└── python-dateutil >=2.7
    └── six >=1.5 
pygame 2.3.0 Python Game Development
torch 2.0.0 Tensors and Dynamic neural networks in Python with strong GPU acceleration
├── filelock *
├── jinja2 *
│   └── markupsafe >=2.0 
├── networkx *
├── nvidia-cublas-cu11 11.10.3.66
│   ├── setuptools * 
│   └── wheel * 
├── nvidia-cuda-cupti-cu11 11.7.101
│   ├── setuptools * 
│   └── wheel * 
├── nvidia-cuda-nvrtc-cu11 11.7.99
│   ├── setuptools * 
│   └── wheel * 
├── nvidia-cuda-runtime-cu11 11.7.99
│   ├── setuptools * 
│   └── wheel * 
├── nvidia-cudnn-cu11 8.5.0.96
│   ├── setuptools * 
│   └── wheel * 
├── nvidia-cufft-cu11 10.9.0.58
├── nvidia-curand-cu11 10.2.10.91
│   ├── setuptools * 
│   └── wheel * 
├── nvidia-cusolver-cu11 11.4.0.1
│   ├── setuptools * 
│   └── wheel * 
├── nvidia-cusparse-cu11 11.7.4.91
│   ├── setuptools * 
│   └── wheel * 
├── nvidia-nccl-cu11 2.14.3
├── nvidia-nvtx-cu11 11.7.91
│   ├── setuptools * 
│   └── wheel * 
├── sympy *
│   └── mpmath >=0.19 
├── triton 2.0.0
│   ├── cmake * 
│   ├── filelock * 
│   ├── lit * 
│   └── torch * (circular dependency aborted here)
└── typing-extensions *

And I'm now hitting the following seed parameter error (which, if I'm not mistaken, was addressed in earlier versions, and not even gymnasium versions but gym versions!).

Traceback (most recent call last):
  File "/Users/guillem.rossello/Desktop/UNI/Q2/IntroToResearch/RL-FloorPlan/V11/src/agent.py", line 62, in <module>
    run()
  File "/Users/guillem.rossello/Library/Caches/pypoetry/virtualenvs/rl-floorplan-VxTjqtjI-py3.9/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Users/guillem.rossello/Library/Caches/pypoetry/virtualenvs/rl-floorplan-VxTjqtjI-py3.9/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/guillem.rossello/Library/Caches/pypoetry/virtualenvs/rl-floorplan-VxTjqtjI-py3.9/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/guillem.rossello/Library/Caches/pypoetry/virtualenvs/rl-floorplan-VxTjqtjI-py3.9/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/Users/guillem.rossello/Desktop/UNI/Q2/IntroToResearch/RL-FloorPlan/V11/src/agent.py", line 58, in run
    new_training(config=config, timesteps=timesteps, fname_model=fname_model)
  File "/Users/guillem.rossello/Desktop/UNI/Q2/IntroToResearch/RL-FloorPlan/V11/src/training.py", line 16, in training
    trainer = TrainerRL(config)
  File "/Users/guillem.rossello/Desktop/UNI/Q2/IntroToResearch/RL-FloorPlan/V11/src/trainer_rl.py", line 50, in __init__
    state = self.env.reset()
  File "/Users/guillem.rossello/Library/Caches/pypoetry/virtualenvs/rl-floorplan-VxTjqtjI-py3.9/lib/python3.9/site-packages/gymnasium/core.py", line 462, in reset
    obs, info = self.env.reset(seed=seed, options=options)
TypeError: reset() got an unexpected keyword argument 'seed'

How can this even happen? I'm printing the gymnasium version right before the reset that triggers the error (the same line 50 in trainer_rl.py of the original error in this issue) and it is 0.28.1.
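A possible explanation, sketched below under an assumption: the installed gymnasium version is not the deciding factor if the project's own environment class still defines reset(self) with no keyword arguments. The names LegacyEnv, UpdatedEnv, and ForwardingWrapper are hypothetical; the wrapper only imitates the forwarding call shown at gymnasium/core.py line 462 in the traceback.

```python
class LegacyEnv:
    """Hypothetical environment with a gym 0.21 style reset()."""

    def reset(self):
        # No seed/options parameters, and only an observation is returned.
        return [0.0, 0.0]


class UpdatedEnv:
    """The same hypothetical environment, migrated to the newer signature."""

    def reset(self, *, seed=None, options=None):
        # Accept the keyword arguments and return (observation, info).
        return [0.0, 0.0], {}


class ForwardingWrapper:
    """Hypothetical stand-in for a wrapper that forwards seed and options."""

    def __init__(self, env):
        self.env = env

    def reset(self, *, seed=None, options=None):
        # Same shape as the failing line in the second traceback.
        obs, info = self.env.reset(seed=seed, options=options)
        return obs, info


def reset_outcome(env):
    """Return reset()'s result, or a description of the TypeError it raised."""
    try:
        return ForwardingWrapper(env).reset()
    except TypeError as exc:
        return f"TypeError: {exc}"


legacy_result = reset_outcome(LegacyEnv())
updated_result = reset_outcome(UpdatedEnv())
print(legacy_result)
print(updated_result)
```

In this sketch the wrapper is identical in both runs; only the inner environment's reset() signature decides whether the call succeeds.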

guillemrbaiges commented 1 year ago

Actually, I made an error when I reported this "bug": with the original Conda setup and Gymnasium 0.27.1, I hit this same seed error, not the one I shared at the beginning. Just sharing more data to see if we can find the cause of these issues. Thanks in advance for the support.

pseudo-rnd-thoughts commented 1 year ago

I suspect that the environment was originally written for gym v0.21. I would use the gym compatibility wrapper in gymnasium: https://gymnasium.farama.org/content/gym_compatibility/. Otherwise, I would read the migration guide: https://gymnasium.farama.org/content/migration-guide/
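To illustrate what such a compatibility layer has to do, here is a minimal hand-written sketch. It is not Gymnasium's own wrapper: LegacyEnv and LegacyToNewAdapter are hypothetical names, and the handling of the "TimeLimit.truncated" info key is an assumption about how an old environment might signal a time limit. The adapter converts the old reset() -> obs and step() -> (obs, reward, done, info) into the newer reset() -> (obs, info) and step() -> (obs, reward, terminated, truncated, info).

```python
class LegacyEnv:
    """Hypothetical gym 0.21 style environment that ends after max_steps."""

    def __init__(self, max_steps=3):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0]

    def step(self, action):
        self.t += 1
        done = self.t >= self.max_steps
        return [float(self.t)], 1.0, done, {}


class LegacyToNewAdapter:
    """Hypothetical adapter exposing the newer reset/step return shapes."""

    def __init__(self, env):
        self.env = env

    def reset(self, *, seed=None, options=None):
        # seed and options are accepted for signature compatibility but are
        # ignored in this sketch; the old reset() takes no arguments.
        obs = self.env.reset()
        return obs, {}

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # Assumption: a time-limit cutoff is flagged via this info key.
        truncated = bool(info.get("TimeLimit.truncated", False))
        terminated = done and not truncated
        return obs, reward, terminated, truncated, info


env = LegacyToNewAdapter(LegacyEnv(max_steps=3))
obs, info = env.reset(seed=0)
steps = 0
terminated = truncated = False
while not (terminated or truncated):
    obs, reward, terminated, truncated, info = env.step(0)
    steps += 1
print(steps, obs, terminated, truncated)
```

For a real project, the documented compatibility wrapper or a proper migration of the environment class is the better route; the sketch only shows which return shapes need to change.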