openai / gym

A toolkit for developing and comparing reinforcement learning algorithms.
https://www.gymlibrary.dev

[Bug Report] CartPole env.render() kills JupyterLab kernel #3031

Open
austinmw opened this issue 2 years ago

austinmw commented 2 years ago

Describe the bug

env.render() kills my JupyterLab kernel.

Code example

import random
import numpy as np
import gym
# TensorFlow / keras-rl2 imports (not used in the minimal repro loop below)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.optimizers import Adam
from rl.agents import DQNAgent
from rl.policy import BoltzmannQPolicy
from rl.memory import SequentialMemory

env = gym.make('CartPole-v1')
states = env.observation_space.shape[0]
actions = env.action_space.n

episodes = 10
for episode in range(1, episodes+1):
    state = env.reset()
    done = False
    score = 0

    while not done:
        env.render()  # this call is what kills the kernel
        action = random.choice([0, 1])
        state, reward, done, info = env.step(action)
        score += reward
    print(f'Episode: {episode}, score: {score}')

Results in:

Kernel Restarting The kernel for Deep Reinforcement Learning.ipynb appears to have died. It will restart automatically.

However, if I comment out env.render(), it runs successfully.

System Info

pseudo-rnd-thoughts commented 2 years ago

Thanks for the issue; I can replicate it on macOS with the master branch of gym. Even when closing the environment or adding render_mode="human", the crash still occurs:

import gym

env = gym.make('CartPole-v1')

state = env.reset()
done = False
score = 0

while not done:
    env.render()
    action = env.action_space.sample()
    state, reward, done, info = env.step(action)
    score += reward
print(score)
env.close()

Issue https://github.com/openai/gym/issues/2677 has the same problem.

RedTachyon commented 2 years ago

Is the issue still present if you do img = env.render("rgb_array")? My guess is that it's yet another issue with the human rendering, but I thought that pygame was nicer than pyglet for that

austinmw commented 2 years ago

@RedTachyon That does prevent the kernel crash (although, practically speaking, it isn't a usable alternative, since in a loop it updates far too slowly and brings my program to a crawl).

(e.g.)

import matplotlib.pyplot as plt
from IPython import display

while not done:
  # env.render()  # crashes the kernel, so render to an array instead
  img = env.render("rgb_array")
  plt.imshow(img)
  display.clear_output(wait=True)
  display.display(plt.gcf())
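(For reference, a self-contained version of this rgb_array workaround might look like the sketch below. It assumes the pre-0.26 four-tuple step API and that matplotlib and IPython are installed in the notebook environment; it is only an illustration, not code from the thread.)

import gym
import matplotlib.pyplot as plt
from IPython import display

env = gym.make('CartPole-v1')
state = env.reset()
done = False

while not done:
    # Draw the current frame as an image instead of opening a pygame window.
    img = env.render("rgb_array")
    plt.imshow(img)
    display.clear_output(wait=True)
    display.display(plt.gcf())
    state, reward, done, info = env.step(env.action_space.sample())

env.close()
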
RedTachyon commented 2 years ago

So I just tested a basic loop

env = gym.make("CartPole-v1", new_step_api=True)

env.reset()

for _ in range(100):
    env.step(env.action_space.sample())
    env.render()

and it works without any issues. My setup is Mac + gym 0.25.2 + jupyterlab 3.2.0.

I'm guessing the issue might be with the local setup or something? Or maybe jupyterlab messed something up in the meantime?

RedTachyon commented 2 years ago

@austinmw Do you maybe happen to be running the code on a headless machine?

austinmw commented 2 years ago

@RedTachyon I did try that first, using X11 forwarding to an EC2 instance, and thought that might be it, but then I tried directly on my Ubuntu desktop and experienced the same issue.

RedTachyon commented 2 years ago

Is there some more detailed error message you can provide? I know jupyter sometimes dumps some logs/stack trace in the terminal where you're running the jupyter instance.

I don't have access to a proper Ubuntu desktop at the moment, but on a headless Ubuntu 20 machine with gym 0.25.2 I get the "expected" error ("error: video system not initialized"), and no kernels have been harmed.

austinmw commented 2 years ago

@RedTachyon Ah, it looks like the combination of new_step_api=True and render_mode='human' fixed it for me. Working now, thanks for your help!
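(For anyone hitting the same crash, the working configuration described above would look roughly like the sketch below. It assumes gym 0.25.x, where gym.make accepts both render_mode and new_step_api, and where new_step_api=True makes step() return a five-tuple; it is an illustration, not code from the thread.)

import gym

# Sketch of the configuration that reportedly avoids the crash (gym 0.25.x assumed).
env = gym.make('CartPole-v1', render_mode='human', new_step_api=True)

state = env.reset()
done = False
score = 0

while not done:
    action = env.action_space.sample()
    # With new_step_api=True, step() returns (obs, reward, terminated, truncated, info);
    # with render_mode='human' set at make(), frames are drawn automatically each step.
    state, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
    score += reward

print(score)
env.close()
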

RedTachyon commented 2 years ago

@austinmw Glad that it works now, but I have no idea why it would make a difference. In any case, the API switching is on the way out for the next release, so unless we see this happening in the new version, we can probably ignore the issue.

ash-mac commented 2 years ago

Sorry if this isn't the right place, but a similar issue occurs in the FrozenLake8x8-v1 and Taxi-v3 environments even after applying what fixed @austinmw's error.

pseudo-rnd-thoughts commented 2 years ago

The fix is merged into the master branch but is not in a release yet; it will be in v26.

RedTachyon commented 2 years ago

@pseudo-rnd-thoughts Which fix do you mean? I don't recall this having been figured out; there's just the hope that if the issue is indeed due to the API switching (which is odd), it will go away by itself.

pseudo-rnd-thoughts commented 2 years ago

Ohh, sorry I must be getting this bug confused with https://github.com/openai/gym/pull/3037

jkterry1 commented 1 year ago

Hey, we just launched Gymnasium, a fork of Gym by the team that has maintained Gym for the past 18 months, where all future maintenance and improvements will happen. Could you please move this over to the new repo?

If you'd like to read more about the backstory behind this and our plans going forward, click here.

bjoaquin commented 1 year ago

I'm new to this subject and I'm probably much too late anyway, but is it possible that the problem with @austinmw's code is that it lacks an env.close() after the last line?
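(For illustration only, a minimal sketch of that suggestion, assuming the pre-0.26 four-tuple step API: wrapping the rollout in try/finally so env.close() always runs.)

import gym

env = gym.make('CartPole-v1')
try:
    state = env.reset()
    done = False
    while not done:
        env.render()
        state, reward, done, info = env.step(env.action_space.sample())
finally:
    env.close()  # always release the render window, even if the loop raises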