Zero division error when training my custom env

🐛 Bug

My custom env raises a ZeroDivisionError when it is trained with A2C.

Code example

import gym
import numpy as np

from stable_baselines3 import A2C
from stable_baselines3.common.env_checker import check_env

class CustomEnv(gym.Env):

  def __init__(self):
    super(CustomEnv, self).__init__()
    self.observation_space = gym.spaces.Box(low=-np.inf, high=np.inf, shape=(14,))
    self.action_space = gym.spaces.Box(low=-1, high=1, shape=(6,))

  def reset(self):
    return self.observation_space.sample()

  def step(self, action):
    obs = self.observation_space.sample()
    reward = 1.0
    done = False
    info = {}
    return obs, reward, done, info

env = CustomEnv()
check_env(env)

model = A2C("MlpPolicy", env, verbose=1).learn(1000)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero
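
The traceback above only shows the `<stdin>` frame, which makes it hard to see where the division happens. A minimal sketch of one way to capture the complete stack as text, using only the standard library; the helper name `run_and_capture` and the stand-in function `failing` are hypothetical and not part of the original report:

```python
import traceback


def run_and_capture(fn):
    """Call fn() and return the formatted traceback, or None if nothing was raised."""
    try:
        fn()
    except Exception:
        # format_exc() returns the full stack of the exception being handled
        return traceback.format_exc()
    return None


def failing():
    # Stand-in for the real training call; raises ZeroDivisionError
    return 1 / 0


tb_text = run_and_capture(failing)
print(tb_text)
```

In the reproduction, `failing` would presumably be replaced by a function that wraps the `A2C("MlpPolicy", env, verbose=1).learn(1000)` call, so that the printed text includes every frame leading to the error.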

System Info

OS: Linux-5.15.0-48-generic-x86_64-with-glibc2.35 #54-Ubuntu SMP Fri Aug 26 13:26:29 UTC 2022
Python: 3.10.6
Stable-Baselines3: 1.7.0a0
PyTorch: 1.12.1+cu102
GPU Enabled: False
Numpy: 1.23.3
Gym: 0.21.0

Checklist

[X] I have checked that there is no similar issue in the repo
[X] I have read the documentation
[X] I have provided a minimal working example to reproduce the bug
[X] I have checked my env using the env checker
[X] I've used the markdown code blocks for both code and stack traces.
