DLR-RM / stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
https://stable-baselines3.readthedocs.io
MIT License

[Question] DQN: ep_len_mean and ep_rew_mean outputs are identical #1918

Closed AnnyOrange closed 4 months ago

AnnyOrange commented 5 months ago

❓ Question

I found that when running DQN, the ep_len_mean and ep_rew_mean outputs are the same. Why does this happen? How can I fix it?

Running the example code:

import gymnasium as gym
from stable_baselines3 import DQN

# Train a DQN agent on CartPole and save it
env = gym.make("CartPole-v1", render_mode="human")
model = DQN("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10000, log_interval=4)
model.save("dqn_cartpole")

del model  # remove to demonstrate saving and loading
model = DQN.load("dqn_cartpole")

# Run the trained policy, resetting whenever an episode ends
obs, info = env.reset()
while True:
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()

As I understand it, ep_len_mean is the mean episode length and ep_rew_mean is the mean episode reward, so the two should not be the same. However, in the terminal output:

[screenshot: terminal training log]

and in the TensorBoard output:

[screenshot: TensorBoard rollout curves]

the two values are identical.
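
For context, a minimal sketch (not part of the original report) of where these numbers come from: SB3 wraps the environment in a Monitor, which records each finished episode's return "r" and length "l"; the logged values are the means over the algorithm's ep_info_buffer. Assuming a trained model as above:

import numpy as np
# ep_rew_mean / ep_len_mean are the means over the buffered episode infos
ep_rew_mean = np.mean([ep_info["r"] for ep_info in model.ep_info_buffer])
ep_len_mean = np.mean([ep_info["l"] for ep_info in model.ep_info_buffer])
print(ep_rew_mean, ep_len_mean)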


qgallouedec commented 5 months ago

CartPole is a special case: the agent gets a +1.0 reward for every timestep it survives, so an episode's total reward always equals its length. That's why the two curves coincide here; in general they don't.
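
A quick way to verify this (a minimal sketch, not from the maintainer's reply): roll out one random episode and confirm the per-step reward is always +1.0, which forces the return to equal the length.

import gymnasium as gym

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)
ep_len, ep_rew = 0, 0.0
while True:
    # Every CartPole step yields reward 1.0, so return and length move in lockstep
    obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    ep_len += 1
    ep_rew += reward
    if terminated or truncated:
        break
print(ep_len, ep_rew)  # identical values, e.g. 17 and 17.0
env.close()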