datawhalechina / easy-rl

Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄). Read online: https://datawhalechina.github.io/easy-rl/

Section 1.7.1 Gym example: env.step() returns more values than expected #126

Closed neverevergiveup closed 1 year ago

neverevergiveup commented 1 year ago

Original code:

import gym  
env = gym.make('CartPole-v0')  
env.reset()  
for _ in range(1000):
    env.render()  
    action = env.action_space.sample() 
    observation, reward, done, info = env.step(action)
    print(observation)
env.close()    

Error:

ValueError                                Traceback (most recent call last)
Cell In [7], line 8
      6     action = env.action_space.sample()  # sample an action from the action space
      7     print(action)
----> 8     observation, reward, done, info = env.step(action)  # submit the action; the argument is the specific action
      9     print(observation)
     10 env.close

ValueError: too many values to unpack (expected 4)

env.step() actually returns 5 values:

print(env.step(action))
(array([-0.02054064, -0.15305778, 0.04347553, 0.29886863], dtype=float32), 1.0, False, False, {})

qiwang067 commented 1 year ago

This is a Gym version issue. It was explained earlier in the text; see the screenshot below for details: image
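For readers who hit the same error, here is a minimal sketch of version-tolerant unpacking. It assumes the 5-element tuple printed above follows the newer Gym layout (observation, reward, terminated, truncated, info), while older versions return (observation, reward, done, info). `unpack_step` is a hypothetical helper written for illustration, not part of Gym or the book's code, and the tuples below are stand-ins so the snippet runs without Gym installed.

```python
def unpack_step(result):
    """Normalize an env.step() result to (observation, reward, done, info).

    Hypothetical helper for illustration; not part of Gym.
    """
    if len(result) == 5:
        # Newer layout (assumed): observation, reward, terminated, truncated, info
        observation, reward, terminated, truncated, info = result
        # Treat the episode as finished if it either terminated or was truncated
        return observation, reward, bool(terminated or truncated), info
    # Older layout: observation, reward, done, info
    observation, reward, done, info = result
    return observation, reward, bool(done), info


# Stand-in tuples shaped like the two return formats (no Gym needed to run this)
new_style = ([-0.02, -0.15, 0.04, 0.30], 1.0, False, False, {})
old_style = ([-0.02, -0.15, 0.04, 0.30], 1.0, False, {})

print(unpack_step(new_style))
print(unpack_step(old_style))
```

In the loop from the original post, this would be used as `observation, reward, done, info = unpack_step(env.step(action))`. Alternatively, installing the Gym version the book states avoids the mismatch altogether.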

neverevergiveup commented 1 year ago

image

Thanks! The v1.0.4 PDF I was reading doesn't mention this issue.

qiwang067 commented 1 year ago

Got it. I'll fix this in the next PDF update.