Closed · hkuribayashi closed this issue 3 years ago
Hey. Please fill in the entire issue template, i.e. provide minimal code to reproduce the bug.
Sounds like the issue is related to a custom env and unexpected training results, which is more of a tech-support matter. For those I recommend checking the links at the beginning of the issue template.
I highly suspect your issue is the one mentioned in our tips and tricks: https://stable-baselines3.readthedocs.io/en/master/guide/rl_tips.html#tips-and-tricks-when-creating-a-custom-environment
Please fill in the custom env issue template next time and use the env checker...
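For reference, the pattern that tips-and-tricks page recommends is a symmetric, normalized action space, with the rescaling to the true range done inside the environment, and the env checker (`check_env` from `stable_baselines3.common.env_checker`) run on the result. A minimal sketch of that pattern, assuming the bounds [5, 35] from this issue; `NormalizedEnv` is a hypothetical stand-in using the old gym API (obs-only `reset`, 4-tuple `step`) that SB3 used at the time:

```python
import gym
import numpy as np
from gym import spaces

class NormalizedEnv(gym.Env):
    """Hypothetical env: the agent acts in [-1, 1], the env rescales to [5, 35]."""

    def __init__(self):
        super().__init__()
        # Symmetric, normalized action space as recommended in the SB3 tips
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(10,), dtype=np.float32)
        self.observation_space = spaces.Box(low=0.0, high=1.0, shape=(4,), dtype=np.float32)

    def reset(self):
        return self.observation_space.sample()

    def step(self, action):
        # Rescale the normalized action to the true range [5, 35]
        real_action = 5.0 + 0.5 * (action + 1.0) * (35.0 - 5.0)
        obs = self.observation_space.sample()
        reward, done, info = 0.0, False, {}
        return obs, reward, done, info
```

With this in place, `check_env(NormalizedEnv())` would flag any remaining Gym API violations before training starts.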
Sorry @Miffyli. Should I close this issue? I mean, it was not my intention to cause such trouble.
@araffin Thank you very much once more. You're a life saver. However, may I ask a complementary question? Given that tip, should I normalize the observation space even for discrete observation states (using Stable-Baselines3 DQN)? If yes, to something like [0, 1] or [-1, 1]?
> Given that tip, should I normalize the observation space even for discrete observation states (using Stable-Baselines3 DQN)? If yes, to something like [0, 1] or [-1, 1]?
I think you may be confusing the action and observation space. But yes, for observation spaces, as mentioned in the docs, it is always good practice to normalize them ([-1, 1] vs. [0, 1] should not really matter).
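As a concrete illustration of that advice, here is a minimal numpy sketch of scaling raw observations into [0, 1]; `normalize_obs` and the bounds are hypothetical, not part of the SB3 API:

```python
import numpy as np

def normalize_obs(obs, low, high):
    """Scale raw observations into [0, 1]; [-1, 1] would work equally well."""
    return (np.asarray(obs, dtype=np.float32) - low) / (high - low)

# Hypothetical raw observation bounds for a 2-dimensional observation
low = np.array([0.0, -5.0], dtype=np.float32)
high = np.array([10.0, 5.0], dtype=np.float32)
print(normalize_obs([5.0, 0.0], low, high))  # -> [0.5 0.5]
```

The same function applied in `step()`/`reset()` keeps the observations the policy sees in a fixed, bounded range.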
> Should I close this issue?
If your issue is solved, yes.
Important Note: We do not do technical support, nor consulting and don't answer personal questions per email. Please post your question on the RL Discord, Reddit or Stack Overflow in that case.
Question
Hi everyone,
When I use the following action space configuration and sample from it directly, I get action vectors with varied values. But when I create a custom gym environment with the same action space and train Stable-Baselines3 A2C or PPO on it, every sampled action seems stuck at the action space lower bound (5.0). I was expecting something like:
```
[21.08086  20.020802 16.812733 23.77745  10.687413 20.424904 15.4278145 26.068079 18.092493 22.096527]
[ 5.002933  8.210208 15.631343  5.3958955 29.201706 27.193197 21.82524  25.94392  33.925514 30.831163]
```
What am I doing wrong?
```python
import numpy as np
from gym import spaces

low_actions = []
high_actions = []
for _ in range(10):
    low_actions.append(5.0)
    high_actions.append(35.0)

action_space = spaces.Box(low=np.array(low_actions), high=np.array(high_actions))  # steer, gas, brake
for _ in range(1000):
    print(action_space.sample())
```
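Following the tips-and-tricks advice linked above, the likely fix is to let the agent act in a symmetric normalized range and rescale inside the environment's `step()`. A minimal numpy sketch, assuming the bounds 5.0/35.0 from the question; `rescale_action` is a hypothetical helper, not an SB3 function:

```python
import numpy as np

LOW, HIGH = 5.0, 35.0  # true action bounds from the question

def rescale_action(action):
    """Map a normalized action in [-1, 1] linearly onto [LOW, HIGH]."""
    action = np.clip(action, -1.0, 1.0)
    return LOW + 0.5 * (action + 1.0) * (HIGH - LOW)

print(rescale_action(np.array([-1.0, 0.0, 1.0])))  # -> [ 5. 20. 35.]
```

The environment would then declare `spaces.Box(low=-1, high=1, shape=(10,))` as its action space and call this helper on every incoming action.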
Additional context
Add any other context about the question here.
Checklist