Hi @behradkhadem,
good to see that you are interested in this project!
First, a comment on RL usage with these environments. We currently have no default reward function implemented, so you would need to implement this yourself; see https://github.com/maxspahn/gym_envs_urdf/blob/be7532ae35675c5a2fd8c0d1782e8dbfd684e446/urdfenvs/urdf_common/urdf_env.py#L278. Maybe some of the users working with RL can give you a hint on that, @alxschwrz.
How can I circumvent the dictionary definition of observation and action spaces?
In the gym.make call, you can specify to flatten the observation using the flatten_observation argument. For your case it would look like:
robots = [
GenericUrdfReacher(urdf="pointRobot.urdf", mode="vel"),
]
env = gym.make(
"urdf-env-v0",
dt=0.01, robots=robots, render=True, flatten_observation=True,
)
Then, the observation is flattened into an array.
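For illustration, a minimal sketch of what working with that flat array might look like (this assumes the classic 4-tuple gym step API of this urdfenvs version; the zero action is only a placeholder for the three velocity inputs of the point robot):
import numpy as np

ob = env.reset()
action = np.zeros(3)  # placeholder action for the velocity-controlled point robot
ob, reward, done, info = env.step(action)
print(ob.shape)  # flat numpy array instead of a nested dictionary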
Let me know if you need anything else and good luck.
Thanks for your kind response, @maxspahn.
I did a quick test and didn't dive deep into the problem, but it seems it didn't work the way it should, and I got the same error. I even tried the FlattenObservation wrapper, but that didn't work either (more info here).
And on the problem of the static reward: is there a way to pass our own reward function, or is there no way other than overriding the existing code?
I did a quick test and didn't dive deep into the problem, but it seems it didn't work the way it should, and I got the same error.
Could you provide me with the script you are trying to run? Then, I could have a look.
And on the problem of the static reward: is there a way to pass our own reward function, or is there no way other than overriding the existing code?
Currently this is not possible, so you would have to write your own environment. I recommend simply deriving from UrdfEnv and only overriding the step function.
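For illustration, a minimal sketch of what such a subclass might look like (the class name and the distance-based reward are made up for this example, and it assumes the nested, non-flattened observation and the 4-tuple step signature):
import numpy as np
from urdfenvs.urdf_common.urdf_env import UrdfEnv

class MyRewardEnv(UrdfEnv):
    # hypothetical subclass: keep the stepping logic, replace only the reward
    def step(self, action):
        ob, reward, done, info = super().step(action)
        goal = np.array([1.0, 1.0, 0.0])  # some fixed goal, for illustration only
        position = ob['robot_0']['joint_state']['position']
        reward = -float(np.linalg.norm(position - goal))
        return ob, reward, done, info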
Sure, it was in a notebook environment, but it was basically this:
import warnings
import gym
import numpy as np
from urdfenvs.urdf_common.urdf_env import UrdfEnv
from urdfenvs.robots.generic_urdf import GenericUrdfReacher
from stable_baselines3 import TD3
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize
robots = [
GenericUrdfReacher(urdf="pointRobot.urdf", mode="vel"),
]
env = gym.make(
"urdf-env-v0",
dt=0.01, robots=robots, render=True, flatten_observation=True
)
env.reset()
# keys = ['observation', 'desired_goal']
# env = FlattenObservation(FilterObservation(env, keys))
# Define the TD3 agent and train it on the environment
model = TD3("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100000)
I was just trying to run the code successfully (meaning without errors) and wasn't expecting any results from the RL algorithm. Thanks for your time, I really appreciate it.
Ok, so I found the bug. Actually, only the observation is flattened with this approach, not the observation_space (and neither the action_space).
I created a new issue for that, see #171.
Feel free to create a PR for that. I might have time end of next week myself.
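For context, a small illustrative sketch (not code from the package or from the PR) of why the space matters: the wrapper-side flattening only works on proper gym spaces, and SB3 checks observations against env.observation_space, so both have to be flattened consistently:
import numpy as np
from gym import spaces

# a nested space similar to what the env exposes per robot
nested_space = spaces.Dict({
    'position': spaces.Box(-5.0, 5.0, (3,), np.float64),
    'velocity': spaces.Box(-2.175, 2.175, (3,), np.float64),
})
flat_space = spaces.flatten_space(nested_space)  # a single Box with shape (6,)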
Hi everybody, sorry for my late reply. I currently don't have access to a computer, but I am happy to give you more information about how I use gym_envs_urdf for RL by the end of this week @behradkhadem. Do you have any specific questions at the moment? For the reward function, I overwrote the step functions as Max mentioned.
Hi @alxschwrz, The issue above is about me being unable to use the gym envs of this package for training an RL agent (using Stable Baselines 3). I get an error regarding the data type of the observation and action spaces and can't run even a simple script (like the one above). How did you tackle this issue? Can you share some sample code of how you've used it?
So, I have checked whether the flatten_observation argument still works. And thanks to your (@behradkhadem) hint to the FlattenObservation wrapper, I realized that flatten_observation is redundant with the wrapper.
If you don't use the flatten_observation argument and use the FlattenObservation wrapper instead, it works simply using
env = FlattenObservation(env)
Note that you have to do that after the reset.
Let me know if that helps you. I'll work on integrating the FullSensor so that the observation also contains information on the goal and obstacles.
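Put together, a minimal sketch of the order of operations described above (assuming a version in which the wrapper succeeds; as the next reply shows, this still failed on urdfenvs 0.6.0 before the fix):
import gym
from gym.wrappers import FlattenObservation
from urdfenvs.robots.generic_urdf import GenericUrdfReacher
from stable_baselines3 import TD3

robots = [GenericUrdfReacher(urdf="pointRobot.urdf", mode="vel")]
env = gym.make("urdf-env-v0", dt=0.01, robots=robots, render=True)
env.reset()                    # reset first ...
env = FlattenObservation(env)  # ... then wrap, after the reset
model = TD3("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=1000)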
Thanks, dear @maxspahn, but this didn't work for me. I used the FlattenObservation wrapper after the reset method, but I got this error. I thought this was due to the package version, so I tried pip install --upgrade urdfenvs, but nothing changed.
---------------------------------------------------------------------------
NotImplementedError Traceback (most recent call last)
/tmp/ipykernel_428/2987627242.py in <module>
7 )
8 env.reset()
----> 9 env = FlattenObservation(env=env)
10 # keys = ['observation', 'desired_goal']
11 # env = FlattenObservation(FilterObservation(env, keys))
~/anaconda3/envs/SB3/lib/python3.9/site-packages/gym/wrappers/flatten_observation.py in __init__(self, env)
8 def __init__(self, env):
9 super(FlattenObservation, self).__init__(env)
---> 10 self.observation_space = spaces.flatten_space(env.observation_space)
11
12 def observation(self, observation):
~/anaconda3/envs/SB3/lib/python3.9/functools.py in wrapper(*args, **kw)
886 '1 positional argument')
887
--> 888 return dispatch(args[0].__class__)(*args, **kw)
889
890 funcname = getattr(func, '__name__', 'singledispatch function')
~/anaconda3/envs/SB3/lib/python3.9/site-packages/gym/spaces/utils.py in flatten_space(space)
190 True
...
--> 192 raise NotImplementedError(f"Unknown space: `{space}`")
193
194
NotImplementedError: Unknown space: `{'robot_0': Dict(joint_state:Dict(position:Box([-5. -5. -5.], [5. 5. 5.], (3,), float64), velocity:Box([-2.175 -2.175 -2.175], [2.175 2.175 2.175], (3,), float64)))}`
And here is the code I ran:
import warnings
import gym
import numpy as np
from urdfenvs.urdf_common.urdf_env import UrdfEnv
from urdfenvs.robots.generic_urdf import GenericUrdfReacher
from gym.wrappers import FlattenObservation  # import needed for the wrapper below
robots = [
GenericUrdfReacher(urdf="pointRobot.urdf", mode="vel"),
]
env = gym.make(
"urdf-env-v0",
dt=0.01, robots=robots, render=True, flatten_observation=True # I tried both true and false.
)
env.reset()
env = FlattenObservation(env)
@behradkhadem I have created a PR to improve the situation for you.
Let me know if that helps your case by checking out the corresponding branch of the PR. I'll wait a bit for your response on that.
Which version of urdfenvs are you using, by the way?
Thanks a lot! I've never tested a Python package from a branch, to be honest, and I'm reading up on the subject; I'll respond if my tests are successful. And I'm using version urdfenvs==0.6.0.
Thanks a lot! I've never tested a Python package from a branch, to be honest, and I'm reading up on the subject
You could install from a specific branch using:
pip install git+ssh://git@github.com/maxspahn/gym_envs_urdf.git@fix-flatten-observation
Or, you can clone the repository and install it with pip install . from the root of the repository.
Since I got no response from @alxschwrz, I'm closing this issue.
Hello everyone!
I'm trying to use the environments of this package for RL robotics tasks (using the stable-baselines3 package in Python). I define my env as mentioned in the docs, but the environment's observation and action spaces are defined as a dictionary, so in order to access them I have to do something like this:
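Roughly along these lines (an illustrative sketch of the nested access, matching the space structure shown in the traceback above, not an exact snippet):
# illustrative only: nested access into the dictionary-valued spaces
position_space = env.observation_space['robot_0']['joint_state']['position']
action_space = env.action_space['robot_0']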
So, when I define my RL model, I get an error regarding the data type of the observation and action spaces.
How can I circumvent the dictionary definition of observation and action spaces?
I run my code on Windows 11 with WSL2 (Ubuntu), using Anaconda.