utiasDSL / gym-pybullet-drones

PyBullet Gymnasium environments for single and multi-agent reinforcement learning of quadcopter control
https://utiasDSL.github.io/gym-pybullet-drones/
MIT License

Regarding the change from the gym environment to the updated Gymnasium environment in Ray 3.0 #133

Open paehal opened 1 year ago

paehal commented 1 year ago

Hello everyone,

I would like to ask about the change from the gym module to Gymnasium in the gym-pybullet-drones reinforcement learning environments. The OpenAI gym package has been discontinued and superseded by Gymnasium. The nightly version of the Ray library, which gym-pybullet-drones uses for reinforcement learning, has also made the switch from gym to Gymnasium.

This time I wanted to use the nightly ver. 3.0 of the Ray library, so I installed it and tried to run reinforcement learning, but an error occurred: TypeError: reset() got an unexpected keyword argument 'seed'. In Ray 3.0 the definition of the reset() function (among others) is different, so it seems that gym-pybullet-drones cannot be used as-is.

I tried making changes to the reset() function defined in BaseAviary.py, but the same error occurred.

Does this Ray update require extensive changes?

By the way, Gymnasium seems to include compatibility wrappers for old gym environments, like the ones below, but I do not know where to apply them to gym-pybullet-drones. https://gymnasium.farama.org/content/gym_compatibility/#step-api-compatibility
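
For concreteness, here is an untested sketch of what applying such a wrapper might look like. EnvCompatibility, the import paths, and the wrapper's handling of seed are my assumptions, not verified against this repo:

```python
# Untested sketch: adapting an old-gym-API aviary to the Gymnasium API.
# Assumes gymnasium 0.26's EnvCompatibility wrapper and that TakeoffAviary
# still implements the old API (reset() -> obs; step() -> obs, reward, done, info).
from gymnasium.wrappers import EnvCompatibility

from gym_pybullet_drones.envs.single_agent_rl.TakeoffAviary import TakeoffAviary

env = EnvCompatibility(TakeoffAviary())

obs, info = env.reset(seed=42)  # new API: (obs, info) tuple, accepts a seed
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
```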

If anyone knows a solution, please let me know.

JacopoPan commented 1 year ago

Hello @paehal,

All the environments in this package extend gym.Env; only those that extend the gym-pybullet-drones class BaseMultiagentAviary also extend MultiAgentEnv from ray.rllib.env.multi_agent_env.

If you can show me a minimal non-working example that reproduces the error you are dealing with, I'm happy to check whether the inheritance is warranted or merely cosmetic in that case, and to suggest changes (or update this repo). 🙏
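
For reference, a minimal non-working example would presumably look something like the following (an untested sketch; the environment class and import path are assumptions based on this repo's layout):

```python
# Hypothetical minimal repro of the reset(seed=...) error under a
# Gymnasium-based Ray (>= 2.3); untested.
from gym_pybullet_drones.envs.multi_agent_rl.LeaderFollowerAviary import LeaderFollowerAviary

env = LeaderFollowerAviary(num_drones=2)
env.reset(seed=0)  # TypeError: reset() got an unexpected keyword argument 'seed'
```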

sanoufar commented 1 year ago

@JacopoPan

How do I run multiagent.py in the terminal? I tried:

$ python multiagent.py --num_drones --env --obs --act --algo --num_workers

and I am getting this error:

pybullet build time: May 20 2022 19:44:17
[INFO] BaseAviary.__init__() loaded parameters from the drone's .urdf:
[INFO] m 0.027000, L 0.039700,
[INFO] ixx 0.000014, iyy 0.000014, izz 0.000022,
[INFO] kf 0.000000, km 0.000000,
[INFO] t2w 2.250000, max_speed_kmh 30.000000,
[INFO] gnd_eff_coeff 11.368590, prop_radius 0.023135,
[INFO] drag_xy_coeff 0.000001, drag_z_coeff 0.000001,
[INFO] dw_coeff_1 2267.180000, dw_coeff_2 0.160000, dw_coeff_3 -0.110000
/home/blessy/miniconda3/envs/drones/lib/python3.8/site-packages/gym/spaces/box.py:73: UserWarning: WARN: Box bound precision lowered by casting to float32
  logger.warn(
2023-02-24 03:42:20,078 INFO ppo.py:166 -- In multi-agent mode, policies will be optimized sequentially by the multi-GPU optimizer. Consider setting simple_optimizer=True if this doesn't work for you.
2023-02-24 03:42:20,078 INFO trainer.py:743 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
[INFO] BaseAviary.__init__() loaded parameters from the drone's .urdf:
[the same URDF parameter block is printed a second time]
2023-02-24 03:42:20,164 WARNING util.py:57 -- Install gputil for GPU system monitoring.

Traceback (most recent call last):
  File "test_multiagent.py", line 254, in <module>
    with open(ARGS.exp+'/checkpoint.txt', 'r+') as f:
FileNotFoundError: [Errno 2] No such file or directory: './results/save-leaderfollower-2-cc-kin-one_d_rpm-02.24.2023_02.08.42/checkpoint.txt'

JacopoPan commented 1 year ago

Hi @sanoufar

are you having a problem with multiagent.py or test_multiagent.py? It's not entirely clear from the message above. The latter, of course, requires having first run the former and saved its data.

sanoufar commented 1 year ago

> Hi @sanoufar
>
> are you having a problem with multiagent.py or test_multiagent.py? It's not entirely clear from the message above. The latter, of course, requires having first run the former and saved its data.

Hello @JacopoPan, I highly appreciate your prompt response above. I am facing problems with both multiagent.py and test_multiagent.py. Kindly guide me on how to run the files in the experiments folder, including multiagent.py and singleagent.py.

JacopoPan commented 1 year ago

@sanoufar

The prerequisite to running test_multiagent.py is to have run multiagent.py at least to the point where it saved training data in the folder gym-pybullet-drones/experiments/learning/results/save-<env>-<num_drones>-<algo>-<obs>-<act>-<date>.

You can do this by running:

$ cd gym-pybullet-drones/experiments/learning/
$ python3 multiagent.py

Then:

$ python3 test_multiagent.py --exp ./results/save-<env>-<num_drones>-<algo>-<obs>-<act>-<date>

(Of course, you'll need to replace save-<env>-<num_drones>-<algo>-<obs>-<act>-<date> with the actual folder name, but please check its contents as well: if it is empty, multiagent.py did not run successfully or long enough, and test_multiagent.py won't work.)

paehal commented 1 year ago

@JacopoPan

Sorry for the delay in replying.

Ver. 2.3 of Ray was released a few days ago and Gymnasium has been adopted there, so I think you will get an error if you install Ray 2.3 and run the following tutorial code.

$ cd gym-pybullet-drones/experiments/learning/
$ python3 multiagent.py --num_drones --env --obs --act --algo --num_workers

paehal commented 1 year ago

I closed this by mistake; reopening it.

JacopoPan commented 1 year ago

Hi @paehal

the Ray version installed alongside this package should be 1.9. Is there a reason you want to install 2.3? If you want to use the latest Ray as well as this package, I would simply do it in two separate conda environments.
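
For example (an untested sketch; the environment names and Python version are placeholders):

```bash
# Two isolated conda environments, one per Ray version.
conda create -n drones-ray19 python=3.8   # for this repo's pinned ray==1.9
conda activate drones-ray19
pip install -e gym-pybullet-drones/

conda create -n ray23 python=3.8          # for experimenting with Ray 2.3
conda activate ray23
pip install "ray[rllib]==2.3" "gymnasium==0.26.3" pybullet
```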

Then, if you can show me what the error is, and if it is possible/useful to resolve it, I'm happy to do so.

paehal commented 1 year ago

@JacopoPan

Thank you for your response! I am trying to control a drone using reinforcement learning, and when attempting to use PPO for learning, the training stops due to an error on the Ray side.

The error message is "Missing 'grad_gnorm' key in some input_trees after some training time."

As discussed in this topic, there is a bug on the Ray side that has been fixed in version 2.3. https://discuss.ray.io/t/missing-grad-gnorm-key-in-some-input-trees-after-some-training-time/8553

This is why I want to install version 2.3.

JacopoPan commented 1 year ago

I see @paehal

but then (if you are interested in single-quadrotor control and PPO), wouldn't it be better to work from the single-agent example using stable-baselines3 (or to adapt that script to Ray, as in gym_pybullet_drones/examples/learn.py)?

JacopoPan commented 1 year ago

Also, if you fork the repo and make a PR modifying singleagent.py, we can collaboratively work on that file here on GitHub.

paehal commented 1 year ago

@JacopoPan

Sorry, I forgot to mention something important: I am trying to do multi-agent reinforcement learning and would like to use Ray for that.

I have heard that the transition from Ray 2.2 to 2.3 is not that difficult, but I am having trouble because, due to my lack of knowledge, I am not sure how to make the fix. https://gymnasium.farama.org/api/wrappers/misc_wrappers/#gymnasium.wrappers.StepAPICompatibility

JacopoPan commented 1 year ago

Can you then report the packages in your conda environment (conda list) and the full terminal output of multiagent.py on your system?

paehal commented 1 year ago

@JacopoPan

Are you referring to the error details when using Ray ver. 2.3? The error message is as follows; I think this error is a result of changing the Ray version (i.e., the change from gym to Gymnasium).

  File "ray/rllib/env/multi_agent_env.py", line 911, in reset
    self.last_obs, self.last_infos = obs_and_infos
ValueError: too many values to unpack (expected 2)

I don't use conda, so here are the relevant modules and their versions:

ray[rllib]==2.3
gymnasium==0.26.3
pybullet==3.2.5

JacopoPan commented 1 year ago

Ok,

but that tells me that there is a size mismatch for obs_and_infos in Ray's code (line 911 of multi_agent_env.py). The reason is most likely that the original gym interface (and thus BaseAviary in this repo) only returns an obs on reset(), while multi_agent_env.py expects both an obs and an info dictionary.

Do you know what the new Gymnasium includes in the info dictionary returned on reset()? Otherwise, you might just work around the problem by adding an empty one.
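
Something like the following, perhaps (an untested sketch; the subclass name is hypothetical and the import path may differ across repo versions):

```python
# Untested sketch: return the (obs, infos) pair that a Gymnasium-based
# Ray (>= 2.3) expects from reset(), with an empty info dict per agent.
from gym_pybullet_drones.envs.multi_agent_rl.LeaderFollowerAviary import LeaderFollowerAviary

class CompatLeaderFollowerAviary(LeaderFollowerAviary):  # hypothetical name
    def reset(self, seed=None, options=None):
        obs = super().reset()                       # old gym API: obs only
        infos = {agent_id: {} for agent_id in obs}  # one empty info per agent
        return obs, infos                           # new API: (obs, infos)
```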

paehal commented 1 year ago

@JacopoPan

I do not know exactly how the return values of specific functions change with the move from gym to Gymnasium. As previously mentioned, discussions on this topic have been taking place in the following thread, covering, for example, the need for a seed argument in reset() and the addition of new return values.

https://discuss.ray.io/t/missing-grad-gnorm-key-in-some-input-trees-after-some-training-time/8553/16

Additionally, as previously mentioned, there is a wrapper available that converts an old environment to the new API. Would this be helpful to you?

https://gymnasium.farama.org/content/gym_compatibility/#step-api-compatibility

> Step API Compatibility
> If environments implement the (old) done step API, Gymnasium provides both functions (gymnasium.utils.step_api_compatibility.convert_to_terminated_truncated_step_api()) and wrappers (gymnasium.wrappers.StepAPICompatibility) that will convert an environment with the old step API (using done) to the new step API (using termination and truncation).
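
Applied to one of the environments in this repo, I imagine the wrapper would be used roughly like this (untested; the TakeoffAviary import is my assumption, and note the wrapper only converts step(), not reset()):

```python
# Untested sketch: convert the old done-based step() return into the new
# (terminated, truncated) form with Gymnasium's StepAPICompatibility wrapper.
from gymnasium.wrappers import StepAPICompatibility

from gym_pybullet_drones.envs.single_agent_rl.TakeoffAviary import TakeoffAviary

env = StepAPICompatibility(TakeoffAviary(), output_truncation_bool=True)
obs = env.reset()  # reset() is not converted: still the old single-value return
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
```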

JacopoPan commented 1 year ago
[image: summary of the gym v21 vs v26 API changes]

The issue is summarized in this API change: this repo implements v21, not v26. I will create a new branch and PR for this change, but unfortunately I can't dedicate much time to it in the short term.
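
Concretely, the difference between the two APIs, shown on the standard CartPole environment for illustration:

```python
# v21-style API (old gym), which this repo currently implements:
import gym
env = gym.make("CartPole-v1")
obs = env.reset()
obs, reward, done, info = env.step(env.action_space.sample())

# v26-style API (Gymnasium), which newer Ray releases expect:
import gymnasium
env = gymnasium.make("CartPole-v1")
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
```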

paehal commented 1 year ago

@JacopoPan

Thank you for compiling this information.

The errors appearing now do indeed seem to stem from this change.

It would be very helpful if you could apply this change.

By the way, do you have an idea of which specific files might be affected by this change? I will try to make some changes myself if I can.

JacopoPan commented 1 year ago

@paehal see https://github.com/utiasDSL/gym-pybullet-drones/pull/135#issuecomment-1457587316