Zhehui-Huang / quad-swarm-rl

Additional environments compatible with OpenAI gym

Multi Agent Training #45

Closed · sAz-G closed this issue 6 months ago

sAz-G commented 9 months ago

I would like to conduct training with multiple drones using my own custom implementation (observation, policy, action space, etc.). Do you have example scripts for modifying/extending your environment?

Zhehui-Huang commented 7 months ago

Sorry, we do not have a script for modifying the observations, policy, or actions. In general:

- To modify the self observations, you can implement your own function to get the self state in get_state.py.
- To change the neighbor observations, change add_neighborhood_obs() in quadrotor_multi.py.
- To change the obstacle observations, implement your own functions in obstacles/utils.py.
- To design your own policies, check models/quad_multi_model.py.
- To change the action space, check quadrotor_single.py.
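For example, a custom self-state function in get_state.py could look roughly like this (a sketch only, assuming the dynamics object exposes pos/vel/rot/omega as in gym_art; the function name is illustrative):

```python
import numpy as np

def state_xyz_vxyz_R_omega(self):
    """Sketch of a self-state function: position, velocity, rotation, angular velocity."""
    pos = self.dynamics.pos            # (3,) position in world frame
    vel = self.dynamics.vel            # (3,) linear velocity
    rot = self.dynamics.rot.flatten()  # (9,) rotation matrix, flattened
    omega = self.dynamics.omega        # (3,) angular velocity in body frame
    return np.concatenate([pos, vel, rot, omega])
```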

sAz-G commented 6 months ago

First, I would like to change the neighbor observation: I want to add the angular velocity of each neighbor to the observation, which means changing the per-neighbor observation vector from (p, v) to (p, v, w).
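Concretely, I mean something like this for each neighbor (just a sketch of the intent; `own` and `other` stand for the two drones' dynamics objects, and the attribute names are my guesses):

```python
import numpy as np

def neighbor_block(own, other):
    """Per-neighbor observation: currently (p, v) = 6 values; I want (p, v, w) = 9."""
    rel_pos = other.pos - own.pos        # p: relative position, shape (3,)
    rel_vel = other.vel - own.vel        # v: relative velocity, shape (3,)
    rel_omega = other.omega - own.omega  # w: relative angular velocity, shape (3,)
    return np.concatenate([rel_pos, rel_vel, rel_omega])
```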

I reviewed the function add_neighborhood_obs. Is it enough to concatenate the observation there, or should I also change the function extend_obs_space?

Do I also have to change the function make_observation_space and the dict QUADS_OBS_REPR?

Does your simulator support position setpoints (i.e., using position as the action instead of thrust)?

Thanks for your help.

Zhehui-Huang commented 6 months ago

> I reviewed the function add_neighborhood_obs. Is it enough to concatenate the observation there, or should I also change the function extend_obs_space?

Both work.

> Do I also have to change the function make_observation_space and the dict QUADS_OBS_REPR?

No.

> Does your simulator support position setpoints (i.e., using position as the action instead of thrust)?

Check this: https://github.com/amolchanov86/gym_art/blob/master/gym_art/quadrotor/quadrotor_control.py
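That file implements the low-level controllers. As a rough illustration of the idea only (not code from that file), a position setpoint can be tracked by an outer PD loop that turns the position error into a desired acceleration, which the inner thrust/attitude loop then tracks; the gains and clipping below are illustrative, not tuned values:

```python
import numpy as np

def position_setpoint_to_accel(pos, vel, goal, kp=6.0, kd=4.0, g=9.81):
    """PD position loop: goal position -> desired world-frame acceleration."""
    accel = kp * (goal - pos) - kd * vel      # PD law on the position error
    accel[2] += g                             # gravity compensation on z
    return np.clip(accel, -3.0 * g, 3.0 * g)  # crude saturation
```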

sAz-G commented 6 months ago

> No.

I tried extending the neighbor observation space by changing only the function add_neighborhood_obs, without further modifications, as you suggested. However, I get the following error:

 File "/home/saz/anaconda3/envs/swarm-rl/lib/python3.8/site-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/home/saz/anaconda3/envs/swarm-rl/lib/python3.8/site-packages/sample_factory/algo/sampling/rollout_worker.py", line 162, in init
    self._maybe_send_policy_request(r)
  File "/home/saz/anaconda3/envs/swarm-rl/lib/python3.8/site-packages/sample_factory/algo/sampling/rollout_worker.py", line 190, in _maybe_send_policy_request
    policy_request = runner.generate_policy_request()
  File "/home/saz/anaconda3/envs/swarm-rl/lib/python3.8/site-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 688, in generate_policy_request
    self._prepare_next_step()
  File "/home/saz/anaconda3/envs/swarm-rl/lib/python3.8/site-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 610, in _prepare_next_step
    actor_state.set_trajectory_data(policy_inputs, self.rollout_step)
  File "/home/saz/anaconda3/envs/swarm-rl/lib/python3.8/site-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 134, in set_trajectory_data
    self.curr_traj_buffer[rollout_step] = data
  File "/home/saz/anaconda3/envs/swarm-rl/lib/python3.8/site-packages/sample_factory/algo/utils/tensor_dict.py", line 44, in __setitem__
    self._set_data_func(self, key, value)
  File "/home/saz/anaconda3/envs/swarm-rl/lib/python3.8/site-packages/sample_factory/algo/utils/tensor_dict.py", line 49, in _set_data_func
    self._set_data_func(x.get(new_data_key), index, new_data_value)
  File "/home/saz/anaconda3/envs/swarm-rl/lib/python3.8/site-packages/sample_factory/algo/utils/tensor_dict.py", line 49, in _set_data_func
    self._set_data_func(x.get(new_data_key), index, new_data_value)
  File "/home/saz/anaconda3/envs/swarm-rl/lib/python3.8/site-packages/sample_factory/algo/utils/tensor_dict.py", line 69, in _set_data_func
    x[index] = n
ValueError: could not broadcast input array from shape (57,) into shape (54,)

I am adding 3 values to each observation in obs_ext, so the environment now returns a longer vector than the declared observation space expects. Maybe I do have to change the definition of the observation space here. If so, do I have to change the code somewhere else as well?

I get a similar error when I try to extend the self observation.

Zhehui-Huang commented 6 months ago

Sorry for the confusion. If you change your observations, you need to change make_observation_space. You can mimic what I did in this function.
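Roughly, the change is just making the Box wider to match the new per-neighbor size. A sketch (the dimension numbers are placeholders, not the repo's actual values):

```python
import numpy as np
from gym import spaces

def make_observation_space_sketch(self_dim=18, num_neighbors=6, per_neighbor=9):
    # per_neighbor grows from 6 (p, v) to 9 (p, v, w)
    obs_dim = self_dim + num_neighbors * per_neighbor
    return spaces.Box(low=-np.inf, high=np.inf, shape=(obs_dim,), dtype=np.float32)
```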

> If so, do I have to change the code somewhere else as well?

Yes, you need to change a few places. The basic logic is: when you modify observations, you need to change the observation space, the way you get observations, the model architecture, the way you feed observations into the model, etc. To figure out the details, you can check how I handle neighbor observations.
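For example, if the model slices the flat observation into a self part and a neighbor part, that slicing has to use the new per-neighbor size. A sketch (the names and sizes are illustrative, not the exact code in quad_multi_model.py):

```python
import torch

def split_obs(obs, self_dim=18, num_neighbors=6, per_neighbor=9):
    """Split a flat observation batch into self and per-neighbor tensors."""
    self_obs = obs[:, :self_dim]
    neigh = obs[:, self_dim:self_dim + num_neighbors * per_neighbor]
    return self_obs, neigh.reshape(-1, num_neighbors, per_neighbor)

# e.g. a batch of 4 flat observations of size 18 + 6 * 9 = 72
self_obs, neigh_obs = split_obs(torch.zeros(4, 72))  # shapes: (4, 18) and (4, 6, 9)
```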

sAz-G commented 6 months ago

I managed to change the neighbor observation and run training.

I added my observation code to add_neighborhood_obs. Then I had to change make_observation_space. In addition, I had to add an entry to QUADS_NEIGHBOR_OBS_TYPE and change it so that the type I defined is accepted, roughly as shown below.
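For reference, the entry looks roughly like this (the key name is mine, and I am paraphrasing the dict's structure from memory rather than copying it):

```python
# Sketch of the mapping from neighbor-observation type to per-neighbor size.
QUADS_NEIGHBOR_OBS_TYPE = {
    'none': 0,
    'pos_vel': 6,        # existing: relative position + velocity
    'pos_vel_omega': 9,  # my new type: also includes relative angular velocity
}
```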

Zhehui-Huang commented 6 months ago

Cool. I think it is time to close this thread.