wuxiyang1996 / iPLAN

iPLAN: Intent-Aware Planning in Heterogeneous Traffic via Distributed Multi-Agent Reinforcement Learning
https://arxiv.org/abs/2306.06236
MIT License

Parallel computing and some other issues #3

Closed Captain6606 closed 4 months ago

Captain6606 commented 4 months ago

Thanks for your work. I wonder whether the code supports parallel computing across two graphics cards, since I have two 4080s. In addition, does the code support saving the model during training? Training in the Heterogeneous Highway scenario takes a lot of time. I hope to get your reply. I am a beginner in this field. Very grateful!

wuxiyang1996 commented 4 months ago

Thank you for your comments!

  1. Our code is based on pymarl, so it supports parallel computing in general. We have never tried it on 2 GPUs, but I think it should be easy to implement if there is prior work on multi-GPU parallel computing with pymarl.
  2. Yes. Our framework allows users to save checkpoints. Try changing the value here: https://github.com/wuxiyang1996/iPLAN/blob/7e4b79a6083fa7800dbfb3e05bdbff981f40bed2/config/default.yaml#L27
  3. In my experience, the training-time bottleneck of our framework is memory and CPU capacity, because environment execution takes a lot of time.
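
For item 2, here is a minimal sketch of how an interval-based checkpoint trigger commonly works in a pymarl-style training loop. The names `save_model`, `save_model_interval`, `should_save`, and `checkpoint_steps` are assumptions for illustration only; the linked config line is the place to check which key iPLAN actually uses.

```python
def should_save(t_env, last_save_t, save_model=True, save_model_interval=2_000_000):
    """Return True when a checkpoint is due at environment step t_env.

    All parameter names are assumed for illustration, not taken from iPLAN.
    """
    if not save_model:
        return False
    # Save once at the first check, then every `save_model_interval` steps.
    return last_save_t == 0 or t_env - last_save_t >= save_model_interval


def checkpoint_steps(total_steps, step_size, interval):
    """Simulate a training loop and collect the steps where a save would fire."""
    saved, last_save_t = [], 0
    for t_env in range(step_size, total_steps + 1, step_size):
        if should_save(t_env, last_save_t, save_model_interval=interval):
            saved.append(t_env)
            last_save_t = t_env
    return saved
```

With a smaller interval, checkpoints are written more often, which helps when a long run may be interrupted.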
Captain6606 commented 4 months ago

Thank you for your reply.

Captain6606 commented 3 months ago

In the envs/env_wrappers.py file:

    elif cmd == 'reset':
        ob = env.reset()
        state = state_wrapper(env.get_state())
        remote.send((state, ob))

I looked at the contents of the variable state, for example:

    array([[  0. ,   1. , 107.05111103,  28. ,  25.        ,   0. ,
              1. ,   1. , 118.67782276,   0. ,  21.09154345,   0. ,
              2. ,   1. , 131.61876518,  12. ,  22.71266826,   0. ,
              3. ,   1. , 144.68149668,  24. ,  21.66716993,   0. ]])

I want to know: is this the initial state of all the vehicles? If so, what does each column mean? If not, which variable holds the initial position of each car? I hope to get your reply, thank you.
wuxiyang1996 commented 3 months ago

Please check the original Highway-Env documentation here: https://highway-env.farama.org/observations/

The format of the observations for each column is [Vehicle ID, Presence, Position X, Position Y, Velocity X, Velocity Y].
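
As a small illustration of the column format above, the sketch below groups a flat state row into one record per vehicle. The field names and the helper `split_state` are hypothetical labels chosen for this example, and the sample values are copied from the state printed in this thread.

```python
# Assumed labels for the six columns described above.
FIELDS = ("vehicle_id", "presence", "x", "y", "vx", "vy")


def split_state(flat_state):
    """Group a flat state row into one dict per vehicle (six values each)."""
    n = len(FIELDS)
    if len(flat_state) % n != 0:
        raise ValueError("state length is not a multiple of 6")
    return [dict(zip(FIELDS, flat_state[i:i + n]))
            for i in range(0, len(flat_state), n)]


# First two vehicles from the state printed in this thread.
row = [0.0, 1.0, 107.05111103, 28.0, 25.0, 0.0,
       1.0, 1.0, 118.67782276, 0.0, 21.09154345, 0.0]
vehicles = split_state(row)
```

Here `vehicles[0]["vx"]` is the longitudinal velocity of vehicle 0 and `vehicles[1]["x"]` is the longitudinal position of vehicle 1.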

Captain6606 commented 3 months ago

Thank you for your suggestion. I've been reading the highway-env documentation, but I still have two questions. This code has two variables called state and ob: https://github.com/wuxiyang1996/iPLAN/blob/7e4b79a6083fa7800dbfb3e05bdbff981f40bed2/envs/env_wrappers.py#L155-L158

  1. Does the state variable store the initial information of the vehicles on the road?
  2. If state stores the initial positions of the vehicles, then when I change the content of state in this part of the code (for example, the initial speed of a vehicle), how should the variable ob be updated?

I look forward to hearing from you. Thank you very much.

wuxiyang1996 commented 3 months ago
  1. The code you shared here is mainly used to reset the state at the beginning of each episode.
  2. The state variable stores the real-time positions and velocities of all on-road vehicles, which comes from the concatenated observation.
  3. You can modify the code here if you want to change the content in the state variable: https://github.com/wuxiyang1996/Heterogeneous_Highway_Env/blob/b96a34754f608bf3e2e9832a35c828bf11782794/envs/highway_env.py#L269
Captain6606 commented 3 months ago

Thank you for your answer. My idea was to change the initial position and speed of the vehicles at the beginning of each episode, but I was never able to find where the code generates and stores this data. I will try modifying the code at https://github.com/wuxiyang1996/Heterogeneous_Highway_Env/blob/b96a34754f608bf3e2e9832a35c828bf11782794/envs/highway_env.py#L269. Thank you.

Captain6606 commented 3 months ago

I don't know if I made myself clear. I do not want to change the rules that generate the initial position and speed of the vehicles; I want to change the initialization data itself once the vehicles have been placed on the road. Thank you again and I look forward to hearing from you.

wuxiyang1996 commented 3 months ago

You can start by changing the code here: https://github.com/wuxiyang1996/Heterogeneous_Highway_Env/blob/b96a34754f608bf3e2e9832a35c828bf11782794/envs/highway_env.py#L233

Try replacing the create_random logic here: https://github.com/wuxiyang1996/Heterogeneous_Highway_Env/blob/b96a34754f608bf3e2e9832a35c828bf11782794/vehicle/kinematics.py#L51

with the fixed initial positions and velocity you want.
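
A rough sketch of that idea, not the repository's actual code: instead of sampling positions and speeds at random, build the vehicles from a fixed list of specifications. `InitSpec`, `FIXED_SPECS`, `create_fixed_vehicles`, and the `vehicle_factory` callable are all hypothetical names; in the real environment the factory would be whatever constructor the vehicle class exposes, and its arguments may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InitSpec:
    """Hypothetical container for one vehicle's fixed initial conditions."""
    x: float
    y: float
    speed: float


# Example values rounded from the state printed in this thread.
FIXED_SPECS = [
    InitSpec(x=107.05, y=28.0, speed=25.0),
    InitSpec(x=118.68, y=0.0, speed=21.09),
]


def create_fixed_vehicles(specs, vehicle_factory):
    """Build one vehicle per spec, using fixed values instead of random ones."""
    return [vehicle_factory(position=(s.x, s.y), speed=s.speed) for s in specs]
```

For a quick check without the environment, `create_fixed_vehicles(FIXED_SPECS, dict)` returns plain dictionaries holding the position and speed of each vehicle.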

Captain6606 commented 3 months ago

I'm sorry, I think you misunderstood me. I've fixed the random numbers and made sure I get the same reward in the same scenario. But I have a new problem: once the initialization of the vehicles is complete, there should be a variable that stores their initialization information. I want to find this variable and make some changes to it. This is not a change to the function that generates the vehicles. Thank you very much for your reply!

wuxiyang1996 commented 3 months ago

Well, in this case, you can print out and store the state via the get_state() function each time after the scenario is reset. Then, whenever you want to reuse a previous scenario, you can load that stored initialization through a modified _create_vehicles() function.

However, in the current version of the code, there's no variable that specifically stores the vehicle's initialization information.
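
A minimal sketch of the store-and-reload idea, assuming get_state() returns one row per vehicle as printed in this thread. The helpers `save_initial_state` and `load_initial_state` and the JSON file format are hypothetical and not part of iPLAN; a modified _create_vehicles() would still have to consume the loaded rows.

```python
import json


def save_initial_state(state_rows, path):
    """Write the per-vehicle rows captured right after reset to a JSON file."""
    # Convert every value to a plain float so NumPy scalars serialize cleanly.
    rows = [[float(v) for v in row] for row in state_rows]
    with open(path, "w") as f:
        json.dump(rows, f)


def load_initial_state(path):
    """Read back the rows written by save_initial_state."""
    with open(path) as f:
        return json.load(f)
```

One possible use is to call `save_initial_state(env.get_state(), "episode_0.json")` right after `env.reset()`, then pass the result of `load_initial_state("episode_0.json")` to the modified vehicle-creation code when replaying that scenario.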

Captain6606 commented 3 months ago

Yes, I print env.get_state() here: https://github.com/wuxiyang1996/iPLAN/blob/7e4b79a6083fa7800dbfb3e05bdbff981f40bed2/envs/env_wrappers.py#L155-L158 and when I run it, env.get_state() seems to output the initial information of the vehicles, for example:

    [[0, 1, 107.05111102678124, 28.0, 25.0, 0.0], [1, 1, 118.67782275741445, 0.0, 21.09154345284106, 0.0]]

There are 55 pieces of data in total. When I print the variable ob, it should record the observations of each agent, for example:

    (array([[ 0. ,  1. ,  1.        ,  0.28,  1.        ,  0. ],
            [ 1. ,  1. ,  0.11626711, -0.28, -0.19542283,  0. ],
            [ 2. ,  1. ,  0.24567655, -0.16, -0.11436658,  0. ],
            [ 3. ,  1. ,  0.37630385, -0.04, -0.1666415 ,  0. ],
            [ 4. ,  1. ,  0.5108889 , -0.12, -0.16142389,  0. ],
            [ 5. ,  1. ,  0.63980925, -0.2 , -0.1665415 ,  0. ],
            [ 6. ,  1. ,  0.78124917, -0.2 , -0.09749381,  0. ],
            [ 7. ,  1. ,  0.90969723, -0.28, -0.13411503,  0. ],
            [ 8. ,  1. ,  1.        ,  0.  , -0.18063286,  0. ],
            [ 9. ,  1. ,  1.        , -0.12, -0.18572257,  0. ],
            [10. ,  1. ,  1.        , -0.2 , -0.08419205,  0. ],
            [11. ,  1. ,  1.        , -0.08,  0.        ,  0. ],
            [12. ,  1. ,  1.        ,  0.  , -0.13351567,  0. ],
            [13. ,  1. ,  1.        ,  0.  , -0.10553368,  0. ],
            [14. ,  1. ,  1.        , -0.2 , -0.14105844,  0. ]], dtype=float32))

It contains five arrays; each array should record the observation data of one agent. When I try to modify the contents of env.get_state() here, something very annoying happens: the data recorded in the variable ob still corresponds to env.get_state() from before my modification.
wuxiyang1996 commented 3 months ago

For the task you are working on, I strongly suggest you focus on recording the initial position, velocity, and agent type from the create vehicle function.

env.get_state() is just a function that reports the real-time state; it does not do anything new.

Another thing: I feel you may need to copy (or deep copy) the array you want to modify. What you describe sounds like a long-standing Python pitfall.
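
To illustrate the copy advice in general terms, the sketch below shows how plain assignment, a shallow copy, and a deep copy behave on a nested list shaped like the printed state. It only demonstrates Python's copy semantics; whether aliasing is the actual cause of the behavior reported in this thread is not confirmed, and the sample values are rounded for illustration.

```python
import copy

# Nested list shaped like the printed state (values rounded for illustration).
state = [[0, 1, 107.05, 28.0, 25.0, 0.0],
         [1, 1, 118.68, 0.0, 21.09, 0.0]]

alias = state                 # same object, no copy at all
shallow = copy.copy(state)    # new outer list, but the inner rows are shared
deep = copy.deepcopy(state)   # fully independent copy

# Change the speed of vehicle 0 through the alias.
alias[0][4] = 30.0

original_speed = state[0][4]    # 30.0: the original changed too
shallow_speed = shallow[0][4]   # 30.0: inner rows are shared
deep_speed = deep[0][4]         # 25.0: the deep copy is unaffected
```

Working on a deep copy keeps the original rows intact, so edits to the copy never leak into data that other code still reads.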