fangyichen123 opened this issue 4 months ago
Hello, do you perform your training on a GPU or a CPU? Because on a GPU, it should be fast.
Also, you can try parallel training; this simulation is suitable for that purpose.
Thank you very much for your reply! I have already used a 4090 to run 16 environments in parallel. In fact, I am more concerned about the communication speed between the client and the server, which I think is the main time-consuming part. For example, operations such as "df.get_plane_state()" or "df.set_Plane_yaw()" take a long time in network mode. I also found that in Ubuntu, getting the environment state (such as "df.get_plane_state()") and setting the action (such as "df.set_Plane_yaw()") seem to affect each other's communication speed.
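A minimal way to check where the time goes is to time each network call separately. The snippet below is only a sketch: the `dogfight_client` import name, the `plane_id` handle, and the exact call signatures are assumptions based on the calls mentioned above, not the confirmed API.

```python
import time

import dogfight_client as df  # assumption: the sandbox's network client module

plane_id = "ally_1"  # hypothetical plane handle; use whatever your scenario provides

def timed(label, fn, *args):
    """Call fn(*args) and report how long the client/server round trip took."""
    t0 = time.perf_counter()
    result = fn(*args)
    print(f"{label}: {(time.perf_counter() - t0) * 1000:.1f} ms")
    return result

# Time the observation call and the action call separately, so it is clear
# which direction of the exchange dominates the step time.
state = timed("get_plane_state", df.get_plane_state, plane_id)
timed("set_plane_yaw", df.set_plane_yaw, plane_id, 0.0)  # adjust name/casing to your client
```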
Hello, if you want, you can reduce how much information is sent during communication by managing the state yourself. This might help to improve speed. However, such delays should not normally occur in communication.
Can you try with and without this function?
I haven't tried changing this function; I've always set it like this: df.set_client_update_mode(True). Do you mean I need to set it to False?
Now I have tried this function. I set df.set_client_update_mode(False). I realized one thing: is this the asynchronous mode mentioned in this paper?
Yes, that is exactly async mode. When you send states, the simulation continues to run in the background. But if you activate client update mode, the server waits for the action information before taking a step. Probably your neural network is heavy, so when you send a state, it takes time to process it. I have compared TD3 and SAC in terms of speed: TD3 was better, and SAC was really slow. You can try it. In conclusion, the speed of the simulation itself is really good; check your neural network side.
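For later readers, a rough sketch of the two stepping styles being discussed. Everything except `set_client_update_mode` (the client import, the plane handle, the action call, and the policy) is a placeholder assumption, not the confirmed API.

```python
import dogfight_client as df  # assumption: the sandbox's network client module

plane_id = "ally_1"            # hypothetical plane handle
policy = lambda state: 0.0     # stand-in for the agent's forward pass

# Synchronous stepping: the server waits for the next action before advancing,
# so a heavy policy directly stretches every simulation step.
df.set_client_update_mode(True)
for _ in range(100):
    state = df.get_plane_state(plane_id)   # blocks until the server replies
    action = policy(state)                 # slow inference adds to the step time
    df.set_plane_yaw(plane_id, action)     # server steps after receiving the action

# Asynchronous stepping: the simulation keeps running in the background while
# the client computes its action, so inference time no longer stalls the server
# (at the cost of acting on a slightly stale state).
df.set_client_update_mode(False)
```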
Thank you very much for your advice. I will check my project again. I am studying at the University of Science and Technology of China (USTC). We think Harfang is a very good environment, and I hope there will be more opportunities for communication in the future. Finally, let me confirm one thing: was the simulation speed you mentioned also measured on Ubuntu?
You're welcome. Yes, we ran all our tests on Ubuntu.
@fangyichen123
The need for an async mode could be solved in several ways. The network mode is the most obvious technique that came to mind when we first tried to implement the Sandbox as a server, but we could also consider using Python's asyncio module to run the simulation independently from the main thread, or any other way to implement inter-process communication between the Python interpreter that runs the Sandbox and the one that runs the AI toolchain (PyTorch, etc.).
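One way such decoupling could look without touching the network mode is to poll the sandbox from a background thread, so the training loop reads a cached state instead of blocking on a round trip. This is only a sketch under the same assumptions as above (client module name, plane handle, call signature); an asyncio version would wrap the blocking calls in `loop.run_in_executor` instead of a thread.

```python
import threading
import time

import dogfight_client as df  # assumption: the sandbox's network client module

plane_id = "ally_1"           # hypothetical plane handle
latest_state = None
state_lock = threading.Lock()
stop_event = threading.Event()

def poll_sandbox(period_s=0.02):
    """Keep fetching the newest plane state in the background, so the main
    (training) thread reads a cached copy instead of waiting on the network."""
    global latest_state
    while not stop_event.is_set():
        state = df.get_plane_state(plane_id)
        with state_lock:
            latest_state = state
        time.sleep(period_s)

poller = threading.Thread(target=poll_sandbox, daemon=True)
poller.start()

# Training-loop side: read the most recent cached state.
with state_lock:
    obs = latest_state
```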
Excuse me, is there any way to train that does not require network mode? I ask because I think the network communication time may affect the execution time of each step in RL and thus the overall training time.
I'm training MARL now, and each step takes about 100-200 ms (I have already set df.set_renderless_mode(True)). Is this speed normal? I think the training speed is too slow.
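To see whether the 100-200 ms per step is spent on the sandbox round trip or on the MARL side, it may help to time the pieces of one step separately. The names `agent`, `env`, and their methods below are hypothetical placeholders for whatever the training framework uses; only the timing pattern matters.

```python
import time

t0 = time.perf_counter()
actions = agent.act(observations)                        # policy forward pass (GPU side)
t1 = time.perf_counter()
observations, rewards, dones, infos = env.step(actions)  # sandbox communication
t2 = time.perf_counter()
agent.update()                                           # learner / gradient step, if done per step
t3 = time.perf_counter()

print(f"inference {(t1 - t0) * 1e3:.1f} ms | "
      f"env.step {(t2 - t1) * 1e3:.1f} ms | "
      f"update {(t3 - t2) * 1e3:.1f} ms")
```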