obadul024 opened this issue 2 years ago
Hi, thanks for the compliment.
Yeah, I know FPS is an issue. With the hardware I had (i9 + RTX 6000), I was able to achieve about 10-15 FPS, which is still low (probably due to some poorly optimized code). Anyway, you could try these:
- If your GPU has enough memory, you can try to run the neural networks on it: just comment this line https://github.com/Luca96/carla-driving-rl-agent/blob/master/main.py#L3. If it fails, try to reduce the batch_size and/or the timesteps, e.g. here https://github.com/Luca96/carla-driving-rl-agent/blob/master/main.py#L38.
- You can start the CARLA simulator with these flags: -windowed -ResX=32 -ResY=32 --quality-level=Low; see the installation section of the repo README.
- In learning.stage_xyz(...) you can add the repeat_action argument; for example, pick 2, 3, or even 4 (the default is 1). This "duplicates" the model's predictions for 2, 3, or 4 frames, thus reducing the calls to the model. So you would call, for example, learning.stage_xyz(..., repeat_action=2). See main https://github.com/Luca96/carla-driving-rl-agent/blob/master/main.py; you can also do this for evaluation by adding the argument here https://github.com/Luca96/carla-driving-rl-agent/blob/master/core/learning.py#L507-L509.
- If you're training from scratch, you can opt for a smaller network: try reducing the number of units and/or the number of channels in the shufflenet backbone. See here https://github.com/Luca96/carla-driving-rl-agent/blob/master/main.py#L38.
- Again, if training from scratch, you should first try halving (or more) the window_size (it just reduces the pygame window; you can do the same for evaluation), or you can even reduce the resolution of the input images by setting image_shape. See here https://github.com/Luca96/carla-driving-rl-agent/blob/master/core/learning.py#L58-L62.
I guess that's all you can try; hope it helps a bit.
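The repeat_action idea can be sketched as a generic wrapper, independent of this repo (the `step_with_repeat` helper and the toy environment below are hypothetical illustrations, not the repo's actual API):

```python
def step_with_repeat(env_step, action, repeat_action=1):
    """Apply the same action for `repeat_action` consecutive frames,
    returning the last observation and the accumulated reward.

    `env_step` is any callable with a gym-like signature:
    env_step(action) -> (observation, reward, done, info).
    """
    total_reward = 0.0
    obs, done, info = None, False, {}
    for _ in range(repeat_action):
        obs, reward, done, info = env_step(action)
        total_reward += reward
        if done:  # stop early if the episode ends mid-repeat
            break
    return obs, total_reward, done, info


# Toy environment: reward 1 per step, episode ends after 5 steps.
class ToyEnv:
    def __init__(self):
        self.t = 0

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 5, {}


env = ToyEnv()
obs, reward, done, info = step_with_repeat(env.step, action=0, repeat_action=3)
# One model prediction covered 3 simulator frames.
```

With repeat_action=3, the model is queried once every three frames, which is why it roughly divides the per-frame inference cost by the repeat factor.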
Hi there,
Thank you so much for such a detailed explanation and helpful information and pointers. Literally am beaming right now. This is fantastic help. Thank you once again.
I shall try all of these solutions and let you know if I have any questions.
Thanks for your time and efforts. Appreciate it.
Obaid
Hi Luca!
I am running the model on one GPU and CARLA on another. GPU utilization on both is under 6%, and CPU is under 60%. I have an RTX 2070 and an i7, and I am getting 1-3 FPS at most.
I was wondering if it's possible to run the pygame client in another process and process the sensor data in parallel.
Hi, apologies for the late response.
In principle it should be possible, but in practice it wouldn't help much, since the agent makes sequential decisions: it first has to wait for the sensor data, which is then fed to the neural nets, which finally output the action for time t.
I have to check the code and see if it's possible to optimize the neural nets (e.g. use more @tf.function), and look for potential bottlenecks that prevent full GPU utilization (probably data transfer between CPU and GPU, related to the memory buffer).
The fact is that RL is mainly sequential (at most you can run a bunch of environments in parallel): you run your environment/simulator for N steps (but at each step you use your NN to predict on a batch of size 1, the state, so you can't fully leverage vectorization and data parallelism), then the N experience tuples are stored in a memory buffer, from which you later retrieve a batch of B examples to train the NN. All of this repeats until convergence or until the maximum number of steps is exceeded.
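The loop described above can be sketched in plain Python (a toy illustration, not the repo's training code; `predict`, `train_on_batch`, and the environment transition are stand-ins):

```python
import random


def predict(state):
    # Stand-in for the NN forward pass on a batch of size 1 (the state).
    return state % 2  # dummy action


def train_on_batch(batch):
    # Stand-in for one gradient update on a batch of B examples.
    return len(batch)


def rollout(n_steps):
    """Collect N experience tuples sequentially: each step depends on
    the previous state, so this part cannot be vectorized away."""
    buffer = []
    state = 0
    for _ in range(n_steps):
        action = predict(state)      # inference on a batch of size 1
        next_state = state + 1       # dummy environment transition
        reward = 1.0
        buffer.append((state, action, reward, next_state))
        state = next_state
    return buffer


def train(total_steps, n_steps=8, batch_size=4):
    memory = []      # replay/memory buffer
    updates = 0
    for _ in range(total_steps // n_steps):
        memory.extend(rollout(n_steps))            # N tuples into the buffer
        batch = random.sample(memory, batch_size)  # retrieve B examples
        train_on_batch(batch)                      # only here is the GPU busy
        updates += 1
    return len(memory), updates


size, updates = train(total_steps=32)
```

The sketch makes the bottleneck visible: the GPU only sees real batched work inside `train_on_batch`, while the rollout phase issues many batch-of-1 predictions, which is why overall GPU utilization stays low.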
Hi there, I cloned your fantastic repo and started to run some experiments. There is an issue I am facing, however: the FPS is stuck at 2. No matter what I try, it simply cannot run any faster. I tried both an evaluation and a training experiment.
I can manage to run CARLA as a server at 60 FPS with no issues, but when I run the main script it just doesn't work.
I would love some pointers. Thanks for your help. Cheers