HKUST-Aerial-Robotics / SIMPL

SIMPL: A Simple and Efficient Multi-agent Motion Prediction Baseline for Autonomous Driving
MIT License

Question about GPU memory usage #3

Closed: jkewang closed this issue 7 months ago

jkewang commented 8 months ago

First of all, congratulations on the excellent method you've proposed!

I'm curious about one detail: for the Argoverse2 dataset, what is the maximum value that 'n_token' can take in the sft_layer? Also, does the repeat operation cause excessive GPU memory usage? Was any special consideration given to this when designing the method?

MasterIzumi commented 8 months ago

@jkewang Sorry for the late reply. Actually, we don't set a limit on n_token; it depends on the complexity of the scenario. Of course, you can set an upper bound and remove less important elements based on some handcrafted rules. In my own experiments, n_token ranges from about 100 to 250.
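As an illustration of the upper-bound idea, here is a minimal sketch of one possible handcrafted rule: keep only the tokens closest to a reference point. This is not taken from the SIMPL code base. The function name `cap_tokens`, the distance-based rule, and the 2D position format are all assumptions made for the example.

```python
import math


def cap_tokens(positions, max_tokens, ref=(0.0, 0.0)):
    """Return indices of at most `max_tokens` tokens closest to `ref`.

    `positions` is a list of (x, y) anchor points, one per token
    (hypothetical input format). The returned indices keep their
    original order so that downstream indexing stays stable.
    """
    if max_tokens < 0:
        raise ValueError("max_tokens must be non-negative")

    def dist(i):
        return math.hypot(positions[i][0] - ref[0], positions[i][1] - ref[1])

    # Sort token indices by distance to the reference point, keep the nearest.
    nearest = sorted(range(len(positions)), key=dist)[:max_tokens]
    return sorted(nearest)


# Example: five tokens, keep the three closest to the origin.
positions = [(10.0, 0.0), (1.0, 1.0), (50.0, 50.0), (0.0, 2.0), (3.0, 4.0)]
kept = cap_tokens(positions, max_tokens=3)
print(kept)  # [1, 3, 4]
```

Other rules (for example, always keeping agent tokens and dropping only distant lane segments) would fit the same interface; which rule is appropriate depends on the scenario.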

It does cost more GPU memory than previous methods, since we model the all-to-all relationships between instances in the RPE calculation. For Argo2, our experiment setup is 8 x 3090 (24 GB) with batch_size=8 on each GPU, and training takes around 11 hours for 50 epochs, so I think (maybe) it is kind of "affordable" at present :). The upside of SIMPL (and of other methods based on relative scene encoding, such as QCNet, GoRela, etc.) is that we only need to run inference once for all surrounding agents while still achieving good prediction accuracy, unlike target-centric methods, which need to "normalize then predict" multiple times. In practice, for agent-centric methods, we usually preprocess the surrounding context for each target and then send them to the network as a batch, which also costs a lot of memory. Besides, as mentioned above, we can set an upper bound on n_token so that the total memory usage is bounded. Personally, I don't think this is a big issue, especially at the inference stage.
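To make the all-to-all cost concrete, the following back-of-the-envelope sketch estimates the size of a single dense pairwise feature tensor. It assumes the tensor has shape (batch_size, n_token, n_token, feat_dim) and is stored in float32. The feature width of 128 is an illustrative assumption, not a value confirmed in this thread, and the actual implementation may lay out its tensors differently.

```python
def pairwise_tensor_bytes(batch_size, n_token, feat_dim, bytes_per_elem=4):
    """Size in bytes of one dense (batch, n_token, n_token, feat_dim) tensor.

    All arguments are illustrative; bytes_per_elem=4 corresponds to float32.
    """
    return batch_size * n_token * n_token * feat_dim * bytes_per_elem


MIB = 2 ** 20

# batch_size=8 comes from the setup described above; feat_dim=128 is assumed.
small = pairwise_tensor_bytes(batch_size=8, n_token=100, feat_dim=128)
large = pairwise_tensor_bytes(batch_size=8, n_token=250, feat_dim=128)

print(small / MIB)    # 39.0625 MiB for n_token = 100
print(large / MIB)    # 244.140625 MiB for n_token = 250
print(large / small)  # 6.25, i.e. (250 / 100) ** 2
```

The quadratic growth in n_token is the point of the estimate: going from 100 to 250 tokens multiplies the size of each pairwise tensor by 6.25. Training keeps several such activations alive per layer for the backward pass, so the real footprint is a multiple of this figure, which is why capping n_token bounds memory usage.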

By the way, sorry about the delay with the Argo2 code; I'm currently busy with some personal matters. I will try to clean up the code and release it in the near future, hopefully within this month.