Question about GPU memory usage

@jkewang Sorry for the late reply. Actually, we don't set a limit on n_token; it basically depends on the complexity of the scenario. Of course, you can set an upper bound and remove less important elements based on some handcrafted rules. In my own experiments, n_token ranges from 100 to 250.
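To make the "upper bound plus handcrafted rule" idea concrete, here is a minimal sketch. It is not taken from the SIMPL code: the function name `cap_tokens`, the `(x, y)` position format, and the "keep the elements nearest to a reference point" rule are all assumptions chosen for illustration, and a real rule might use lane connectivity or agent type instead.

```python
import math

def cap_tokens(positions, max_tokens, origin=(0.0, 0.0)):
    """Return indices of at most `max_tokens` scene elements to keep.

    Hypothetical rule (an assumption, not SIMPL's): elements closer to
    `origin` are treated as more important. `positions` is a list of
    (x, y) tuples, one per agent or lane segment.
    """
    if len(positions) <= max_tokens:
        return list(range(len(positions)))
    # Rank elements by Euclidean distance to the reference point.
    order = sorted(range(len(positions)),
                   key=lambda i: math.dist(positions[i], origin))
    # Keep the nearest ones, restored to their original order.
    return sorted(order[:max_tokens])

positions = [(0.0, 1.0), (50.0, 50.0), (2.0, 2.0), (100.0, 0.0)]
kept = cap_tokens(positions, max_tokens=2)
print(kept)
```

With a cap like this applied in preprocessing, n_token never exceeds `max_tokens`, regardless of how crowded the scenario is.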

It does cost more GPU memory than the previous method, since we model the all-to-all relationships between instances in the RPE calculation. For Argo2, our experiment setup is 3090 (24 GB) x 8 with batch_size=8 on each GPU, and training takes around 11 hours for 50 epochs, so I think (maybe) it is kind of "affordable" at present :). The upside of SIMPL (and of other methods based on relative scene encoding, such as QCNet, GoRela, etc.) is that we only need a single inference pass for all surrounding agents while still achieving good prediction accuracy. This differs from target-centric methods, which need to run "normalize then predict" multiple times. In practice, for agent-centric methods, we usually preprocess the surrounding context for each target and then send them to the network as a batch, which also costs a lot of memory. Besides, as you previously mentioned, we can set an upper bound on n_token, so the total memory usage will be bounded. Personally, I don't think this is a big issue, especially at the inference stage.
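As a rough back-of-the-envelope check on why the all-to-all RPE grows quickly and why capping n_token bounds it, the sketch below estimates the size of a single pairwise feature tensor of shape (batch_size, n_token, n_token, rpe_dim). The feature width `rpe_dim=128` and float32 storage are assumptions for illustration, not values from this thread, and the estimate ignores activations, gradients, and optimizer state, so actual training memory will be considerably higher.

```python
def rpe_bytes(batch_size, n_token, rpe_dim, bytes_per_value=4):
    """Bytes for one dense pairwise tensor of shape
    (batch_size, n_token, n_token, rpe_dim).

    `rpe_dim` and float32 (4 bytes) are illustrative assumptions.
    """
    return batch_size * n_token * n_token * rpe_dim * bytes_per_value

# batch_size=8 per GPU and n_token in the 100-250 range come from the
# thread; rpe_dim=128 is a hypothetical feature width.
for n in (100, 250):
    mib = rpe_bytes(batch_size=8, n_token=n, rpe_dim=128) / 2**20
    print(f"n_token={n}: {mib:.1f} MiB per pairwise tensor")
```

The cost is quadratic in n_token, so going from 100 to 250 tokens (2.5x) multiplies this term by 6.25x, while an upper bound on n_token puts a fixed ceiling on it.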

By the way, sorry about the delay with the Argo2 code; I'm currently busy with some personal matters. I will try to clean up the code and release it in the near future, hopefully within this month.

HKUST-Aerial-Robotics / SIMPL
