Tsinghua-MARS-Lab / M2I

M2I is a simple but effective joint motion prediction framework that exploits the factorized relations between interacting agents to combine marginal and conditional predictions.
https://tsinghua-mars-lab.github.io/M2I/
MIT License

CUDA out of memory error #10

Closed. Karami-m closed this issue 1 year ago.

Karami-m commented 1 year ago

Hi,

Thanks for making the code available for such an interesting work.

I have tried to train the Relation Prediction model on GPUs with 32 GB of memory, but it led to a CUDA out of memory error. I have also tried to train with vgg16(pretrain=True) but still ran into the same problem. So I wonder what kind of GPU you used for your experiments and how you managed memory during training.

Karami-m commented 1 year ago

The problem was solved by adding the following to the run.py file:

import tensorflow as tf

# Enable memory growth so TensorFlow allocates GPU memory on demand
# instead of reserving all of it at startup.
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
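For context, set_memory_growth keeps TensorFlow from pre-allocating the whole GPU at startup, so the rest of the training process still has memory left to allocate. A related option (a minimal sketch, not something the repo itself does; the 4096 MB limit is an arbitrary example value) is to put an explicit cap on how much of the GPU TensorFlow may use:

import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
if gpus:
    # Cap TensorFlow's allocation on the first GPU at ~4 GB (arbitrary example).
    # This must run before TensorFlow initializes the device.
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=4096)])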
larksq commented 1 year ago

We used an NVIDIA 3080, or sometimes an NVIDIA A10, for training. Usually changing train_batch_size to a smaller value helps with out-of-memory errors. I am glad you found a solution that works for you; since enabling memory growth fixed it, your error might be related to a different version of TensorFlow.
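To illustrate the batch-size suggestion generically (a toy model and random data, not code from this repository; in M2I the corresponding knob is the train_batch_size setting mentioned above), a common pattern is to halve the batch size whenever the CUDA allocator runs out of memory:

import torch

# Dummy model and optimizer, only to keep the sketch self-contained.
model = torch.nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

batch_size = 4096                        # arbitrary starting value
while batch_size >= 1:
    try:
        # One training step on a random batch of the current size.
        x = torch.randn(batch_size, 1024, device='cuda')
        loss = model(x).mean()
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        break
    except RuntimeError as err:
        if 'out of memory' not in str(err):
            raise                        # unrelated error: re-raise it
        torch.cuda.empty_cache()         # release cached blocks before retrying
        batch_size //= 2
        print(f'CUDA OOM, retrying with batch_size={batch_size}')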