Hon-Wong / PTSEFormer

[ECCV 2022] PTSEFormer: Progressive Temporal-Spatial Enhanced TransFormer Towards Video Object Detection
https://arxiv.org/abs/2209.02242
MIT License

GPU memory usage of the model and runtime errors in the original code #5

Closed: Angknpng closed 1 year ago

Angknpng commented 2 years ago

Hi! I'm excited to see this work, and I have a couple of questions.

How much GPU memory do you use to train the model? I failed to train it with two reference frames on 4 RTX 3090 cards. Training only runs when I use 1 reference frame, and in that case the memory usage is 17 GB per card.

In addition, when I first trained the model from this paper, I hit a tensor dimension error, so I modified 'activation.py' to adjust the tensor dimensions. I don't know whether this change is the cause of the abnormal GPU memory usage.

Sincerely!

Hon-Wong commented 1 year ago

Hi,

I use 8 Tesla V100 GPUs to train this model, each with 32 GB of memory.
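
For comparing memory figures across setups like the ones in this thread, it may help to log the peak allocation per GPU after a few training iterations. The following is only a minimal sketch, assuming a PyTorch training script; the helper names `bytes_to_gib`, `peak_memory_per_gpu`, and `headroom_gib` are hypothetical and not part of this repository.

```python
def bytes_to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / 1024**3


def headroom_gib(capacity_gib: float, peak_gib: float) -> float:
    """Memory left on a card, given its capacity and the observed peak."""
    return capacity_gib - peak_gib


def peak_memory_per_gpu() -> dict:
    """Return {device_index: peak allocated GiB} for every visible GPU.

    Returns an empty dict when PyTorch or CUDA is unavailable.
    """
    try:
        import torch  # assumption: the training code is PyTorch-based
    except ImportError:
        return {}
    if not torch.cuda.is_available():
        return {}
    return {
        i: bytes_to_gib(torch.cuda.max_memory_allocated(i))
        for i in range(torch.cuda.device_count())
    }


if __name__ == "__main__":
    # Call this after a few training iterations so the peak is meaningful.
    for index, peak in peak_memory_per_gpu().items():
        print(f"GPU {index}: peak allocated {peak:.2f} GiB")
```

Note that `torch.cuda.max_memory_allocated` counts memory held by tensors only, so the figure shown by `nvidia-smi` is usually higher because it also includes the caching allocator's reserved blocks and the CUDA context.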