aimagelab / VKD

PyTorch code for ECCV 2020 paper: "Robust Re-Identification by Multiple Views Knowledge Distillation"
MIT License

GPUs required? #6

Closed · sopsos closed this 4 years ago

sopsos commented 4 years ago

What GPUs were used to train this? I would like to know the minimum recommended setup.

I am running this on a GeForce RTX 2080 Ti. Following the instructions, when I run

python ./tools/train_v2v.py mars --backbone resnet50 --num_train_images 8 --p 8 --k 4 --exp_name base_mars_resnet50 --first_milestone 100 --step_milestone 100

I get the following error:

RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 10.76 GiB total capacity; 9.69 GiB already allocated; 29.75 MiB free; 197.14 MiB cached)

I have nothing else running and my GPU is 100% idle.
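
A quick way to confirm from Python that the GPU really is free before launching training, assuming a recent PyTorch build where torch.cuda.mem_get_info is available:

```python
import torch

# Report free vs. total memory on GPU 0 before training.
# torch.cuda.mem_get_info exists in recent PyTorch releases;
# on older versions, `nvidia-smi` shows the same numbers.
free_b, total_b = torch.cuda.mem_get_info(0)
print(f"GPU 0: {free_b / 1e9:.2f} GB free of {total_b / 1e9:.2f} GB")
```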

angpo commented 4 years ago

Hi,

We trained the teacher network on two GeForce GTX 1080 Ti GPUs; for training the student, a single GPU ought to be enough. Please note that we released all the checkpoints, so you can use our pre-trained teachers and just run the training of the student.
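
If you go the pre-trained route, here is a minimal sketch for inspecting a released checkpoint before wiring it into student training. The path mirrors the eval command later in this thread, but the exact file location and the layout of the saved object are assumptions:

```python
import torch

# Illustrative only: peek inside a released checkpoint.
# The path follows the eval command used later in this thread;
# the structure of the saved object is an assumption, not documented here.
ckpt = torch.load("./logs/baseline_public/mars/base_mars_resnet50/chk_end",
                  map_location="cpu")
print(type(ckpt))
if isinstance(ckpt, dict):
    print(list(ckpt.keys())[:10])  # first few stored entries
```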

A.

angpo commented 4 years ago

Alternatively, you can still train ResNet50 from scratch on a single GPU, but you have to decrease the number of frames (--num_train_images) as well as the number of distinct identities (--p) within the input batch:

python ./tools/train_v2v.py mars --backbone resnet50 --num_train_images 6 --p 6 --k 4 --exp_name base_mars_resnet50 --first_milestone 100 --step_milestone 100

I've just verified that it fits on a single GeForce GTX 1080 Ti.
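
To put numbers on why this helps: assuming the sampler composes each batch as p identities × k tracklets × num_train_images frames per tracklet (which the flag names suggest), the reduced command shrinks the per-batch frame count considerably:

```python
# Frames per batch, assuming the sampler draws
# p identities x k tracklets x num_train_images frames per tracklet.
def frames_per_batch(p: int, k: int, num_train_images: int) -> int:
    return p * k * num_train_images

print(frames_per_batch(8, 4, 8))  # 256 frames -> OOM on an 11 GB card
print(frames_per_batch(6, 4, 6))  # 144 frames -> fits on a GTX 1080 Ti
```

That is roughly a 44% reduction in frames per forward pass, which is consistent with the drop from an out-of-memory error to a comfortable fit.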

A.

sopsos commented 4 years ago

Thank you. The training command you mentioned, with the smaller batch size, works.

Slightly related to this: how much RAM did your machine have? Training works fine, but with your pre-trained models this evaluation command fills my entire RAM (16 GB):

python ./tools/eval.py mars ./logs/baseline_public/mars/base_mars_resnet50 --trinet_chk_name chk_end

lucabergamini commented 4 years ago

Slightly related to this: how much RAM did your machine have?

Our local machine had 64 GB of RAM at the time.
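
For readers with less than 64 GB, a common workaround in re-ID evaluation is to build the query-gallery distance matrix in chunks rather than in one shot. This is a generic sketch, not the repo's eval.py; the feature shapes and chunk size are illustrative:

```python
import torch

def chunked_euclidean(query: torch.Tensor, gallery: torch.Tensor,
                      chunk: int = 256) -> torch.Tensor:
    # Compute the query-gallery distance matrix a few rows at a time,
    # so peak memory is chunk x len(gallery) per step instead of
    # len(query) x len(gallery) plus intermediates all at once.
    rows = []
    for start in range(0, query.size(0), chunk):
        q = query[start:start + chunk]
        rows.append(torch.cdist(q, gallery))
    return torch.cat(rows)

# Illustrative shapes: MARS has thousands of query/gallery tracklets.
dist = chunked_euclidean(torch.randn(2000, 2048), torch.randn(10000, 2048))
print(dist.shape)  # torch.Size([2000, 10000])
```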