VICO-UoE / URL

Universal Representation Learning from Multiple Domains for Few-shot Classification - ICCV 2021, Cross-domain Few-shot Learning with Task-specific Adapters - CVPR 2022
MIT License

Inquiry about running on multiple GPUs #16

Open kdc12345 opened 1 year ago

kdc12345 commented 1 year ago

Hello,

I am currently using your project and am wondering if it is possible to run it on multiple GPUs. Specifically, I am interested in training the model on two GPUs to accelerate the training process.

I have tried to modify the code to support multiple GPUs, but I encountered some errors. Could you please let me know if your project supports multi-GPU training? If so, could you provide some guidance on how to implement it correctly?

Thank you for your help in advance!

Best regards!

WeiHongLee commented 1 year ago

Hi,

Thanks for the question.

We run our code on a single GPU. I think it should be possible to run the model on multiple GPUs, but it will take some effort to modify the code. Which project are you going to run on multiple GPUs: URL or TSA?

I would recommend updating the network architecture code and the forward function so that the model can run on multiple GPUs.
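
As a rough illustration of that kind of change, here is a minimal sketch using PyTorch's `nn.DataParallel`. `TinyBackbone` and `wrap_for_multi_gpu` are hypothetical names made up for this example; they are not part of the URL code base, and the repository's ResNet and its forward signature will differ. The sketch assumes the forward function takes and returns plain tensors, which is what lets `DataParallel` split the batch across GPUs.

```python
import torch
import torch.nn as nn


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for the repo's ResNet; not the actual URL model."""

    def __init__(self, num_classes: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.BatchNorm2d(8),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(8, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Tensor in, tensor out: DataParallel scatters the batch along
        # dim 0 and gathers the per-GPU outputs back together.
        return self.classifier(self.features(x).flatten(1))


def wrap_for_multi_gpu(model: nn.Module) -> nn.Module:
    """Move the model to GPU and wrap it if more than one GPU is visible."""
    if torch.cuda.is_available():
        model = model.cuda()
        if torch.cuda.device_count() > 1:
            model = nn.DataParallel(model)
    return model


model = wrap_for_multi_gpu(TinyBackbone())
device = next(model.parameters()).device
logits = model(torch.randn(4, 3, 16, 16, device=device))
print(logits.shape)  # torch.Size([4, 5])
```

On a machine with zero or one GPU the wrapper leaves the model unwrapped, so the same script still runs. Note that under `DataParallel` each replica computes its own BatchNorm statistics on its slice of the batch.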

Best!

kdc12345 commented 1 year ago

Hi,

Thanks for the quick reply.

I am going to run the URL project on multiple GPUs, because training URL always takes too long for me. How long did it take you to train URL?

Thank you.

WeiHongLee commented 1 year ago

Hi,

I see. I think it is possible to run the URL training on multiple GPUs by modifying the network script (resnet) and other related code. Note that BatchNorm synchronization can be tricky for multi-domain learning over multiple GPUs.
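
For the BatchNorm point, one option PyTorch provides is `nn.SyncBatchNorm.convert_sync_batchnorm`, which swaps BatchNorm layers for synchronized ones. The sketch below only shows the conversion step on a toy model (an assumption for illustration, not the repo's ResNet). Synchronized statistics take effect only when the converted model is trained on GPUs under `DistributedDataParallel`, and whether this handles the multi-domain case correctly is not established in this thread.

```python
import torch.nn as nn

# Toy model with two BatchNorm layers (stand-in for a real backbone).
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.Conv2d(8, 8, 3, padding=1),
    nn.BatchNorm2d(8),
)

# Replace every BatchNorm*d layer with SyncBatchNorm, keeping its
# weights and running statistics.
sync_model = nn.SyncBatchNorm.convert_sync_batchnorm(model)

num_sync = sum(isinstance(m, nn.SyncBatchNorm) for m in sync_model.modules())
num_plain = sum(type(m) is nn.BatchNorm2d for m in sync_model.modules())
print(num_sync, num_plain)  # 2 0
```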

Training URL over 8 domains took us 48 hours in total on a single V100 GPU, but with early stopping it needs much less.
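
The early-stopping criterion is not described in this thread, so the following is only a generic patience-based sketch in plain Python. The `EarlyStopping` class, its `patience` and `min_delta` parameters, and the accuracy values are all hypothetical, chosen to show the idea of stopping once validation accuracy stops improving.

```python
class EarlyStopping:
    """Stop when validation accuracy has not improved for `patience` checks."""

    def __init__(self, patience: int = 2, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.bad_checks = 0

    def step(self, val_acc: float) -> bool:
        """Record one validation result; return True if training should stop."""
        if val_acc > self.best + self.min_delta:
            self.best = val_acc
            self.bad_checks = 0
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience


# Made-up validation accuracies, one per evaluation round.
accs = [0.50, 0.55, 0.54, 0.55, 0.53, 0.60]
stopper = EarlyStopping(patience=2)
stopped_at = None
for check, acc in enumerate(accs):
    if stopper.step(acc):
        stopped_at = check
        break
print(stopped_at, stopper.best)  # 3 0.55
```

A small `patience` saves the most time but can stop before a late improvement, as the final 0.60 in the made-up list shows.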

Best!