Closed · SongDongKuk closed this issue 7 months ago
We recommend using nn.parallel.DistributedDataParallel, which manages memory more efficiently (see link).
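For reference, here is a minimal sketch of how a model can be wrapped with DistributedDataParallel when launched via torchrun; the function and variable names below are illustrative, not taken from this repo:

```python
# Minimal DDP sketch (illustrative only).
# Launch with something like: torchrun --nproc_per_node=auto your_script.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def setup_ddp() -> int:
    # torchrun sets LOCAL_RANK / RANK / WORLD_SIZE for each spawned process
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    return local_rank

def build_ddp_model(model: torch.nn.Module) -> DDP:
    local_rank = setup_ddp()
    model = model.to(local_rank)
    # Each process holds its own replica on its own GPU, so memory is
    # spread across devices instead of accumulating on GPU 0 as with nn.DataParallel.
    return DDP(model, device_ids=[local_rank])
```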
The evaluation code and script we provided are designed to utilize all available GPUs. As a first step, please make sure you have run the evaluation script correctly (especially the torchrun --nproc_per_node=auto part). If the issue persists with this script, consider setting --nproc_per_node manually to match the number of GPUs available.
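If the automatic setting does not pick up all your GPUs, a quick sanity check of how many devices are actually visible can help you choose the value to pass to --nproc_per_node; this snippet is purely illustrative:

```python
# Check how many GPUs the process can see before launching the evaluation script,
# e.g. to pick a manual value for --nproc_per_node.
import torch

if torch.cuda.is_available():
    print(f"Visible GPUs: {torch.cuda.device_count()}")
else:
    print("No CUDA devices visible; check your driver and CUDA_VISIBLE_DEVICES")
```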
Even when I use nn.DataParallel, the model gets allocated to only one specific GPU and keeps throwing CUDA out of memory :(
Could you please check whether there is a way around this?