CUDA out of memory issue when using with pretrained weight from COOP

azshue / TPT

Test-time Prompt Tuning (TPT) for zero-shot generalization in vision-language models (NeurIPS 2022)

https://azshue.github.io/TPT/

MIT License


CUDA out of memory issue when using with pretrained weight from COOP #9

Open SeonghaEom opened 1 year ago

SeonghaEom commented 1 year ago

Hi, I was reproducing TPT while loading pretrained weights from COOP.

I realized that the current code loads the pretrained weights directly, and the saved tensors are mapped to GPU index 0. As a result, the weights are always placed on global GPU 0, even when I am running on a different GPU, which is not what I want.

I think the pretrained context should be loaded with the weights mapped to 'cpu', and then copied from there to the target device.

This would free the GPU memory that is currently held by the pretrained weights on GPU 0.
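
To illustrate the idea, here is a minimal sketch of what I mean, not the repository's actual code. The helper name `load_pretrained_ctx`, the file name, and the `['state_dict']['ctx']` checkpoint layout are my assumptions; the only parts I am sure of are `torch.load(..., map_location="cpu")` and the in-place `copy_`.

```python
import os
import tempfile

import torch
import torch.nn as nn


def load_pretrained_ctx(ckpt_path, ctx_param):
    """Hypothetical helper: read saved context vectors on the CPU and copy them into ctx_param."""
    # map_location="cpu" stops torch.load from restoring tensors onto the
    # device they were saved from (for example cuda:0).
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    # Assumed checkpoint layout; adjust the keys to match the real file.
    pretrained_ctx = checkpoint["state_dict"]["ctx"]
    with torch.no_grad():
        # copy_ writes the values into ctx_param on whatever device it lives on.
        ctx_param.copy_(pretrained_ctx)
    return ctx_param


# Small stand-in checkpoint so the sketch runs without a real COOP file.
n_ctx, ctx_dim = 4, 8
saved_ctx = torch.randn(n_ctx, ctx_dim)
ckpt_path = os.path.join(tempfile.mkdtemp(), "hypothetical_coop_ckpt.pth")
torch.save({"state_dict": {"ctx": saved_ctx}}, ckpt_path)

# The prompt parameter lives on the device chosen for this run.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
ctx = nn.Parameter(torch.zeros(n_ctx, ctx_dim, device=device))
load_pretrained_ctx(ckpt_path, ctx)
```

With this pattern the only copy of the pretrained context on a GPU is the one inside `ctx`, and the temporary CPU tensor is released once the checkpoint dict goes out of scope.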