Open maliozer opened 1 year ago
Unfortunately, this seems impossible. I'm trying freezing resnet, the CLIP embedding, the BLIP embedding, and the 8-bit optimizer together, but a V100 32G still doesn't work. The only successful case I saw froze resnet, the CLIP embedding, and the BLIP embedding, and used AMP together with the 8-bit optimizer, which reduced the vRAM to about 40GB on an A6000 48G.
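For reference, a minimal sketch of the freezing-plus-8-bit-optimizer setup described above. The module names (`resnet`, `clip`, `blip`, `unet`) are placeholders, not the actual ARLDM attribute names:

```python
import torch
import torch.nn as nn

# Toy stand-in for the ARLDM pipeline; the real module names are assumptions.
model = nn.ModuleDict({
    "resnet": nn.Linear(8, 8),
    "clip":   nn.Linear(8, 8),
    "blip":   nn.Linear(8, 8),
    "unet":   nn.Linear(8, 8),   # the part left trainable
})

# Freeze the encoders so they hold no gradients or optimizer state.
for name in ("resnet", "clip", "blip"):
    for p in model[name].parameters():
        p.requires_grad_(False)

# Pass only trainable parameters to the optimizer. With bitsandbytes you
# would use bnb.optim.AdamW8bit(...) here instead, which shrinks the
# optimizer state to roughly a quarter of AdamW's.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)

print(sum(p.numel() for p in trainable))  # only the "unet" params remain
```

Freezing matters twice: frozen parameters skip gradient storage, and keeping them out of the optimizer skips the Adam moment buffers.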
I also tried the same process, but I thought the other parameters should be scalable somehow as well, without breaking the model.
If you have any progress, I would be happy if you could tell me about your successful parameter configuration.
With the default settings I don't think it is possible to train at 512×512 on 40G A100s. Still, it is a bit strange that the authors don't freeze the CLIP and BLIP nets.
Anyway, with CLIP, BLIP, and resnet frozen, you still have tons of cross-attention parameters to play with, and that might already be enough. (Still waiting to check my ckpt.)
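A sketch of restricting training to the cross-attention parameters, selecting them by name. Matching on the substring `"attn"` is an assumption; the actual ARLDM parameter names may differ:

```python
import torch.nn as nn

# Toy model: one non-attention layer and one cross-attention layer.
model = nn.Sequential()
model.add_module("conv", nn.Linear(4, 4))
model.add_module("cross_attn", nn.MultiheadAttention(4, 2))

# Freeze everything, then re-enable only parameters whose qualified
# name mentions attention ("attn" is a guessed naming convention).
for name, p in model.named_parameters():
    p.requires_grad_("attn" in name)

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # only the cross_attn projection weights and biases
```

The same name-filtered list is what you would hand to the optimizer, so the frozen encoders carry no optimizer state at all.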
@TimandXiyu Are your checkpoints ready? If possible, would you be willing to share them? Can you also explain how to train ARLDM on a single CUDA device, and how to avoid the CUDA out-of-memory error by freezing CLIP, BLIP, and resnet, or with other methods?
How can I reduce the parameter size to fit the model onto my GPUs? I have already tried 16-bit precision, but I also need to scale down the model.
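As a sanity check on what 16-bit precision buys you, a minimal sketch measuring parameter memory before and after casting weights to fp16 (in real mixed-precision training you would use AMP rather than a blanket `.half()`, but the memory arithmetic is the same):

```python
import torch
import torch.nn as nn

def param_bytes(m: nn.Module) -> int:
    """Total bytes occupied by a module's parameters."""
    return sum(p.numel() * p.element_size() for p in m.parameters())

model = nn.Linear(1024, 1024)      # stand-in for one large layer
fp32 = param_bytes(model)          # 4 bytes per parameter
fp16 = param_bytes(model.half())   # 2 bytes per parameter

print(fp32 // fp16)  # → 2: half precision halves parameter memory
```

Note that halving the weights only halves one slice of the footprint; activations, gradients, and optimizer state still dominate, which is why freezing submodules and the 8-bit optimizer discussed above are needed on top of 16-bit precision.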