Closed by floschne 10 months ago
Hi!
Happy to hear you like this project.
Some of the LLMs I used (like the T5 models) have to run in bf16 to produce correct results, and the LAVIS implementation uses a similar explicit autocast so they stay correct even when the rest of the training runs in fp16 (https://github.com/salesforce/LAVIS/blob/main/lavis/models/blip2_models/blip2_t5.py). You could probably remove it and most models would still work, or set bf16 precision in the Lightning config instead.
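A minimal sketch of the idea (this is illustrative PyTorch, not the actual mBLIP code): wrapping the sensitive forward pass in an explicit bf16 autocast region forces that computation into bf16 regardless of the precision the outer training loop uses. With Lightning, the alternative mentioned above would be setting something like `precision="bf16-mixed"` on the Trainer.

```python
import torch

# Toy stand-in for the LLM forward pass (assumption: not the real model).
model = torch.nn.Linear(8, 4)
x = torch.randn(2, 8)

# Force bf16 for this region, independent of any outer fp16/fp32 setting.
# T5-style models are numerically unstable in fp16, which is why
# LAVIS/mBLIP pin this region to bf16 explicitly.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out = model(x)

print(out.dtype)  # autocast casts the linear layer's output to bfloat16
```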
Ah okay! Got it, thanks for the explanation. :)
Hi! Thanks for publishing this awesome work, it's very inspirational for me :)
I am trying to understand your codebase and have a question about the quantized / mixed-precision training with Lightning:
In https://github.com/gregor-ge/mBLIP/blob/main/src/modules/modeling/mblip.py#L485: why do you have to use autocast here? Isn't it applied automatically by Lightning?