Closed — lnexenl closed this 3 months ago
I have not. LoRA would probably be the easiest way to go. RoMa is already quite expensive to train, so I didn't look into this. You could also try just unfreezing some layers near the end; a rough sketch is below.
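For reference, a minimal sketch of the "unfreeze a few layers near the end" option, assuming the public torch.hub DINOv2 backbone (attribute names like `blocks` follow that implementation; RoMa's own wrapper may expose it differently):

```python
import torch

# Load the DINOv2 ViT-L/14 backbone from torch.hub (assumed here for illustration).
dinov2 = torch.hub.load("facebookresearch/dinov2", "dinov2_vitl14")

# Freeze everything first, as in stock RoMa where DINOv2 is frozen.
for p in dinov2.parameters():
    p.requires_grad = False

# Unfreeze only the last N transformer blocks.
N = 2
for block in dinov2.blocks[-N:]:
    for p in block.parameters():
        p.requires_grad = True

# Hand only the unfrozen parameters to the optimizer, typically with a small LR.
trainable = [p for p in dinov2.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-5)
```

The same idea applies to a LoRA setup: keep the backbone frozen and train only low-rank adapters on the attention projections instead of full blocks.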
I also think it would take a lot of GPU resources to train, and I don't have cards with more than 32 GB of VRAM, so I've decided to keep it as the original. 😆
torch.amp is used throughout the model, and it clearly decreases VRAM usage and increases speed. The DINOv2 weights are frozen in RoMa, and the output of the float16 DINOv2 might differ slightly from the float32 version. I am wondering: have you ever tried training DINOv2 together with the rest of the model?
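One rough way to gauge how much the autocast (float16) DINOv2 features drift from float32 — a sketch assuming the torch.hub backbone and its `forward_features` output dict, not RoMa's actual code path:

```python
import torch

device = "cuda"
# Assumed backbone for illustration; RoMa wraps DINOv2 with its own code.
dinov2 = torch.hub.load("facebookresearch/dinov2", "dinov2_vitl14").to(device).eval()

# Input side must be a multiple of the 14-pixel patch size (518 = 37 * 14).
x = torch.randn(1, 3, 518, 518, device=device)

with torch.no_grad():
    # Reference features in full precision.
    feats_fp32 = dinov2.forward_features(x)["x_norm_patchtokens"]
    # Same forward pass under autocast, as torch.amp would run it.
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        feats_fp16 = dinov2.forward_features(x)["x_norm_patchtokens"]

diff = (feats_fp32 - feats_fp16.float()).abs()
print(f"max abs diff: {diff.max():.4e}, mean abs diff: {diff.mean():.4e}")
```

Since the weights are frozen either way, this kind of comparison only measures the numerical gap between the two precisions, not any training effect.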