Open BradNeuberg opened 3 months ago
I believe you used ITRA to fine-tune your model from the OpenCLIP LAION weights. It looks like ITRA is also from your research group? Did you end up using the OpenCLIP tooling to fine-tune, or your own ITRA?
Thank you very much for your attention! We performed full-parameter fine-tuning starting from the OpenCLIP weights; training the ViT-L-14 model took 5 hours on four 3090 GPUs. ITRA is indeed from our team, and we completed model training and evaluation with it. : )
I'd like to know more about how you all fine-tuned your model using the base OpenCLIP weights. How long did it take, and what GPUs did you end up using? We are thinking about fine-tuning RemoteCLIP itself with some more domain-specific imagery and want to get a general sense of the cost and time it took you all to do that yourselves. Thanks :)
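For anyone budgeting a similar run, here is a rough back-of-the-envelope sketch that uses only the figures quoted in this thread (5 hours on 4 GPUs for ViT-L-14). The helper names, the price per GPU-hour, and the data-scale factor are hypothetical placeholders, not numbers from the RemoteCLIP authors, and the linear scaling with dataset size is a simplifying assumption that may not hold in practice.

```python
def gpu_hours(wall_clock_hours: float, num_gpus: int) -> float:
    """Total GPU-hours consumed by a run (wall-clock time times GPU count)."""
    return wall_clock_hours * num_gpus


def estimated_cost(total_gpu_hours: float, price_per_gpu_hour: float) -> float:
    """Cost of a run at a flat hourly price per GPU."""
    return total_gpu_hours * price_per_gpu_hour


# Figures reported in this thread: ViT-L-14, 5 hours, 4 GPUs.
reported_gpu_hours = gpu_hours(wall_clock_hours=5, num_gpus=4)

# Hypothetical inputs: replace with your own provider's price and the
# ratio of your fine-tuning data volume to the original run's.
PRICE_PER_GPU_HOUR = 0.50  # placeholder value, not a quoted price
DATA_SCALE = 1.0           # 1.0 means "same amount of data as the reported run"

# Assumes training time grows linearly with data volume (a simplification).
projected_gpu_hours = reported_gpu_hours * DATA_SCALE
projected_cost = estimated_cost(projected_gpu_hours, PRICE_PER_GPU_HOUR)

print(f"Reported run: {reported_gpu_hours:.0f} GPU-hours")
print(f"Projected run: {projected_gpu_hours:.0f} GPU-hours, "
      f"about {projected_cost:.2f} at the placeholder price")
```

Different GPUs, batch sizes, or epoch counts would change the result, so treat this only as a starting point for an estimate.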