ChenDelong1999 / RemoteCLIP

🛰️ Official repository of paper "RemoteCLIP: A Vision Language Foundation Model for Remote Sensing" (IEEE TGRS)
https://arxiv.org/abs/2306.11029
Apache License 2.0

Fine tuning training time? #24

Open BradNeuberg opened 3 months ago

BradNeuberg commented 3 months ago

I'd like to know more about how you fine-tuned your model from the base OpenCLIP weights. How long did it take, and which GPUs did you use? We are thinking about fine-tuning RemoteCLIP itself on some more domain-specific imagery and want a general sense of the cost and time it took you. Thanks :)

BradNeuberg commented 3 months ago

I believe you used ITRA to fine-tune your model from the OpenCLIP weights pretrained on LAION. It looks like ITRA is also from your research group? Did you end up fine-tuning with the OpenCLIP tooling or with your own ITRA?

gzqy1026 commented 3 months ago

Thank you very much for your attention! We performed full-parameter fine-tuning starting from the OpenCLIP weights; training the ViT-L-14 model took 5 hours on four 3090 GPUs. ITRA is indeed from our team, and we used it for both model training and evaluation. : )
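
For a rough budget, the figures above (4 GPUs, 5 hours) can be converted into GPU-hours. The snippet below is only a back-of-the-envelope sketch: the function names are made up for illustration, and the hourly price is a hypothetical placeholder, not a number from this thread.

```python
def gpu_hours(num_gpus: int, wall_clock_hours: float) -> float:
    """Total GPU-hours consumed by a run on num_gpus devices."""
    return num_gpus * wall_clock_hours


def estimated_cost(total_gpu_hours: float, price_per_gpu_hour: float) -> float:
    """Rough cost of a run, given a per-GPU hourly price."""
    return total_gpu_hours * price_per_gpu_hour


# Figures reported in this thread for the ViT-L-14 run.
reported = gpu_hours(num_gpus=4, wall_clock_hours=5.0)

# ASSUMPTION: hypothetical rental price in USD per GPU-hour; substitute your own.
HYPOTHETICAL_PRICE = 0.50

print(f"GPU-hours: {reported}")
print(f"Estimated cost: {estimated_cost(reported, HYPOTHETICAL_PRICE):.2f} USD")
```

Actual time will depend on dataset size, batch size, and hardware, so treat this as a starting point for your own estimate.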