patrickjohncyh / fashion-clip

FashionCLIP is a CLIP-like model fine-tuned for the fashion domain.
MIT License

a bit of idea on finetuning #36

Closed: aretius closed this issue 3 months ago

aretius commented 3 months ago

Thanks a lot for the amazing work! I wanted to understand more about the finetuning process.

Thanks!

vinid commented 3 months ago

Thanks!

  1. Batch size played a pretty big role. We were eventually able to scale to 1,024 examples per batch, and that almost certainly helped. We initially started training with smaller batch sizes but saw some degradation in generalization.

  2. Cost and the availability of large GPUs were a big factor at the time of training.
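The batch-size effect in point 1 follows from how CLIP's contrastive objective works: every other example in the batch acts as a negative, so larger batches give each pair more (and harder) negatives. Below is a minimal numpy sketch of the symmetric InfoNCE loss CLIP-style models train with — an illustration of the objective, not the actual FashionCLIP training code, and the `temperature` value here is just a common default.

```python
import numpy as np

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss (CLIP-style). Each row i of image_emb is
    assumed to match row i of text_emb; all other rows in the batch are
    negatives, which is why batch size matters for this objective."""
    # L2-normalize so dot products are cosine similarities.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    logits = image_emb @ text_emb.T / temperature  # (batch, batch)
    labels = np.arange(len(logits))  # matching pairs sit on the diagonal

    def xent(l):
        # Cross-entropy of the diagonal against each row (log-softmax).
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # Average the image->text and text->image directions.
    return (xent(logits) + xent(logits.T)) / 2
```

Perfectly aligned pairs (identical image and text embeddings) drive the loss toward zero, while random pairs give roughly log(batch_size).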

aretius commented 3 months ago

Got it, I understand.

vinid commented 3 months ago
aretius commented 3 months ago

Makes sense. When I started playing around a bit, I needed to add a lot of optimisation to reach a batch size as large as 512. Given your experience, it's worth getting a larger instance with multiple GPUs to get a big batch size.
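One standard optimisation for fitting a large "virtual" batch in limited memory is gradient accumulation: sum gradients over several micro-batches and apply a single update. The toy numpy sketch below uses linear regression as a stand-in for the real model, just to show the mechanics. One caveat worth flagging for contrastive training: naive accumulation is not equivalent to a true large batch, because each example's negatives still come only from its own micro-batch.

```python
import numpy as np

def accumulated_sgd_step(w, X, y, micro_batch=128, lr=0.01):
    """One SGD step over a large 'virtual' batch, accumulating gradients
    across micro-batches that each fit in memory. Toy least-squares
    model; for per-example losses this is exactly a full-batch step."""
    grad = np.zeros_like(w)
    n = len(X)
    for start in range(0, n, micro_batch):
        xb = X[start:start + micro_batch]
        yb = y[start:start + micro_batch]
        err = xb @ w - yb
        grad += xb.T @ err  # accumulate the un-averaged gradient
    return w - lr * grad / n  # single update with the full-batch average
```

For a simple per-example loss like this one, the accumulated step matches a full-batch gradient step exactly; for a contrastive loss it only approximates it.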

Currently I am just using a standard train+valid split and was thinking of measuring just the loss here. Are you referring to a different dataset entirely, like the public ones (for my case)?

vinid commented 3 months ago

Hard to say, I think I'd probably start with the batch size you can get on a standard machine and see the quality of the final model.

I'd use external datasets: even if you are training on domain-specific data, you can probably also evaluate on MSCOCO just to see how much generalization power you lose.
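A common way to measure that generalization loss on an external dataset like MSCOCO is text-to-image Recall@K: embed captions and images with the fine-tuned model and check how often the matching image ranks in the top K. A minimal numpy sketch, assuming embeddings are already computed and index-aligned (caption i matches image i):

```python
import numpy as np

def recall_at_k(image_emb, text_emb, k=5):
    """Text->image Recall@K: fraction of captions whose matching image
    (same row index) appears in the top-K by cosine similarity. A big
    drop on a general dataset after fine-tuning signals lost
    generalization."""
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    sims = text_emb @ image_emb.T          # (num_texts, num_images)
    topk = np.argsort(-sims, axis=1)[:, :k]
    hits = (topk == np.arange(len(sims))[:, None]).any(axis=1)
    return hits.mean()
```

Comparing Recall@1/5/10 before and after fine-tuning gives a more direct read on retrieval quality than the validation loss alone.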