Closed: yhosoya66 closed this issue 4 months ago
Hi, please use this script to generate the text embeddings. Make sure to use the text encoder that corresponds to the ViT model.
Thank you for your prompt reply. It works when I set the '--cache_dir' argument to the target pre-trained weights.
Hi, thank you for your excellent work and the well-organized code you've shared. I really appreciate it!
I'd like to ask a few questions, if that's okay.
I'm interested in fine-tuning F-ViT from CLIPSelf (available in this repository) on a different dataset. For this purpose, we need to create embedding files like 'datasets/embeddings/coco_with_background_evaclip_vitb_16.pt', which are essentially text embeddings for the target dataset, right?
Here are my questions:
Thanks for your help.
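For anyone landing on this thread later, here is a rough sketch of what a class text-embedding table for a custom dataset might look like. It is not the repository's script. The prompt templates, the `toy_encoder` placeholder, the `build_class_embeddings` helper, and the idea of encoding the word "background" as an extra last row are all assumptions made for illustration. A real run would replace `toy_encoder` with the text encoder that matches the ViT model.

```python
import hashlib
import math
from typing import Callable, List, Sequence

Vector = List[float]

# Hypothetical prompt templates; the repository's script may use a different set.
TEMPLATES = ("a photo of a {}.", "a photo of the {} in the scene.")


def l2_normalize(vec: Sequence[float]) -> Vector:
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def toy_encoder(text: str, dim: int = 8) -> Vector:
    """Placeholder for a real CLIP text encoder: hashes text into a small vector."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 - 0.5 for b in digest[:dim]]


def build_class_embeddings(
    class_names: Sequence[str],
    encode: Callable[[str], Vector],
    templates: Sequence[str] = TEMPLATES,
    with_background: bool = True,
) -> List[Vector]:
    """Return one unit-length embedding per class, averaged over prompt templates."""
    # Assumption: the background row is obtained by encoding the word "background".
    names = list(class_names) + (["background"] if with_background else [])
    rows = []
    for name in names:
        # Encode every prompt, normalize each result, then average over templates.
        per_prompt = [l2_normalize(encode(t.format(name))) for t in templates]
        mean = [sum(col) / len(per_prompt) for col in zip(*per_prompt)]
        rows.append(l2_normalize(mean))
    return rows


embeddings = build_class_embeddings(["cat", "dog", "traffic cone"], toy_encoder)
print(len(embeddings), len(embeddings[0]))  # 3 classes + background, 8 dims each
```

With a real encoder, the rows would typically be stacked into a single tensor and written with `torch.save` to a `.pt` file like the one mentioned above. Whether the fine-tuning code expects the background row, and in which position, should be checked against the repository's own script.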