One way you can do this with no code change is to serve your images over HTTP (for example with nginx), build a dataset of URLs pointing to your nginx server, and then run the whole pipeline: img2dataset, clip-retrieval inference, the index, and then the backend.
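For illustration, that route could be scripted roughly like this. It is only a sketch: the file names, column names, folder names, and CLI flags below are placeholders to adapt from the img2dataset and clip-retrieval docs, not a tested recipe.

```python
# Rough sketch of the no-code-change route; paths and flags are assumptions.
import subprocess
from img2dataset import download

# 1. Download and resize the images behind the nginx-served URL list.
download(
    url_list="my_urls.parquet",   # placeholder parquet with one row per (url, caption)
    input_format="parquet",
    url_col="url",
    caption_col="caption",
    output_folder="images",
)

# 2. Embed the images/captions, build the index, and serve it with the CLI.
subprocess.run(["clip-retrieval", "inference", "--input_dataset", "images",
                "--output_folder", "embeddings"], check=True)
subprocess.run(["clip-retrieval", "index", "--embeddings_folder", "embeddings",
                "--index_folder", "index"], check=True)
subprocess.run(["clip-retrieval", "back", "--port", "1234",
                "--indices-paths", "indices_paths.json"], check=True)
```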
To support a custom CLIP checkpoint, I think we'd need a new option. Maybe you can add it?
Do you mean a new issue to discuss "how to load a custom CLIP checkpoint", or a new function that I need to write myself?
A new feature you could contribute, yes. I don't think it's a lot of work. OpenCLIP already supports it, so it's really only a matter of passing the path through from a clip-retrieval option to OpenCLIP.
Yes, you just need to replace these lines:

```python
import torch
import open_clip

# Load the fine-tuned checkpoint into the matching OpenCLIP architecture.
device = torch.device("cuda:3" if torch.cuda.is_available() else "cpu")
pretrained_model_path = "path/to/your/finetuned_model.pth"
model, _, preprocess = open_clip.create_model_and_transforms('ViT-B-32', pretrained=pretrained_model_path, device=device)
tokenizer = open_clip.get_tokenizer('ViT-B-32')
model.to(device)
```
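Continuing from the snippet above, a quick (untested) sanity check could look like this; the image path and the caption are just placeholders:

```python
from PIL import Image

# Encode one image and one caption with the fine-tuned model loaded above.
image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)
text = tokenizer(["a query caption"]).to(device)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # L2-normalise so the dot product is the cosine similarity used for retrieval.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    print((image_features @ text_features.T).item())
```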
When I finish my thesis before the deadline, I think I will.
It is added now.
Where do these lines need to be replaced?
Hello, this project is really great work!
I built a custom image-text pair dataset containing 7.45 million pairs, and I have fine-tuned the CLIP-B-32 and CLIP-L-14 models on it.
Now I want to use the framework to search for images with a query caption. My question is:
**1. How can I efficiently extract the image and corresponding caption feature tensors?**
Is there a tutorial or reference link?
Thank you for your help!
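For reference, a minimal batched extraction sketch with open_clip could look like the following; the file names, captions, batch size, and checkpoint path are placeholders, and for 7.45M pairs the clip-retrieval inference step discussed above will handle this at scale much better.

```python
import torch
import open_clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="path/to/your/finetuned_model.pth", device=device
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

# Placeholder (image path, caption) pairs; in practice read them from the dataset.
pairs = [("img_0.jpg", "a caption"), ("img_1.jpg", "another caption")]
batch_size = 256

image_feats, text_feats = [], []
with torch.no_grad():
    for i in range(0, len(pairs), batch_size):
        batch = pairs[i:i + batch_size]
        images = torch.stack([preprocess(Image.open(path)) for path, _ in batch]).to(device)
        texts = tokenizer([caption for _, caption in batch]).to(device)
        img = model.encode_image(images)
        txt = model.encode_text(texts)
        # Normalise so dot products are cosine similarities (what the index expects).
        image_feats.append((img / img.norm(dim=-1, keepdim=True)).cpu())
        text_feats.append((txt / txt.norm(dim=-1, keepdim=True)).cpu())

image_feats = torch.cat(image_feats)  # shape (N, 512) for ViT-B-32
text_feats = torch.cat(text_feats)
```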