-
I implemented a text-to-image search: the query text goes through the text encoder, the images go through the image encoder, and I retrieve the top images for the query. It doesn't work well with CLOOB, though.
What is t…
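
For reference, the retrieval step described above can be sketched as follows. This is a minimal illustration, not the poster's code: `l2_normalize` and `cosine_top_k` are hypothetical helpers, the embeddings are toy values, and it assumes the text and image encoders project into the same embedding space. Skipping the normalization step, or comparing embeddings taken before the final projection, is a common reason rankings look poor.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_top_k(query_emb, image_embs, k=3):
    """Return (image_index, cosine_similarity) pairs for the k best matches."""
    q = l2_normalize(query_emb)
    scored = []
    for idx, emb in enumerate(image_embs):
        v = l2_normalize(emb)
        scored.append((sum(a * b for a, b in zip(q, v)), idx))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:k]]


# Toy embeddings standing in for real encoder outputs (assumed values).
query = [1.0, 0.0, 0.0]
images = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.0]]
top = cosine_top_k(query, images, k=2)
print(top)
```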
-
The motivation here is to add a layer of abstraction to facilitate integrating new perceptors like CLOOB, SLIP, BLIP, etc. The ultimate goal is for this to be an independent library; see https://github.com/Eleuthe…
dmarx updated
2 years ago
-
First of all, great work!!
I strongly believe this model has made a big contribution to the Vision-and-Language community in Japan.
I find there is no description about the initialization of the vis…
-
Hi, I am about to train my own cloob latent diffusion and would like to confirm this is right.
`autoencoder_scale` in your example was about 100 but I got something like 6.85.
It depends on the tr…
-
First, it's amazing that you are trying to recreate this.
My question is: do you plan to use your X-CLIP implementation, or just the vanilla OpenAI CLIP? From what I gathered from the official …
-
Great work!
Is it possible to have both the pretrained model and the configuration files to test the notebook? I found models and datasets at:
https://ml.jku.at/research/CLOOB/downloads
but when I run the …
-
Any insight on how to take the image/text embeddings (or nominal model forward output) to achieve a simple similarity score as done in the huggingface implementation? [HF example here](https://hugging…
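
One way to get such a score, sketched under assumptions: L2-normalize both embeddings, take scaled dot products, and apply a softmax over the candidate texts, which mirrors the `logits_per_image` plus softmax pattern in the Hugging Face CLIP example. The helper names are hypothetical, the embeddings are toy values, and `scale=100.0` is an illustrative choice, not a value taken from CLOOB.

```python
import math


def normalize(vec):
    """L2-normalize a vector."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def similarity_probs(image_emb, text_embs, scale=100.0):
    """Softmax over scaled cosine similarities between one image and several texts.

    `scale` plays the role of a logit scale (inverse temperature); the value
    here is an assumption chosen for illustration only.
    """
    img = normalize(image_emb)
    logits = []
    for text_emb in text_embs:
        txt = normalize(text_emb)
        logits.append(scale * sum(a * b for a, b in zip(img, txt)))
    # Subtract the max logit for numerical stability before exponentiating.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy embeddings standing in for real model outputs (assumed values).
image_embedding = [0.2, 0.9, 0.1]
text_embeddings = [[0.1, 1.0, 0.0], [1.0, 0.0, 0.0]]
probs = similarity_probs(image_embedding, text_embeddings)
print(probs)
```

If only a raw similarity is needed, the cosine value before scaling and softmax can be used directly.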
-
Also to do: add tests for this model to the MLF test script.
```
2022-05-07 19:33:23.102 | INFO | __main__:parse_scenes:133 - Prompts loaded.
2022-05-07 19:33:23.110 | INFO | __main__:do_run:5…
```
dmarx updated
2 years ago
-
![image](https://user-images.githubusercontent.com/66733562/154802886-52d01a24-9a68-4a2d-b8e3-24353a8a722f.png)
Any idea why this import isn't working?
-
Why is the existing code for the Modern Hopfield Net (which has its own GitHub repository) not used here? And if I wanted to use it instead, what arguments would I have to call it with to get the same…