lucidrains / imagen-pytorch

Implementation of Imagen, Google's Text-to-Image Neural Network, in Pytorch
MIT License
8.05k stars, 763 forks

Support for Re-Imagen #306

Open: monatis opened 1 year ago

monatis commented 1 year ago

The follow-up paper, Re-Imagen: Retrieval-Augmented Text-to-Image Generator, describes a model that accesses an external memory and prepends retrieved similar entries to the input when generating. I can implement it in a PR with Qdrant as the vector DB if you would like to support it in this repo.
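For readers new to the idea, here is a minimal sketch of "retrieve similar entries, then prepend them to the input", in plain Python. It is only an illustration of the flow described above, not the paper's architecture or anything in this repo: the names `retrieve_top_k` and `build_conditioned_input`, the use of cosine similarity, and the choice to represent memory entries as (embedding, caption) pairs are all assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
MemoryEntry = Tuple[Vector, str]  # (embedding, caption); a hypothetical layout


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(query: Vector, memory: List[MemoryEntry], k: int) -> List[MemoryEntry]:
    """Brute-force search: return the k memory entries most similar to the query."""
    ranked = sorted(
        memory,
        key=lambda entry: cosine_similarity(query, entry[0]),
        reverse=True,
    )
    return ranked[:k]


def build_conditioned_input(
    prompt: str, query: Vector, memory: List[MemoryEntry], k: int = 2
) -> List[str]:
    """Prepend the captions of the retrieved neighbors to the prompt."""
    neighbors = retrieve_top_k(query, memory, k)
    return [caption for _, caption in neighbors] + [prompt]
```

In a real system the brute-force loop would be replaced by a vector index, and the retrieved entries would be fed to the model as conditioning rather than as plain strings.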

lucidrains commented 1 year ago

@monatis oh hey Monatis! long time

yea i can build this, it is straightforward

why Qdrant and not just something like faiss?

monatis commented 1 year ago

Thank you @lucidrains, kudos to you for the awesome work. Unlike Faiss, which works as a library, Qdrant is a complete vector DB with support for distributed mode, filterable payloads of rich data types, multiple vectors per record (say, paired image and text vectors stored in the same data point), etc. I believe it can open up more research opportunities in the retrieval-augmented paradigm. I'm also working on indexing LAION datasets in Qdrant and sharing them as a snapshot, which should make it easier to get started with experiments.
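To make "multiple vectors per record" and "filterable payload" concrete, here is a small plain-Python sketch of the data model. It deliberately does not use Qdrant's client API; `Record`, `search`, the vector names `"image"` and `"text"`, and the dot-product scoring are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Record:
    """One data point holding several named vectors plus arbitrary metadata."""
    id: int
    vectors: Dict[str, List[float]]  # e.g. {"image": [...], "text": [...]}
    payload: Dict[str, Any] = field(default_factory=dict)


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def search(
    records: List[Record],
    vector_name: str,
    query: List[float],
    k: int,
    payload_filter: Optional[Callable[[Dict[str, Any]], bool]] = None,
) -> List[Record]:
    """Rank records by dot product on one named vector, after filtering on payload."""
    candidates = [
        r for r in records
        if vector_name in r.vectors
        and (payload_filter is None or payload_filter(r.payload))
    ]
    candidates.sort(key=lambda r: dot(query, r.vectors[vector_name]), reverse=True)
    return candidates[:k]
```

The same records can then be searched by image embedding or by text embedding, optionally restricted by a metadata condition such as a license field.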

lucidrains commented 1 year ago

@monatis ahh i see, i think it is probably best to stick with faiss given the licensing

faiss should be good enough for use in production. many companies use it

lucidrains commented 1 year ago

consulted an expert, and he gave a good suggestion to build out an interface so that any of these services besides faiss could be plugged in

I'll think about it
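One way such a pluggable interface could look, as a rough sketch only: an abstract base class that any backend (faiss, Qdrant, or something else) would implement, plus a brute-force in-memory backend for testing. Nothing here is from this repo; `RetrievalBackend`, `InMemoryBackend`, and the `add` / `search` method names are assumptions made for the example.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Sequence, Tuple

Vector = Sequence[float]
Payload = Dict[str, Any]


class RetrievalBackend(ABC):
    """Hypothetical interface a vector store would implement to be pluggable."""

    @abstractmethod
    def add(self, vectors: Sequence[Vector], payloads: Sequence[Payload]) -> None:
        """Store vectors together with their payloads."""

    @abstractmethod
    def search(self, query: Vector, k: int) -> List[Tuple[float, Payload]]:
        """Return up to k (distance, payload) pairs, nearest first."""


class InMemoryBackend(RetrievalBackend):
    """Brute-force reference backend using squared L2 distance."""

    def __init__(self) -> None:
        self._vectors: List[Vector] = []
        self._payloads: List[Payload] = []

    def add(self, vectors: Sequence[Vector], payloads: Sequence[Payload]) -> None:
        if len(vectors) != len(payloads):
            raise ValueError("vectors and payloads must have the same length")
        self._vectors.extend(vectors)
        self._payloads.extend(payloads)

    def search(self, query: Vector, k: int) -> List[Tuple[float, Payload]]:
        scored = [
            (sum((q - v) ** 2 for q, v in zip(query, vec)), payload)
            for vec, payload in zip(self._vectors, self._payloads)
        ]
        # Sort on distance only, so ties never fall back to comparing dicts.
        scored.sort(key=lambda pair: pair[0])
        return scored[:k]


def retrieve_payloads(backend: RetrievalBackend, query: Vector, k: int) -> List[Payload]:
    """Caller-side code depends only on the interface, not on a concrete store."""
    return [payload for _, payload in backend.search(query, k)]
```

A faiss-backed or Qdrant-backed class would subclass `RetrievalBackend` in the same way, and the generation code would only ever call `retrieve_payloads`.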

monatis commented 1 year ago

Yeah, fair enough. We can implement support for them on top of that interface. Btw, I'm not a licensing expert, but I think Qdrant's Apache 2.0 license should also be fine for production use cases; there are companies using it. Apache 2.0 may even be preferable to MIT for some companies, as it explicitly grants patent rights.

lucidrains commented 1 year ago

@monatis ohh got it, good to know! i'm not well versed with these licenses, all i know is MIT is the most unrestrictive

jacobwjs commented 1 year ago

@monatis This would be a great feature. I'm really interested in this. Any updates?

Nana2929 commented 11 months ago

I would appreciate it if Re-Imagen could be implemented and integrated. Looking forward to the release.

niatzt commented 8 months ago

Looking forward to the implementation.