LAION-AI / dalle2-laion

Pretrained Dalle2 from laion

Add support for `cog` (WIP) #3

Closed · afiaka87 closed this 1 year ago

afiaka87 commented 2 years ago

This PR adds support for the tool `cog`, which sets up a Docker container for prediction.

It can be run with:

```sh
cog predict -i prompts="my text"
```

Or you can set up a Flask endpoint like this:

```sh
cog build -t my-dalle2-image
docker run -d -p 5000:5000 --gpus=all my-dalle2-image
curl http://localhost:5000/predictions -X POST -H "Content-Type: application/json" \
  -d '{"input": {
    "text_input": "...",
    "prior_num_candidates": "...",
    "prior_guidance_scale": "...",
    "img_decoder_num_generations": "...",
    "decoder_guidance_scale": "..."
  }}'
```
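
For reference, `cog` drives everything through a `Predictor` class in `predict.py`; it ends up looking roughly like the sketch below (parameter names mirror the curl example above, but the defaults and the `dalle2_laion` loading/sampling calls are placeholders, not the real API):

```python
# predict.py -- rough sketch; `load_model` / `sample` below are
# placeholders, not the actual dalle2_laion API.
from cog import BasePredictor, Input, Path


class Predictor(BasePredictor):
    def setup(self):
        # Runs once when the container starts: load the prior + decoder.
        from dalle2_laion import load_model  # hypothetical loader

        self.model = load_model(device="cuda")

    def predict(
        self,
        text_input: str = Input(description="Text prompt"),
        prior_num_candidates: int = Input(default=2, description="Embeddings to sample from the prior"),
        prior_guidance_scale: float = Input(default=4.0, description="Guidance scale for the prior"),
        img_decoder_num_generations: int = Input(default=1, description="Images to decode per embedding"),
        decoder_guidance_scale: float = Input(default=3.5, description="Guidance scale for the decoder"),
    ) -> Path:
        # Sample an image embedding with the prior, then decode it to pixels.
        image = self.model.sample(
            text_input,
            prior_num_candidates=prior_num_candidates,
            prior_guidance_scale=prior_guidance_scale,
            img_decoder_num_generations=img_decoder_num_generations,
            decoder_guidance_scale=decoder_guidance_scale,
        )
        out_path = Path("/tmp/out.png")
        image.save(out_path)  # assumes a PIL image
        return out_path
```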

I intend to add some docs to the README for this, as well as fix some bugs and get the CLIP reranking working.

Veldrovive commented 2 years ago

Great addition! Just to let you know, simple top-1 reranking is built into prior sampling via the `num_samples_per_batch` parameter. I see you are using a more complicated reranking scheme, so it's not quite the same, but that is a simpler method. Also, since the default `num_samples_per_batch` is 2, reranking 20 prior samples is actually taking the top of 40 generated embeddings. The notebook does that too, but if you're doing manual reranking I'd suggest setting it to 1, just so it's honest about how many candidates are being reranked.
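
To make the arithmetic concrete (only `num_samples_per_batch` below is a real parameter; the other names are illustrative):

```python
# Illustrative sketch of the built-in top-1 reranking arithmetic.
num_prior_samples = 20     # embeddings you ask the prior for
num_samples_per_batch = 2  # default: the prior samples 2 internally, keeps the CLIP top-1

# Each embedding the prior returns is already the best of
# `num_samples_per_batch` candidates, so manually reranking the 20
# returned embeddings is really a top-1 over:
effective_candidates = num_prior_samples * num_samples_per_batch
print(effective_candidates)  # 40

# For an honest "best of 20" manual rerank, disable the built-in step:
num_samples_per_batch = 1
```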

afiaka87 commented 2 years ago

@Veldrovive thanks ha, that makes much more sense.

afiaka87 commented 2 years ago

I currently have an unpushed version that adds support for upscaling using the released GLIDE upsampler - is that something we would consider adding to the main branch here until this repo's upscalers are trained?

nousr commented 2 years ago

> I currently have an unpushed version that adds support for upscaling using the released GLIDE upsampler - is that something we would consider adding to the main branch here until this repo's upscalers are trained?

Could you share some results using the GLIDE upsampler? I experimented with it one afternoon and got subpar results, but it would be awesome if you have a better implementation than I did.

afiaka87 commented 2 years ago

@nousr

Sure thing. Yeah it's not the best upsampler, but it was trained specifically for 64x64 -> 256x256 and is "only" ~350M params, so that's convenient at least.
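
For anyone curious, this is roughly how the released upsampler gets wired up via OpenAI's glide-text2im package (adapted from their sampling notebook; a sketch, not necessarily the exact code in my branch):

```python
# Sketch following openai/glide-text2im's sampling notebook; assumes
# `low_res_batch` is an (N, 3, 64, 64) tensor of decoder outputs
# scaled to [-1, 1].
import torch as th
from glide_text2im.download import load_checkpoint
from glide_text2im.model_creation import (
    create_model_and_diffusion,
    model_and_diffusion_defaults_upsampler,
)

device = th.device("cuda" if th.cuda.is_available() else "cpu")

# Build the ~350M-param 64 -> 256 upsampler and load the released weights.
options_up = model_and_diffusion_defaults_upsampler()
options_up["use_fp16"] = device.type == "cuda"
options_up["timestep_respacing"] = "fast27"  # few-step schedule for upsampling
model_up, diffusion_up = create_model_and_diffusion(**options_up)
model_up.eval()
if options_up["use_fp16"]:
    model_up.convert_to_fp16()
model_up.to(device)
model_up.load_state_dict(load_checkpoint("upsample", device))


def upsample(low_res_batch, prompt, upsample_temp=0.997):
    batch_size = low_res_batch.shape[0]
    tokens = model_up.tokenizer.encode(prompt)
    tokens, mask = model_up.tokenizer.padded_tokens_and_mask(
        tokens, options_up["text_ctx"]
    )
    model_kwargs = dict(
        low_res=low_res_batch.to(device),
        tokens=th.tensor([tokens] * batch_size, device=device),
        mask=th.tensor([mask] * batch_size, dtype=th.bool, device=device),
    )
    up_shape = (batch_size, 3, options_up["image_size"], options_up["image_size"])
    # A noise temperature slightly below 1.0 tends to give sharper results.
    return diffusion_up.ddim_sample_loop(
        model_up,
        up_shape,
        noise=th.randn(up_shape, device=device) * upsample_temp,
        device=device,
        clip_denoised=True,
        model_kwargs=model_kwargs,
    )
```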

nousr commented 1 year ago

Just closing this as it's been stale for quite some time now.

Stable Diffusion happened, and we also moved a bunch of stuff around in this repo -- if there's still interest, of course feel free to re-open.