Higher resolution output

Hello! We only use the first two stages of DeepFloyd IF, which together go up to 256x256 (stage 1 generates at 64x64 and stage 2 upsamples to 256x256). DeepFloyd IF also has a third stage, which is just the Stable Diffusion x4 upscaler and produces 1024x1024 images. You could try incorporating that, but you might not get great results because it's a latent diffusion model, and our method works better with pixel diffusion models (you can check the paper for details). The only other pixel diffusion model that I'm aware of is Imagen, which makes images up to 1024x1024. Unfortunately, Imagen is not publicly available. If there are other pixel-based diffusion models, I would expect them to work with our method (if you know of any, I would love to hear about them!).
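If you want to experiment with the third stage, here is a minimal sketch of running the Stable Diffusion x4 upscaler on an already generated 256x256 result through the `diffusers` library. This is not part of the visual_anagrams code. The function name `upscale_to_1024` and the `noise_level` default are assumptions for illustration. Note that this only conditions on a single prompt and a single view, so the illusion in the other view may well degrade.

```python
# Hypothetical post-processing sketch; not the method from the paper or the repo.
UPSCALER_ID = "stabilityai/stable-diffusion-x4-upscaler"
UPSCALE_FACTOR = 4  # 256x256 input -> 1024x1024 output


def upscale_to_1024(image, prompt, noise_level=20, device="cuda"):
    """Upsample one 256x256 PIL image with the Stable Diffusion x4 upscaler.

    `image` is a PIL.Image in RGB mode; `prompt` is the text prompt for one view.
    `noise_level` is an assumed starting value and may need tuning.
    """
    # Imported lazily so the heavy dependencies load only when the function runs.
    import torch
    from diffusers import StableDiffusionUpscalePipeline

    # Downloads the upscaler weights on first use.
    pipe = StableDiffusionUpscalePipeline.from_pretrained(
        UPSCALER_ID, torch_dtype=torch.float16
    ).to(device)

    # The pipeline returns an output object whose `images` field is a list of PIL images.
    result = pipe(prompt=prompt, image=image, noise_level=noise_level)
    return result.images[0]
```

To try it, load your 256x256 output with PIL, convert it to RGB, and pass it in along with the prompt for the view you care most about. Then flip or rotate the 1024x1024 result to see how well the second view survived.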

dangeng / visual_anagrams
