afiaka87 / clip-guided-diffusion

A CLI tool / Python module for generating images from text using guided diffusion and CLIP from OpenAI.

Multi GPU Support #16

Closed · rlallen-nps closed this issue 3 years ago

rlallen-nps commented 3 years ago

Any thoughts on building multi-GPU support via `torch.nn.DataParallel`?
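
To be concrete, I mean wrapping the model so each forward pass is split across the visible GPUs. A minimal sketch of what that looks like in general (the model below is a placeholder, not this repo's actual network):

```python
import torch
import torch.nn as nn

# Minimal illustration of torch.nn.DataParallel. The module here is a
# placeholder stand-in for the diffusion network, purely to show the wrapping.
model = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU())
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # scatter each batch across all visible GPUs
model = model.to("cuda")

batch = torch.randn(8, 3, 256, 256, device="cuda")
out = model(batch)  # sub-batches run in parallel; outputs gather on cuda:0
```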

afiaka87 commented 3 years ago

@rlallen-nps sounds interesting; unfortunately I don't have access to the compute, so I'm a bit lacking in motivation to get it working.

I'm not entirely certain how multi-GPU inference would work with this code base. It seems like a lot of work when there's already a device argument that should let you run inference on multiple GPUs, just toward different generations.
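
For instance, something like this would run one independent generation per GPU by pinning each child process to a single device. The bare `cgd <prompt>` invocation is a guess on my part, so check the README for the actual CLI arguments:

```python
import os
import subprocess

# Sketch of the one-generation-per-GPU approach. CUDA_VISIBLE_DEVICES
# restricts each child process to a single card, so the tool's default
# device selection lands on a different GPU in each process.
prompts = [
    "a watercolor painting of a lighthouse",
    "an armchair in the shape of an avocado",
]
procs = []
for gpu_index, prompt in enumerate(prompts):
    env = {**os.environ, "CUDA_VISIBLE_DEVICES": str(gpu_index)}
    # The bare `cgd <prompt>` invocation is an assumption; adjust to the
    # installed CLI's actual arguments.
    procs.append(subprocess.Popen(["cgd", prompt], env=env))
for proc in procs:
    proc.wait()
```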

rlallen-nps commented 3 years ago

Check out datacrunch.io for cheap GPUs. The point of distributing one run over multiple GPUs is not to process more images, it's to process one generation at a much higher resolution.

afiaka87 commented 3 years ago

> Check out datacrunch.io for cheap GPUs.

I'm aware, but like I said, I'm not super motivated by this. If I were to rent one, it would be to max out settings for a single GPU, which is easily achievable on an RTX 3090 or an A100, but I have no intention at this time of spending money on this project. Fortunately, I do all my testing on an RTX 2070 that I don't have to pay rent for.

> The point of distributing one run over multiple GPUs is not to process more images, it's to process one generation at a much higher resolution.

The guided-diffusion checkpoints have hard size constraints and must be trained from scratch for different resolutions. The largest is the 512-pixel checkpoint (which Katherine Crowson fine-tuned to be unconditional).

The code for guided-diffusion (the fork of OpenAI's repo) uses MPI for distributed training, I think? If you want to increase the resolution of the generations, that's where I would start.
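
From memory, the launch pattern in that repo's README looks roughly like the following; the script path and flags are from their docs and may have changed, so treat them as assumptions and check the current repo:

```python
import subprocess

# Hedged sketch of guided-diffusion's MPI-based distributed training launch.
# mpiexec spawns one Python rank per GPU; the flags below mirror the
# guided-diffusion README but should be verified against the current repo.
num_gpus = 4
subprocess.run(
    [
        "mpiexec", "-n", str(num_gpus),
        "python", "scripts/image_train.py",
        "--data_dir", "/path/to/training_images",
        "--image_size", "256",
        "--batch_size", "8",
    ],
    check=True,
)
```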

afiaka87 commented 3 years ago

I believe https://github.com/AranKomat/Diff-DALLE is also looking into training guided diffusion on a transformer in the style of DALL-E. There will definitely be interesting developments from that repository in the coming months; I fully expect it to surpass this method, and for a few good checkpoints to be released as well.

rlallen-nps commented 3 years ago

> The guided-diffusion checkpoints have hard size constraints and must be trained from scratch for different resolutions. The largest is the 512-pixel checkpoint (which Katherine Crowson fine-tuned to be unconditional).

Ah, I see; I was still thinking in VQGAN mode. I'll let you know if I find anything interesting, and thanks for the Diff-DALLE recommendation!