huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

Asynchronous multi-GPU, multi-model training with a shared replay buffer #151

Closed: richardrl closed this issue 2 years ago

richardrl commented 2 years ago

Let's say we have multiple diffusion models, as in the Cascaded Diffusion Models paper.

Is there an easy way to set up training such that each conditional model is trained simultaneously on a different GPU? What about a shared replay buffer that each conditional model can access?
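
For concreteness, here is a rough sketch of the kind of setup I mean: one process per GPU, each training one stage of the cascade, all sharing a replay buffer. Everything here (the `train_stage` helper, the Manager-backed list, the stand-in linear model) is hypothetical and only illustrates the question:

```python
# Hypothetical sketch only: one training process per GPU, each owning one
# stage of the cascade, all sharing a replay buffer through a Manager.
import random

import torch
import torch.multiprocessing as mp


def train_stage(stage_idx, device, buffer, lock, steps=1000):
    # Stand-in for a real diffusion model; each process trains on its own GPU.
    model = torch.nn.Linear(16, 16).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    for _ in range(steps):
        # Publish this stage's latest output so other stages can see it.
        with lock:
            buffer.append((stage_idx, torch.randn(16).tolist()))
            if len(buffer) > 10_000:
                buffer.pop(0)  # crude FIFO eviction
        # Sample one shared experience and take a gradient step on it.
        with lock:
            _, sample = buffer[random.randrange(len(buffer))]
        x = torch.tensor(sample, device=device)
        loss = (model(x) - x).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()


if __name__ == "__main__":
    mp.set_start_method("spawn")
    manager = mp.Manager()
    buffer, lock = manager.list(), manager.Lock()
    processes = [
        mp.Process(target=train_stage, args=(i, f"cuda:{i}", buffer, lock))
        for i in range(torch.cuda.device_count())
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
```

(A Manager-backed list is simple but slow; I assume something like shared-memory tensors or a `torch.distributed` store would be needed to make this scale, which is part of what I'm asking about.)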

anton-l commented 2 years ago

Hi @richardrl! For now, the easiest way to train a multi-stage pipeline is to run a separate training script for each stage (e.g. train the super-resolution diffusion model on its own dataset of LR-HR image pairs), similar to GLIDE, Imagen, etc. The conditioning augmentations described in the Cascaded Diffusion paper should give a nice boost too.
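
For illustration, here's a rough sketch of what such a stage-2 (super-resolution) training loop could look like with `diffusers` building blocks. The data loader is a random-tensor stand-in, the hyperparameters are placeholders, and the channel-concatenation conditioning follows SR3/Cascaded Diffusion rather than any script in this repo:

```python
# Minimal sketch: train a super-resolution diffusion model on LR-HR pairs
# by concatenating the upsampled LR image to the noisy HR image channel-wise.
import torch
import torch.nn.functional as F
from diffusers import DDPMScheduler, UNet2DModel

# 6 input channels: 3 for the noisy HR image + 3 for the upsampled LR condition.
model = UNet2DModel(sample_size=64, in_channels=6, out_channels=3)
scheduler = DDPMScheduler(num_train_timesteps=1000)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)


def lr_hr_batches(num_batches=10, batch_size=2):
    # Placeholder data: random tensors standing in for real LR-HR image pairs.
    for _ in range(num_batches):
        hr = torch.randn(batch_size, 3, 64, 64)
        lr = F.interpolate(hr, size=(16, 16), mode="bilinear")
        yield lr, hr


for lr, hr in lr_hr_batches():
    noise = torch.randn_like(hr)
    timesteps = torch.randint(0, scheduler.config.num_train_timesteps, (hr.shape[0],))
    noisy_hr = scheduler.add_noise(hr, noise, timesteps)
    lr_up = F.interpolate(lr, size=hr.shape[-2:], mode="bilinear")  # upsample the condition
    # Condition by channel concatenation, then predict the added noise.
    pred = model(torch.cat([noisy_hr, lr_up], dim=1), timesteps).sample
    loss = F.mse_loss(pred, noise)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Since each stage only needs its own data, you can launch one such script per GPU (e.g. by restricting each run to a single device) and train all stages in parallel without any cross-process coordination.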

patrickvonplaten commented 2 years ago

Closing now @richardrl - please ping us if you feel your question hasn't been fully answered.