Add multi aspect ratio and multi crop LCM training script

huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

https://huggingface.co/docs/diffusers

Apache License 2.0

Add multi aspect ratio and multi crop LCM training script #5876

Open nbardy opened 7 months ago

nbardy commented 7 months ago

None of the diffusers training scripts supports multiple crops or multiple aspect ratios in the data loader, even though the SDXL technical report describes both as part of its training process.
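To make the request concrete, here is a minimal sketch of what such a data loader could do: assign each image to the nearest aspect-ratio bucket, pick a random crop that fits the bucket, and build batches that share one bucket. This is only an illustration under stated assumptions. The bucket list and the names `BUCKETS`, `nearest_bucket`, `crop_box`, and `bucketed_batches` are hypothetical and are not taken from the SDXL report or from any diffusers script.

```python
import random
from collections import defaultdict

# Illustrative (width, height) buckets close to 1024 * 1024 pixels.
# Assumption: this is NOT the bucket list from the SDXL report.
BUCKETS = [
    (1024, 1024), (1152, 896), (896, 1152),
    (1216, 832), (832, 1216), (1344, 768), (768, 1344),
]


def nearest_bucket(width, height, buckets=BUCKETS):
    """Return the bucket whose aspect ratio is closest to the image's."""
    ratio = width / height
    return min(buckets, key=lambda b: abs(b[0] / b[1] - ratio))


def crop_box(width, height, bucket, rng=random):
    """Scale the image so it covers the bucket, then pick a random crop.

    Returns the resized (width, height) and the crop's (top, left) offset.
    The offset is the kind of value that could be fed to the model as crop
    conditioning.
    """
    bw, bh = bucket
    scale = max(bw / width, bh / height)
    # Guard against rounding leaving the resized image one pixel too small.
    rw = max(bw, round(width * scale))
    rh = max(bh, round(height * scale))
    top = rng.randint(0, rh - bh)
    left = rng.randint(0, rw - bw)
    return (rw, rh), (top, left)


def bucketed_batches(sizes, batch_size, rng=random):
    """Group image indices so that every batch uses a single bucket."""
    groups = defaultdict(list)
    for idx, (w, h) in enumerate(sizes):
        groups[nearest_bucket(w, h)].append(idx)
    batches = []
    for bucket, indices in groups.items():
        rng.shuffle(indices)
        for i in range(0, len(indices), batch_size):
            batches.append((bucket, indices[i:i + batch_size]))
    rng.shuffle(batches)
    return batches
```

In a real script, the output of something like `bucketed_batches` would probably feed a custom batch sampler, so that all images in a batch can be stacked into one tensor of the bucket's resolution.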

In my tests, diffusion models struggle with high-resolution details. I implemented this for base SD 1.5 and was planning to port the code to the SDXL LCM scripts.

@sayakpaul I know you wanted to keep the base training scripts simple, so I wanted to open a discussion here first. Is this something I should upstream, or would it be better as something like a community script?

The LCM models are fantastic, but fall short of the base SDXL model on high-resolution details. I'm hoping this can fix some of that.

References

LCM vs LDM inference report (still need to add multiple aspect ratios to the report): https://wandb.ai/nbardy-facet/LCM-Inference-Test/reports/SDXL--Vmlldzo2MDAzNDU1

LCM LoRA, 4 steps

Base SDXL, 30 steps

SDXL technical report aspect ratios.

patrickvonplaten commented 7 months ago

cc @patil-suraj wdyt?

github-actions[bot] commented 6 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

patil-suraj commented 6 months ago

Hey @nbardy! This would be awesome to have. To start with, we could add a new script under research_projects.

sayakpaul commented 5 months ago

@nbardy we're still happy to accept a contribution of your script to research_projects. If the community finds it increasingly useful, we can think of graduating it to an officially supported script.

We will of course include your script in the documentation and communicate about it.

yiyixuxu commented 4 months ago

Gentle ping @nbardy, let us know if you're still interested in adding this to the research folder.

nbardy commented 4 months ago

Thanks for the ping. I have a version of this on a fork that I can clean up for upstream.

sayakpaul commented 3 months ago

We are still very interested in this @nbardy!
