huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

Add a section about how training data resizing might affect the quality of the end models #6397

Open sayakpaul opened 9 months ago

sayakpaul commented 9 months ago

Have been chatting with @bghira on our Discord forum about the negative effects of not carefully resizing the training data.

@bghira noticed that the way we resize the images in our training examples can introduce unwanted artifacts in the samples generated by the end fine-tuned model. In short, it ruins the fidelity of the generated samples.
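The thread does not say which resizing step causes the artifacts. As an illustration only, one common culprit is squashing a non-square image straight to a square training resolution, which distorts the aspect ratio. The sketch below assumes that scenario and computes the geometry for an aspect-preserving resize of the shorter side followed by a center crop. The function names `plan_resize_and_crop` and `squash_distortion` are hypothetical and are not part of the diffusers training scripts.

```python
def plan_resize_and_crop(width, height, target):
    """Plan an aspect-preserving resize plus center crop (hypothetical helper).

    Returns ((new_w, new_h), (left, top, right, bottom)): the size to resize to
    so that the shorter side equals `target`, and the crop box that cuts a
    centered target x target square out of the resized image.
    """
    if width <= 0 or height <= 0 or target <= 0:
        raise ValueError("width, height, and target must be positive")
    short = min(width, height)
    # Integer arithmetic with round-half-up; never let a side drop below target.
    new_w = max(target, (width * target + short // 2) // short)
    new_h = max(target, (height * target + short // 2) // short)
    left = (new_w - target) // 2
    top = (new_h - target) // 2
    return (new_w, new_h), (left, top, left + target, top + target)


def squash_distortion(width, height):
    """Stretch factor introduced by resizing directly to a square (1.0 = none)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return max(width, height) / min(width, height)


if __name__ == "__main__":
    size, box = plan_resize_and_crop(1024, 768, 512)
    print("resize to:", size, "then crop:", box)
    print("distortion if squashed instead:", squash_distortion(1024, 768))
```

With an image library such as Pillow, the returned size and box could be passed to `Image.resize` (ideally with a high-quality filter such as Lanczos) and `Image.crop`. Whether this addresses the artifacts @bghira observed is an assumption, not something the thread confirms.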

We don't want to make the data input pipelines in our training examples super sophisticated just to guarantee SoTA results. That said, I think it's in the community's best interest to make a note about this phenomenon in the READMEs and add a comment about it in the pipelines.

Curious to know what @patil-suraj and @apolinario think about this. Personally, I think it makes sense.

I will let @bghira share some more insights as well.

github-actions[bot] commented 8 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

sayakpaul commented 8 months ago

Not stale.
