ShivamShrirao / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
https://huggingface.co/docs/diffusers
Apache License 2.0

Diffusers and/or torch versions causing issues with dreambooth #167

Open zap95421 opened 1 year ago

zap95421 commented 1 year ago

Describe the bug

Because this is a fork of the Hugging Face diffusers, any updates to their diffusers version will affect this repo. Hugging Face recently updated diffusers to 0.10.2, and with this update xformers is no longer automatically enabled (see the linked release notes).

If you update your diffusers to 0.10.2, you will most likely get one of two errors:

- memory error
- GPU not found

I am looking into how/where to re-enable xformers but I wanted to create this report in case anyone else was experiencing this issue.

Right now, Shivam's code seems to be based on diffusers v0.9.0, whereas installing diffusers from PyPI will install 0.10.2 and thus cause these errors.
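A quick way to check whether the installed diffusers is newer than the version this fork targets (a minimal sketch; `parse_ver` and `newer_than_fork` are hypothetical helpers, not part of diffusers):

```python
# Sketch: compare an installed diffusers version string against the
# fork's pinned version (0.9.0). Helper names are made up for illustration.
def parse_ver(s):
    """Turn '0.10.2' into (0, 10, 2) for tuple comparison."""
    return tuple(int(p) for p in s.split(".")[:3])

def newer_than_fork(installed, fork_version="0.9.0"):
    """True if the installed diffusers is newer than the fork's pinned one."""
    return parse_ver(installed) > parse_ver(fork_version)

if __name__ == "__main__":
    # 0.10.2 is the PyPI release mentioned in this issue
    print(newer_than_fork("0.10.2"))
```

In a real script you would pass `diffusers.__version__` as `installed` and bail out (or warn) when the check returns `True`.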

EDIT: If you run "pip install ." within Shivam's fork, it installs v0.9.0. If you run "pip install ." within the Hugging Face repo, it installs v0.10.2.

As of now, I've resolved the issue by running:

$ conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.6 -c pytorch -c conda-forge

then installed xformers from the dev channel with:

$ conda install xformers -c xformers/label/dev

This is running on Windows 10, inside an Ubuntu environment. The anaconda environment is running Python 3.9.15.
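When juggling versions like this, it helps to dump what actually imports in the active environment. A small sketch (the `probe` helper is made up, not from any of these libraries):

```python
# Sketch: report which of the relevant packages import in this
# environment, and what version they claim. `probe` is a made-up helper.
import importlib

def probe(mod_name):
    """Return the module's version string, 'unknown' if it has no
    __version__, or None if it fails to import at all."""
    try:
        mod = importlib.import_module(mod_name)
    except ImportError:
        return None
    return getattr(mod, "__version__", "unknown")

if __name__ == "__main__":
    for name in ("torch", "torchvision", "diffusers", "xformers"):
        print(f"{name}: {probe(name)}")
```

Running this before and after the conda commands above makes it obvious whether the pinned versions actually took effect.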

Reproduction

No response

Logs

No response

System Info

diffusers versions 0.9.0 vs 0.10.2

linyu0219 commented 1 year ago

pipe.enable_xformers_memory_efficient_attention() may help
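Since that call can raise if xformers is missing or mismatched, and older diffusers pipelines may not expose it at all, one defensive pattern is a best-effort wrapper (a sketch; `try_enable_xformers` is a hypothetical helper, not part of diffusers):

```python
# Sketch: enable xformers attention when available, otherwise fall back
# silently. `try_enable_xformers` is a made-up helper name.
def try_enable_xformers(pipe):
    """Return True if xformers attention was enabled on the pipeline."""
    enable = getattr(pipe, "enable_xformers_memory_efficient_attention", None)
    if enable is None:
        return False  # pipeline (or diffusers version) lacks the method
    try:
        enable()
        return True
    except Exception:
        return False  # xformers not installed or incompatible build
```

Called right after constructing the pipeline, this keeps a script usable across the 0.9.x/0.10.x boundary instead of crashing on one side of it.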

zap95421 commented 1 year ago

After doing some more messing around, it looks like it's possibly a mixture of issues: PyTorch, CUDA versions, and xformers. For right now, to continue using Dreambooth, I'd recommend staying with diffusers v0.9.0.

zap95421 commented 1 year ago

I've now been running into issues training with diffusers 0.9.0 and xformers 0.0.15. I'm using Ubuntu and running Dreambooth in a virtualized environment since it runs a bit more efficiently, but my outputs still don't come out as clean as they once did. The loss stays at a consistent ~0.2 even if I ramp up the steps/learning rate: after 5000 steps at 1e-6, or after 2000 steps at 5e-6, the loss rises initially and then steadies out at 0.2. The resulting models produce worse results compared to older versions. Still trying to find the issue.
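Raw per-step loss is noisy, so "stuck at ~0.2" is easier to judge from a smoothed curve. A sketch of a bias-corrected exponential moving average (the helper name and any sample numbers are illustrative, not from this report):

```python
# Sketch: smooth a loss curve with an exponential moving average,
# with bias correction so early values aren't dragged toward zero.
def smooth(losses, beta=0.9):
    """Return the bias-corrected EMA of a sequence of loss values."""
    avg, out = 0.0, []
    for t, x in enumerate(losses, start=1):
        avg = beta * avg + (1 - beta) * x
        out.append(avg / (1 - beta ** t))
    return out
```

Feeding the per-step training losses through this (e.g. from the script's logging callback) makes a genuine plateau at 0.2 distinguishable from ordinary step-to-step noise.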

zap95421 commented 1 year ago

My specs are as follows:

Torch seems to be one of the issues. I ran this command:

$ conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.6 -c pytorch -c conda-forge

then installed xformers from the dev channel with:

$ conda install xformers -c xformers/label/dev

Now everything seems to be working again.
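For reproducibility, the working combination described in this thread can be captured as a pin file (a sketch assembled from the versions mentioned above, not an official requirements.txt from the repo; note cudatoolkit=11.6 is a conda package, not a pip pin):

```text
# hypothetical pins matching the commands in this thread
diffusers==0.9.0
torch==1.12.1
torchvision==0.13.1
torchaudio==0.12.1
xformers==0.0.15
```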

Camrule commented 1 year ago

Hey @zap95421

So glad it's not just me thinking this! I haven't been able to replicate my model quality from a few months ago and have been scratching my head about it for a while. I started wondering about the diffusers version a few weeks ago, and I'm happy to hear you've had similar issues and may have a solution. It was around late September / early October that I noticed the change in quality. What about you? It must have been diffusers==0.7.2.

Have you grabbed any older commits here and used their requirements.txt files? I might try this myself soon.

zap95421 commented 1 year ago

> Hey @zap95421
>
> So glad it's not just me thinking this! I haven't been able to replicate my model quality from a few months ago and have been scratching my head about it for a while. I started wondering about the diffusers version a few weeks ago, and I'm happy to hear you've had similar issues and may have a solution. It was around late September / early October that I noticed the change in quality. What about you? It must have been diffusers==0.7.2.
>
> Have you grabbed any older commits here and used their requirements.txt files? I might try this myself soon.

So I've had consistent results since the fix I edited into my post. I haven't strayed much after that and have stuck with the current commit, so no need to check out earlier ones. That's why I'm thinking it's something with PyTorch, xformers, or how they're being used. I haven't done a "git pull" since posting my issue, but I fixed the problem on whatever that current commit was.

Now I just wish Shivam would add support for aspect ratio bucketing, because honestly this Dreambooth implementation produces the most pleasing and consistent results for me with a 3080 Ti. I've tried the webui, kona_ss (unsure about spelling), and stable tuner, but this fork is personally the best.