Open zap95421 opened 1 year ago
pipe.enable_xformers_memory_efficient_attention() may help.
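For anyone trying this suggestion, here is a minimal sketch of calling that method defensively. The helper name `try_enable_xformers` is made up for illustration; only `enable_xformers_memory_efficient_attention()` comes from diffusers, and whether the object you pass (a pipeline, or the UNet in the training script) exposes it depends on your diffusers version, so treat that as an assumption.

```python
def try_enable_xformers(model) -> bool:
    """Try to switch on xformers attention; return True only if the call succeeded.

    `model` is assumed to be a diffusers pipeline (or a model such as the UNet)
    that may or may not expose enable_xformers_memory_efficient_attention().
    """
    enable = getattr(model, "enable_xformers_memory_efficient_attention", None)
    if enable is None:
        # This object / diffusers version does not provide the method.
        return False
    try:
        enable()
    except Exception as exc:
        # Typical causes: xformers not installed, or built against a
        # different PyTorch/CUDA combination than the one in use.
        print(f"Could not enable xformers: {exc}")
        return False
    return True
```

Usage would be something like `try_enable_xformers(pipe)` right after the pipeline is created; if it returns False, attention falls back to the default implementation, which uses more memory.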
After doing some more messing around, it looks like it's possibly a mixture of issues: PyTorch, CUDA versions, and xformers. For right now, to continue using Dreambooth, I'd recommend staying with Diffusers v0.9.0.
I've been running into issues training now with diffusers 0.9.0 and xformers 0.0.15. I'm using Ubuntu and running Dreambooth in a virtualized environment since it runs a bit more efficiently, but my results still don't come out as clean as they once did. The loss sits at a consistent ~0.2 even if I ramp up the steps or the learning rate. After 5000 steps at 1e-6 or after 2000 steps at 5e-6, the loss goes up initially and then steadies out at 0.2. The resulting models produce worse results compared to older versions. Still trying to find out the issue.
My specs are as follows:
Torch seems to be one of the issues. I ran this command:
$ conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.6 -c pytorch -c conda-forge
then installed xformers using:
$ conda install xformers -c xformers/label/dev
Now everything seems to be working again.
Hey @zap95421
So glad it is not just me thinking this! I've not been able to replicate my model quality from a few months ago and have been scratching my head about it for a while. I started wondering about the diffusers version a few weeks ago, and I'm happy to hear you have had similar issues and may have a solution. It was around late September or early October that I noticed the change in quality. What about you? Must have been diffusers 0.7.2.
Have you grabbed any older commits here and used their requirements.txt files? Might try this myself soon.
So I've had consistent results after applying the fix I edited into my post. I haven't really strayed too much since and have stuck with the current commit, so there's no need to check out earlier ones. That's why I'm thinking it's something with PyTorch, xformers, or how it's being implemented. I haven't done a "git pull" since posting my issue, but I fixed the problem with whatever the current commit was at that time.
Now I just wish Shivam would add support for ratio bucketing, because honestly this dreambooth produces the most pleasing and consistent results for me with a 3080ti. I've tried the webui, kona_ss (unsure about spelling), and stable tuner. But this fork is, for me personally, the best.
Describe the bug
Because this is a fork of Hugging Face diffusers, any updates to their diffusers version will affect this repo. Hugging Face recently updated diffusers to 0.10.2. With this update, xformers is no longer automatically enabled (see link here).
If you update your diffusers to 0.10.2, you will most likely get one of two errors:
- memory error
- GPU not found
I am looking into how/where to re-enable xformers but I wanted to create this report in case anyone else was experiencing this issue.
Right now, Shivam's code seems to be based on diffusers v0.9.0 whereas installing diffusers will install 0.10.2 and thus cause errors.
EDIT: If you "pip install ." within Shivam's fork, it installs v0.9.0. If you "pip install ." within the Hugging Face repo, it installs v0.10.2.
As of now, I've resolved the issue by doing the following:
$ conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.6 -c pytorch -c conda-forge
then installed xformers using:
$ conda install xformers -c xformers/label/dev
This is running in Ubuntu on Windows 10. The Anaconda environment is running Python 3.9.15.
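Since the behaviour depends on which versions actually ended up in the environment, a quick check like the following may help when comparing setups. It is only a sketch: the function name `installed_versions` is made up here, and it assumes the packages are registered under the distribution names `diffusers`, `torch`, and `xformers`, which may not hold for every install method.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("diffusers", "torch", "xformers")):
    """Return {package name: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # No metadata found for this distribution name in the environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this inside the activated conda environment shows at a glance whether diffusers is at 0.9.0 or 0.10.2 and whether xformers is present at all.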
Reproduction
No response
Logs
No response
System Info
diffusers versions 0.9.0 vs 0.10.2