d8ahazard / sd_dreambooth_extension


[Bug]: Model quality destroyed after generating a CKPT file #1396

Closed shalevc1098 closed 6 months ago

shalevc1098 commented 7 months ago

Is there an existing issue for this?

What happened?

For every model I've trained with the latest version of the extension and exported as a CKPT file, its quality was destroyed.

Sample image generated before CKPT export: image

Image generated using the exported CKPT file: image

Another image from the same CKPT file, this time of a different person who was already present in the base model: image

Steps to reproduce the problem

Export any trained model as a CKPT file using the latest version of the extension.

Commit and libraries

None

Command Line Arguments

None

Console logs

There isn't any relevant information there.

Additional information

Please fix this issue, because I can't make my models look good without a fix. Thanks :)

Karlotos commented 7 months ago

I am having the exact same issue I think. Any model I make in dreambooth is ruined upon creation. For example, if I try generating classification images on a freshly created Dreambooth model, having done no training or anything else at all, all classification images generated look hugely over-trained and broken - even worse than the examples above.

Would greatly appreciate a fix.

kylesk42 commented 7 months ago

Have you guys noticed any generation speed difference in the current version? The older version was decent on my laptop 2080. Now I have a 3090 and it's fast with kohya, but with this dreambooth it takes seconds per iteration instead of iterations per second.

Karlotos commented 7 months ago

> Have you guys noticed any generation speed difference in the current version? The older version was decent on my laptop 2080. Now I have a 3090 and it's fast with kohya, but with this dreambooth it takes seconds per iteration instead of iterations per second.

I changed my GPU at the same time I updated so I'm afraid I have no idea as I can't directly compare anymore.

shalevc1098 commented 7 months ago

> Have you guys noticed any generation speed difference in the current version? The older version was decent on my laptop 2080. Now I have a 3090 and it's fast with kohya, but with this dreambooth it takes seconds per iteration instead of iterations per second.

For me, SDXL in kohya on a 4090 takes 2-3 seconds per iteration for some reason. Do you know how to speed this up?

d8ahazard commented 7 months ago

I just pushed a new update to main that fixes some mis-named keys when converting from diffusers to safetensors. Please give a try now and let me know the results.
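For anyone who wants to check an exported file for mis-named keys themselves, here is a minimal sketch of the idea: compare the key names in the export against a known-good reference, then rename the offenders. The key names and the `rename_map` below are hypothetical and only illustrate the technique; this is not the extension's actual conversion code. With real files, the key lists could be read with `safetensors.safe_open(path, framework="pt")` and its `keys()` method.

```python
def diff_keys(reference_keys, exported_keys):
    """Return (missing, unexpected) key names, each as a sorted list."""
    ref, exp = set(reference_keys), set(exported_keys)
    return sorted(ref - exp), sorted(exp - ref)


def rename_keys(state_dict, rename_map):
    """Return a new dict where every key found in rename_map is replaced."""
    return {rename_map.get(key, key): value for key, value in state_dict.items()}


# Hypothetical key names, for illustration only.
reference = [
    "model.diffusion_model.out.0.weight",
    "model.diffusion_model.out.0.bias",
]
exported = {
    "model.diffusion_model.out_0.weight": 1,  # mis-named: "_0" instead of ".0"
    "model.diffusion_model.out.0.bias": 2,
}

missing, unexpected = diff_keys(reference, exported)
print("missing:", missing)
print("unexpected:", unexpected)

# Hypothetical fix-up table mapping the wrong name to the expected one.
rename_map = {
    "model.diffusion_model.out_0.weight": "model.diffusion_model.out.0.weight",
}
fixed = rename_keys(exported, rename_map)
print("after rename:", diff_keys(reference, fixed))
```

A key that shows up as both "missing" and "unexpected" under slightly different names is the usual sign of a naming mismatch. If the loader tolerates missing keys, those weights may simply never be loaded, which would be consistent with the degraded output described above.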

Karlotos commented 7 months ago

Thanks for the update. It doesn't seem to have fixed the problem I was having, though, where some models create ruined class images before training even begins. I don't know whether my problem is exactly the same issue as OP's, although it seems very similar. (Disclaimer: I'm not a programmer, just doing my best to learn, so it's possible that the Problem Exists Between Chair And Keyboard.)

Testing is complicated by the fact that I'm also having another major issue: I can't get Dreambooth to work offline. I go online, do a fresh install, get all the dependencies, and test everything (make a model, run training, etc.), but when I then try to replicate the exact same steps offline, I get an error as soon as I try to make a model. It is supposed to work offline once you've installed everything it needs, isn't it?

Extracting config from C:\Users\User1\Desktop\Stable-Diffusion\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\..\configs\v1-training-unfrozen.yaml
Extracting checkpoint from C:\Users\User1\Desktop\Stable-Diffusion\stable-diffusion-webui\models\Stable-diffusion\ExampleSourceModel.safetensors
Something went wrong, removing model directory
Traceback (most recent call last):
  File "C:\Users\User1\Desktop\Stable-Diffusion\stable-diffusion-webui\venv\lib\site-packages\urllib3\connection.py", line 203, in _new_conn
    sock = connection.create_connection(
  File "C:\Users\User1\Desktop\Stable-Diffusion\stable-diffusion-webui\venv\lib\site-packages\urllib3\util\connection.py", line 60, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "C:\Users\User1\AppData\Local\Programs\Python\Python310\lib\socket.py", line 955, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11001] getaddrinfo failed

I've had the same error trying to export a checkpoint from a model I trained. Will try to open a new issue with logs.
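The traceback above ends in `socket.getaddrinfo`, which means something tried to resolve a hostname and could not. A minimal sketch for narrowing this down follows: it checks whether name resolution works at all, and sets the Hugging Face offline switches `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE`. It is an assumption that the failing request goes through the Hugging Face libraries and that the extension honours these variables; the traceback shown here does not confirm either. The `can_resolve` helper is a hypothetical name.

```python
import os
import socket


def can_resolve(host, port=443):
    """Return True if the hostname resolves, False if DNS lookup fails."""
    try:
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        return True
    except socket.gaierror:
        # Same exception type as in the traceback above.
        return False


# Ask the Hugging Face libraries to use only locally cached files.
# Assumption: the failing request is made through these libraries.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"

print("huggingface.co resolvable:", can_resolve("huggingface.co"))
```

These variables generally need to be set before the libraries are imported, so for the web UI they would go into the environment of the launch script rather than a snippet run afterwards.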

github-actions[bot] commented 7 months ago

This issue is stale because it has been open 5 days with no activity. Remove stale label or comment or this will be closed in 5 days