Nerogar / OneTrainer

OneTrainer is a one-stop solution for all your stable diffusion training needs.
GNU Affero General Public License v3.0
[Bug]: Black sample images #305

Closed ntrouve-onera closed 1 month ago

ntrouve-onera commented 1 month ago

What happened?

I'm running a base SDXL fine-tune on a 3090. Training appears to be going fine and decently fast (3-4 it/s), but the samples come out black and I get this error when sampling happens: "Pipelines loaded with dtype=torch.float16 cannot run with cpu device. It is not recommended to move them to cpu as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support for float16 operations on this device in PyTorch. Please, remove the torch_dtype=torch.float16 argument, or use another device for inference."

I assume it's trying to perform the sampling on the CPU? I could not find how to change that in the options.

What did you expect would happen?

I expected to see sample images, but I get fully black images instead.

Relevant log output

No response

Output of pip freeze

No response

mx commented 1 month ago

The CPU warning is superfluous; it appears because the models are temporarily stashed on the CPU. I suspect you are getting black samples because of a settings error that causes NaNs. Without further details on your configuration, we can't debug this.
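
For anyone trying to narrow this down, here is a minimal sketch of the kind of check that would confirm whether NaNs are involved. It is not OneTrainer code: `count_non_finite` and the `latents` list are hypothetical stand-ins, and plain Python floats are used so that it runs without PyTorch. On a real tensor, `torch.isfinite(t).all()` would answer the same question.

```python
import math


def count_non_finite(values):
    """Count NaN and infinite entries in a flat iterable of floats."""
    nan_count = 0
    inf_count = 0
    for v in values:
        if math.isnan(v):
            nan_count += 1
        elif math.isinf(v):
            inf_count += 1
    return {"nan": nan_count, "inf": inf_count}


# Hypothetical stand-in for a flattened sample; real code would inspect a tensor.
latents = [0.12, -0.53, float("nan"), 0.98, float("inf")]
report = count_non_finite(latents)

# A single NaN poisons any reduction that touches it.
mean = sum(latents[:4]) / 4

print(report)
print(math.isnan(mean))
```

If a check like this reports non-finite values, the problem lies in the training or sampling settings rather than in the CPU warning.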