kohya-ss / sd-scripts

Apache License 2.0

Multi GPU on windows #1252

Open MazrimCoding opened 2 months ago

MazrimCoding commented 2 months ago

Has anyone successfully gotten this running?

No combination of accelerator settings, or even changing the script to use the gloo backend, has let me run the script to completion.
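For context on the gloo change: PyTorch builds for Windows do not ship the NCCL backend, so gloo is the usual fallback there. The snippet below is only a minimal sketch of that selection logic; `choose_backend` is a hypothetical helper (not part of sd-scripts or accelerate), and its result would typically be passed on as `torch.distributed.init_process_group(backend=...)`.

```python
import platform


def choose_backend(system=None, cuda_available=True):
    """Return a torch.distributed backend name for the given OS.

    Hypothetical helper for illustration; not part of sd-scripts.
    """
    system = system or platform.system()
    # NCCL is not available in PyTorch builds for Windows, and it needs CUDA,
    # so fall back to gloo in either case.
    if system == "Windows" or not cuda_available:
        return "gloo"
    return "nccl"


if __name__ == "__main__":
    print(choose_backend())
```

Picking the backend this way only gets the process group initialized; it does not by itself explain the "negative dimension" error described below.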

I did get to the point where both GPUs were loaded, but the script then hit the error "RuntimeError: Trying to create tensor with negative dimension", which I was not able to troubleshoot further.

I am wondering whether it is simply not compatible for now, or even sensible, as I have heard that multi-GPU image training setups have issues with the seed not being shared properly between the GPUs.

BootsofLagrangian commented 2 months ago

Can you share your environment? Number of GPUs, accelerate configuration, installed Python libraries, training configuration, the command or script you use to run sd-scripts, etc.

If you provide as much detail as possible, you will be able to get a clearer solution.
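One way to gather part of that information is a small script like the one below. It is a rough sketch using only the standard library; the package list is an assumption about what is relevant here, and `collect_environment` is a hypothetical helper name. GPU count and the accelerate configuration would still need to be added by hand.

```python
import platform
import sys
from importlib import metadata

# Assumed list of relevant packages; adjust to match your setup.
PACKAGES = ("torch", "torchvision", "accelerate", "xformers", "bitsandbytes")


def collect_environment(packages=PACKAGES):
    """Return Python/OS details and installed versions of the given packages."""
    info = {
        "python": sys.version.split()[0],
        "os": platform.platform(),
    }
    for name in packages:
        try:
            # Reads the installed distribution's metadata without importing it.
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Run it inside the same virtual environment used for training so the reported versions match what the script actually loads.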

kohya-ss commented 2 months ago

According to this issue, PyTorch 2.2.1 seems to work with train_network.py. https://github.com/pytorch/pytorch/issues/116056

If 2.2.1 doesn't work, please share more details.
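To confirm which PyTorch version is actually installed before retrying, a check along these lines may help. This is a sketch only: `version_tuple` and `is_at_least` are hypothetical helpers, and treating 2.2.1 as a minimum is an assumption, since the linked issue only reports that 2.2.1 seems to work.

```python
import re


def version_tuple(version):
    """Parse a string like '2.2.1+cu121' into (2, 2, 1), ignoring suffixes."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component (e.g. '2.2') is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def is_at_least(version, minimum=(2, 2, 1)):
    """Return True if the parsed version is >= minimum (assumed threshold)."""
    return version_tuple(version) >= minimum
```

With PyTorch installed, this could be called as `is_at_least(torch.__version__)`.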