Closed KryptixOne closed 1 year ago
Would you be able to share your edited version of custom_ops.py? I'm having some trouble getting it to work and think I'm missing something.
Hi @nuclearsugar,
Sure, the version I used is here
I should mention that if you stop the script while it is importing the C++ extensions, the build can become corrupted. If that happens, you will need to clear your torch_extensions folder and build the extensions again with GPU=1. Once they are built, follow the instructions in the original post and run again with GPU=2.
Thanks for sharing the edited version of custom_ops.py. That fixed it!
When running this implementation of StyleGAN3 on multiple GPUs, training may stall. This happens because the first GPU (GPU:0) locks access to bias_act.pyd and similar StyleGAN3 custom CUDA kernel plugins, so the other GPUs cannot load them.
To fix this and enable training, you must create standalone duplicate copies of these plugins.
I did this in the following way for 2 GPUs:
In the custom_ops.py file, starting after line 138:
Now navigate to the following directory:
C:\Users\"YourUserName"\AppData\Local\torch_extensions\torch_extensions\Cache\py38_cu116\
Here, you will find the 3 plugin folders that were created during your initial attempt. Create a copy of each folder in the same directory, adding "_2" to the end of each folder name.
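If you would rather not copy the folders by hand, the step above can be scripted. The following is only a sketch: the function name duplicate_plugin_folders and its suffixes parameter are my own, and it assumes that every subfolder of the cache directory is a plugin folder you want duplicated.

```python
import shutil
from pathlib import Path


def duplicate_plugin_folders(cache_dir, suffixes=("_2",)):
    """Copy every plugin folder in cache_dir once per suffix; return the new paths.

    Hypothetical helper, not part of StyleGAN3. Assumes all subfolders of
    cache_dir are plugin folders.
    """
    cache_dir = Path(cache_dir)
    # Snapshot the originals first so that existing copies are not copied again.
    originals = [
        p for p in sorted(cache_dir.iterdir())
        if p.is_dir() and not p.name.endswith(tuple(suffixes))
    ]
    created = []
    for folder in originals:
        for suffix in suffixes:
            target = folder.with_name(folder.name + suffix)
            if target.exists():
                continue  # keep an existing copy rather than overwrite it
            shutil.copytree(folder, target)
            created.append(target)
    return created
```

Call it with the path of the cache directory given above. Passing suffixes=("_2", "_3") would create a further set of copies, but whether custom_ops.py picks those up depends on how you edited it.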
Now, when you run the model with multiple GPUs and a plugin load fails, it will try the secondary plugin that you have just created and will succeed.
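To illustrate the fallback idea only (this is not the actual edit to custom_ops.py, which is linked in the comments), here is a minimal sketch. The name load_with_fallback and the load_fn callback are hypothetical, and it assumes that a locked or corrupted plugin raises an exception on load rather than hanging.

```python
def load_with_fallback(candidates, load_fn):
    """Return (name, module) for the first candidate that load_fn can load.

    Hypothetical helper: candidates is an ordered list of plugin names,
    e.g. the original name followed by its "_2" copy, and load_fn is
    whatever callable loads a plugin by name (for example
    importlib.import_module, if the plugin folder is on sys.path).
    """
    last_error = None
    for name in candidates:
        try:
            return name, load_fn(name)
        except Exception as err:  # assumption: a failed load raises
            last_error = err
    raise RuntimeError(
        f"none of the plugins could be loaded: {list(candidates)}"
    ) from last_error
```

The primary plugin is always tried first, so a single-GPU run behaves as before, and the copy is only touched when the first load fails.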
Feel free to automate this process if you are running with more than 2 GPUs; the same workaround should still apply. Note that the initial setup may take about 1-5 minutes longer than before.