Closed · Zyin055 closed this 4 months ago
The same thing is happening to me, but I don't have this "triton" installed. If I download it, will the problem be solved?
Do you have the CUDA Toolkit installed? If not, try installing it: https://developer.nvidia.com/cuda-11.3.0-download-archive?target_os=Windows&target_arch=x86_64&target_version=10&target_type=exe_local I had the same issue and solved it by installing this.
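If you're not sure whether the Toolkit is actually reachable, a quick stdlib-only check like this sketch can tell you whether `nvcc` and `ptxas` are visible on your PATH (the "Cannot find ptxas" error mentioned later in this thread usually means the CUDA `bin` directory isn't on PATH; the function name here is just illustrative):

```python
import shutil


def find_cuda_tools():
    """Report whether the CUDA compiler tools are reachable on PATH.

    Triton needs ptxas at run time, so a missing ptxas here is a
    likely cause of "RuntimeError: Cannot find ptxas".
    """
    tools = {}
    for name in ("nvcc", "ptxas"):
        # shutil.which also resolves nvcc.exe / ptxas.exe on Windows
        path = shutil.which(name)
        tools[name] = path
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
    return tools


if __name__ == "__main__":
    find_cuda_tools()
```

If either tool comes back as not found, adding the Toolkit's `bin` directory to PATH (or reinstalling the Toolkit) is the first thing to try.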
yes i have :/
the same issue here, I think Triton is the culprit?
DONE! It's working and training now!
Same here. You just need to delete the "triton" folders, no need to reinstall everything. PS: CUDA is installed.
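For anyone unsure which folders that means: the cleanest route is `pip uninstall triton`, but a sketch like the following can locate the triton folders inside your environment's site-packages first (dry-run by default; only the `triton` package name comes from this thread, everything else is illustrative):

```python
import shutil
import site
from pathlib import Path


def find_triton_dirs():
    """Locate triton package folders in every site-packages directory."""
    hits = []
    search_paths = site.getsitepackages() + [site.getusersitepackages()]
    for sp in search_paths:
        # the package folder itself plus its pip metadata folder
        for pattern in ("triton", "triton-*.dist-info"):
            hits.extend(Path(sp).glob(pattern))
    return hits


def delete_triton(dry_run=True):
    """Print (and optionally delete) every triton folder found."""
    for folder in find_triton_dirs():
        prefix = "would delete: " if dry_run else "deleting: "
        print(prefix + str(folder))
        if not dry_run:
            shutil.rmtree(folder, ignore_errors=True)


if __name__ == "__main__":
    delete_triton(dry_run=True)  # pass dry_run=False to actually remove
```

Run it once with the default dry run to confirm it only finds triton folders before deleting anything.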
I will add a note about Triton... I added it as an option because some people complained about the error message... but apparently this custom Triton build does not help and actually makes things worse...
Was just struggling with this same problem, though in my case the traceback ended with "RuntimeError: Cannot find ptxas". Deleting the triton folders seems to have fixed it, at least for now.
Edit: Maybe my celebration was premature, now getting "CUDA out of memory" errors, PyTorch is filling up my memory for some reason.
Edit edit: Seems to be working now, possibly fixed by reinstalling CUDA and drivers again
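For the "CUDA out of memory" case, it can help to see what is actually occupying VRAM before blaming the trainer. A small wrapper around `nvidia-smi` (assuming the NVIDIA driver tools are installed; the query fields used are standard `nvidia-smi` options) is one way to check:

```python
import shutil
import subprocess


def gpu_memory_report():
    """Print per-GPU memory usage via nvidia-smi, if it is available."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        print("nvidia-smi not found; is the NVIDIA driver installed?")
        return None
    result = subprocess.run(
        [exe, "--query-gpu=name,memory.used,memory.total",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    print(result.stdout.strip())
    return result.stdout


if __name__ == "__main__":
    gpu_memory_report()
```

If the used figure is high before training even starts, something else (another process, or a leftover model still loaded) is holding the memory.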
> Edit: Maybe my celebration was premature, now getting "CUDA out of memory" errors, PyTorch is filling up my memory for some reason.
This happened to me during testing while trying to figure out the Triton issue; it turned out I had the Dreambooth tab open instead of the LoRA tab.
> Edit: Maybe my celebration was premature, now getting "CUDA out of memory" errors, PyTorch is filling up my memory for some reason.

> This happened to me during testing while trying to figure out the Triton issue, turned out I had the Dreambooth tab open instead of the LoRA tab
Oh shoot, that might have been my problem too; I don't remember if I switched to the LoRA tab after reloading everything else.
When doing a fresh install of v23.0.15 and installing Triton 2.1.0 for Windows via setup wizard step 3, I get an error when trying to train a LoRA, even with a known-working config file, for both SD1.5 and SDXL.
This error only happens after installing Triton 2.1.0 for Windows. It worked fine when only doing steps 1 (install) and 2 (CuDNN files) in the setup wizard. Step 3 (Triton) is what broke it.
- Windows 10
- RTX 3060 12GB
- 48GB RAM
- Python 3.10.9 (had to upgrade from 3.10.6 for this)
PS: what does Triton even do? Is it worth it for me to try to resolve this issue?