Closed: pinballelectronica closed this 11 months ago
I gave up on making this work on Windows with CUDA. It took hours of struggle for no real benefit, considering I can just run a Docker container or WSL 2. I have no idea why I tried so hard with Windows. Even in a conda environment it was a nightmare: one thing fixed, another thing broke.
So I ran it under WSL 2 and had it working in about 2 minutes. A few notes:
Both Windows and Ubuntu 20.04 (WSL 2) needed onnxruntime, which was not in requirements.txt.
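If you hit the same missing dependency, a quick way to check for it up front without triggering a heavy import (the `has_module` helper is just a sketch, not part of the project):

```python
import importlib.util

def has_module(name: str) -> bool:
    """True if the module is installed, without actually importing it."""
    return importlib.util.find_spec(name) is not None

# onnxruntime was missing from requirements.txt on both platforms
if not has_module("onnxruntime"):
    print("missing: run `pip install onnxruntime`")
```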
FWIW, w/r/t CUDA I'm running 12.1, passed through to WSL via Docker Desktop. I am unable to run this on anything smaller than a 4090 (for my use case); the memory requirement is right around 22 GB, at least for my input, a 675 MB WAV file with the large-v2 model. It is dramatically faster, about 10x, than running on CPU (i9-12900K): 16 threads give me ~100 FPS, whereas on GPU it was blazing fast.
Downgrade to CUDA 11.8; PyTorch is still flaky with CUDA 12.
I use CUDA all day long with a 4090; this seems to be an outlier.
```
File "C:\Users\dave\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\cuda\__init__.py", line 211, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
```
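That `AssertionError` means the installed torch wheel is a CPU-only build, not that the CUDA install or driver is broken. A small guarded check (a sketch; it just prints which build you actually have) can tell the two failure modes apart:

```python
# Distinguish "CPU-only torch wheel" from "driver/CUDA problem".
try:
    import torch
    cuda_build = torch.version.cuda       # None on CPU-only wheels
    available = torch.cuda.is_available() # False if no usable GPU/driver
except ImportError:
    cuda_build, available = None, False

print("torch CUDA build:", cuda_build, "| device available:", available)
# If cuda_build is None, reinstall torch from a CUDA wheel index
# rather than the default PyPI (CPU) wheel.
```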
I have a very fast CPU, so inference was good enough for me not to notice (under 30 minutes with large-v2 for a 1.5-hour video). Alas, my 4090 wants a piece of the action. I cannot for the life of me figure this out.
Python 3.10.10, CUDA 12.1 in PATH, installed everything in requirements.txt (already had most of it).
using --device "cuda:0" (literally, with the quotes)
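For what it's worth, the quotes are harmless (the shell strips them before the program sees the value). The same device string can also be chosen in code with a CPU fallback (a sketch, assuming torch is the backend):

```python
# Choose the device string the --device flag expects, with a CPU fallback.
try:
    import torch
    device = "cuda:0" if torch.cuda.is_available() else "cpu"
except ImportError:  # torch not installed at all
    device = "cpu"
print("using device:", device)
```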
Thanks