Open LordMilutin opened 1 week ago
It seems that you are using some unofficial version of RVC. Please try this repo first. We will not solve problems that do not come from this repo.
No, I am using this one. I have just named my docker container voice-clone so I can more easily enter it and inspect it. Here is the output:
voice-clone | INFO:Micy:loaded pretrained assets/pretrained_v2/f0G40k.pth
voice-clone | INFO:Micy:<All keys matched successfully>
voice-clone | INFO:Micy:loaded pretrained assets/pretrained_v2/f0D40k.pth
voice-clone | INFO:Micy:<All keys matched successfully>
voice-clone | DEBUG:faiss.loader:Environment variable FAISS_OPT_LEVEL is not set, so let's pick the instruction set according to the current CPU
voice-clone | INFO:faiss.loader:Loading faiss with AVX2 support.
voice-clone | INFO:faiss.loader:Successfully loaded faiss with AVX2 support.
voice-clone | DEBUG:faiss.loader:Environment variable FAISS_OPT_LEVEL is not set, so let's pick the instruction set according to the current CPU
voice-clone | INFO:faiss.loader:Loading faiss with AVX2 support.
voice-clone | INFO:faiss.loader:Successfully loaded faiss with AVX2 support.
voice-clone | DEBUG:faiss.loader:Environment variable FAISS_OPT_LEVEL is not set, so let's pick the instruction set according to the current CPU
voice-clone | INFO:faiss.loader:Loading faiss with AVX2 support.
voice-clone | INFO:faiss.loader:Successfully loaded faiss with AVX2 support.
voice-clone | DEBUG:faiss.loader:Environment variable FAISS_OPT_LEVEL is not set, so let's pick the instruction set according to the current CPU
voice-clone | INFO:faiss.loader:Loading faiss with AVX2 support.
voice-clone | INFO:faiss.loader:Successfully loaded faiss with AVX2 support.
voice-clone | /usr/local/lib/python3.10/dist-packages/torch/autograd/graph.py:744: UserWarning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance.
voice-clone | grad.sizes() = [64, 1, 4], strides() = [4, 1, 1]
voice-clone | bucket_view.sizes() = [64, 1, 4], strides() = [4, 4, 1] (Triggered internally at ../torch/csrc/distributed/c10d/reducer.cpp:325.)
voice-clone | return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
voice-clone | INFO:Micy:Train Epoch: 1 [0%]
voice-clone | INFO:Micy:[0, 0.0001]
voice-clone | INFO:Micy:loss_disc=3.908, loss_gen=2.790, loss_fm=18.868,loss_mel=24.102, loss_kl=9.000
voice-clone | DEBUG:matplotlib:matplotlib data path: /usr/local/lib/python3.10/dist-packages/matplotlib/mpl-data
voice-clone | DEBUG:matplotlib:CONFIGDIR=/root/.config/matplotlib
voice-clone | DEBUG:matplotlib:interactive is False
voice-clone | DEBUG:matplotlib:platform is linux
voice-clone | Process Process-1:
voice-clone | Traceback (most recent call last):
voice-clone | File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
voice-clone | self.run()
voice-clone | File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
voice-clone | self._target(*self._args, **self._kwargs)
voice-clone | File "/app/infer/modules/train/train.py", line 278, in run
voice-clone | train_and_evaluate(
voice-clone | File "/app/infer/modules/train/train.py", line 508, in train_and_evaluate
voice-clone | scaler.scale(loss_gen_all).backward()
voice-clone | File "/usr/local/lib/python3.10/dist-packages/torch/_tensor.py", line 525, in backward
voice-clone | torch.autograd.backward(
voice-clone | File "/usr/local/lib/python3.10/dist-packages/torch/autograd/__init__.py", line 267, in backward
voice-clone | _engine_run_backward(
voice-clone | File "/usr/local/lib/python3.10/dist-packages/torch/autograd/graph.py", line 744, in _engine_run_backward
voice-clone | return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
voice-clone | RuntimeError: CUDA error: out of memory
voice-clone | CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
voice-clone | For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
voice-clone | Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
voice-clone |
voice-clone | /usr/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 4 leaked semaphore objects to clean up at shutdown
voice-clone | warnings.warn('resource_tracker: There appear to be %d '
Here is the GPU usage log. As you can see, usage never goes above 4GB (it peaks at 3243MiB of 8192MiB), which is very weird. I have a Stable Diffusion container that uses 7GB without any problems and it works...
Sat Jun 29 14:10:02 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 66% 48C P8 16W / 130W | 562MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:03 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 66% 48C P8 15W / 130W | 562MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:04 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 66% 48C P8 16W / 130W | 562MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:05 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 66% 48C P8 16W / 130W | 562MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:06 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 66% 49C P2 26W / 130W | 791MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
| 0 N/A N/A 3828 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:07 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 66% 50C P2 33W / 130W | 1357MiB / 8192MiB | 11% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
| 0 N/A N/A 3828 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:08 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 66% 50C P2 37W / 130W | 1885MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
| 0 N/A N/A 3828 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:09 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 66% 50C P2 38W / 130W | 1885MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
| 0 N/A N/A 3828 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:10 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 66% 50C P2 33W / 130W | 1885MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
| 0 N/A N/A 3828 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:11 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 66% 50C P2 32W / 130W | 1885MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
| 0 N/A N/A 3828 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:12 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 66% 48C P5 23W / 130W | 1885MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
| 0 N/A N/A 3828 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:13 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 66% 48C P8 16W / 130W | 1885MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
| 0 N/A N/A 3828 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:14 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 66% 48C P8 16W / 130W | 1885MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
| 0 N/A N/A 3828 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:15 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 66% 48C P8 15W / 130W | 1887MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
| 0 N/A N/A 3828 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:16 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 66% 48C P8 16W / 130W | 1967MiB / 8192MiB | 1% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
| 0 N/A N/A 3828 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:17 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 65% 48C P8 16W / 130W | 2135MiB / 8192MiB | 46% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
| 0 N/A N/A 3828 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:18 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 65% 49C P2 23W / 130W | 2577MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
| 0 N/A N/A 3828 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:19 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 65% 50C P2 35W / 130W | 2759MiB / 8192MiB | 39% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
| 0 N/A N/A 3828 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:20 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 66% 49C P2 34W / 130W | 3053MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
| 0 N/A N/A 3828 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:21 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 66% 48C P3 25W / 130W | 3055MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
| 0 N/A N/A 3828 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:22 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 65% 50C P2 34W / 130W | 3243MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
| 0 N/A N/A 3828 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:23 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 65% 50C P2 38W / 130W | 1903MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
| 0 N/A N/A 3828 C /python3.10 N/A |
+-----------------------------------------------------------------------------------------+
Sat Jun 29 14:10:24 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 On | 00000000:01:00.0 On | N/A |
| 65% 48C P3 31W / 130W | 562MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 35 G /Xwayland N/A |
| 0 N/A N/A 38 G /Xwayland N/A |
| 0 N/A N/A 368 G /Xwayland N/A |
+-----------------------------------------------------------------------------------------+
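One way to check the peak usage across snapshots like the ones above is to pull the `Memory-Usage` column out of each `nvidia-smi` reading. The following is only a minimal sketch: the function name `peak_memory_mib` and the `sample` text are illustrative assumptions, and the regular expression assumes the `<used>MiB / <total>MiB` layout shown in the output above. Note that once-per-second sampling can miss short allocation spikes, so a low observed peak does not rule out a brief out-of-memory condition.

```python
import re

# Matches the "used / total" memory column of an nvidia-smi GPU row,
# e.g. "1903MiB / 8192MiB". Power readings such as "38W / 130W" do not match.
MEMORY_RE = re.compile(r"(\d+)MiB\s*/\s*(\d+)MiB")


def peak_memory_mib(smi_text):
    """Return (peak_used_mib, total_mib) over all snapshots, or None if none found.

    Hypothetical helper for illustration; assumes a single GPU, so the total
    is taken from the first reading.
    """
    readings = [(int(used), int(total)) for used, total in MEMORY_RE.findall(smi_text)]
    if not readings:
        return None
    peak_used = max(used for used, _ in readings)
    total = readings[0][1]
    return peak_used, total


# Two GPU rows copied from the snapshots above.
sample = """
| 65% 50C P2 38W / 130W | 1903MiB / 8192MiB | 0% Default |
| 65% 48C P3 31W / 130W | 562MiB / 8192MiB | 0% Default |
"""

print(peak_memory_mib(sample))
```

Feeding the whole captured log into `peak_memory_mib` gives the highest sampled value, which can then be compared against the 8192MiB total.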
No, I am using this one. I have just named my docker container voice-clone so I can more easily enter it and inspect it. Here is the output:
voice-clone | INFO:Micy:loaded pretrained assets/pretrained_v2/f0G40k.pth
voice-clone | INFO:Micy:<All keys matched successfully>
voice-clone | INFO:Micy:loaded pretrained assets/pretrained_v2/f0D40k.pth
voice-clone | INFO:Micy:<All keys matched successfully>
voice-clone | DEBUG:faiss.loader:Environment variable FAISS_OPT_LEVEL is not set, so let's pick the instruction set according to the current CPU
voice-clone | INFO:faiss.loader:Loading faiss with AVX2 support.
voice-clone | INFO:faiss.loader:Successfully loaded faiss with AVX2 support.
voice-clone | DEBUG:faiss.loader:Environment variable FAISS_OPT_LEVEL is not set, so let's pick the instruction set according to the current CPU
voice-clone | INFO:faiss.loader:Loading faiss with AVX2 support.
voice-clone | INFO:faiss.loader:Successfully loaded faiss with AVX2 support.
voice-clone | DEBUG:faiss.loader:Environment variable FAISS_OPT_LEVEL is not set, so let's pick the instruction set according to the current CPU
voice-clone | INFO:faiss.loader:Loading faiss with AVX2 support.
voice-clone | INFO:faiss.loader:Successfully loaded faiss with AVX2 support.
voice-clone | DEBUG:faiss.loader:Environment variable FAISS_OPT_LEVEL is not set, so let's pick the instruction set according to the current CPU
voice-clone | INFO:faiss.loader:Loading faiss with AVX2 support.
voice-clone | INFO:faiss.loader:Successfully loaded faiss with AVX2 support.
voice-clone | /usr/local/lib/python3.10/dist-packages/torch/autograd/graph.py:744: UserWarning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance.
voice-clone | grad.sizes() = [64, 1, 4], strides() = [4, 1, 1]
voice-clone | bucket_view.sizes() = [64, 1, 4], strides() = [4, 4, 1] (Triggered internally at ../torch/csrc/distributed/c10d/reducer.cpp:325.)
voice-clone |   return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
voice-clone | INFO:Micy:Train Epoch: 1 [0%]
voice-clone | INFO:Micy:[0, 0.0001]
voice-clone | INFO:Micy:loss_disc=3.908, loss_gen=2.790, loss_fm=18.868,loss_mel=24.102, loss_kl=9.000
voice-clone | DEBUG:matplotlib:matplotlib data path: /usr/local/lib/python3.10/dist-packages/matplotlib/mpl-data
voice-clone | DEBUG:matplotlib:CONFIGDIR=/root/.config/matplotlib
voice-clone | DEBUG:matplotlib:interactive is False
voice-clone | DEBUG:matplotlib:platform is linux
voice-clone | Process Process-1:
voice-clone | Traceback (most recent call last):
voice-clone |   File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
voice-clone |     self.run()
voice-clone |   File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
voice-clone |     self._target(*self._args, **self._kwargs)
voice-clone |   File "/app/infer/modules/train/train.py", line 278, in run
voice-clone |     train_and_evaluate(
voice-clone |   File "/app/infer/modules/train/train.py", line 508, in train_and_evaluate
voice-clone |     scaler.scale(loss_gen_all).backward()
voice-clone |   File "/usr/local/lib/python3.10/dist-packages/torch/_tensor.py", line 525, in backward
voice-clone |     torch.autograd.backward(
voice-clone |   File "/usr/local/lib/python3.10/dist-packages/torch/autograd/__init__.py", line 267, in backward
voice-clone |     _engine_run_backward(
voice-clone |   File "/usr/local/lib/python3.10/dist-packages/torch/autograd/graph.py", line 744, in _engine_run_backward
voice-clone |     return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
voice-clone | RuntimeError: CUDA error: out of memory
voice-clone | CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
voice-clone | For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
voice-clone | Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
voice-clone |
voice-clone | /usr/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 4 leaked semaphore objects to clean up at shutdown
voice-clone |   warnings.warn('resource_tracker: There appear to be %d '
It would appear that you need a specific torch version? The torch build is usually chosen based on what your system has available (e.g., if CUDA is available, you get the build with CUDA support). Have you tried manually cloning torch, building the Python wheel, and installing it inside the container? That would give more insight into whether this is a torch bug.
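Before rebuilding torch from source, it may be worth confirming which build is actually installed in the container and what it sees on the GPU. The snippet below is a rough sketch, not a diagnosis: `torch_cuda_report` is a hypothetical helper name, and it only uses `torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()` and `torch.cuda.mem_get_info()`. It falls back gracefully when torch is not importable.

```python
def torch_cuda_report():
    """Collect basic facts about the installed torch build (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment.
        return {"torch_installed": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version torch was built against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        # Free and total device memory in bytes, as reported by the CUDA driver.
        free_bytes, total_bytes = torch.cuda.mem_get_info()
        report["free_mib"] = free_bytes // 2**20
        report["total_mib"] = total_bytes // 2**20
    return report


print(torch_cuda_report())
```

Running this inside the container (for example via `docker exec` into `voice-clone`) shows whether the CUDA build was picked up and how much free memory torch sees before training starts.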
Hello! I am having issues with training the model. Everything is running in a Docker container. Here is the first issue, which occurs when training the index:
When I then try to train the model, I get this output:
However, this looks wrong: my GPU has 8 GB of memory, and while monitoring its usage I never see it go above 4 GB: