voicepaw / so-vits-svc-fork

so-vits-svc fork with realtime support, improved interface and more features.

No way to select which GPU to run on. Runs on none of them. #609

Closed · PlateGlassArmour closed this issue 1 year ago

PlateGlassArmour commented 1 year ago

I'm trying to train a model, and everything else seems to be working, but I haven't found any way to select which of my two GPUs to use, and it refuses to use either one for training.

I run Windows 10 Pro, and I have an RTX 3060 and an RTX 3090 installed.

When I run "nvcc --version" I get this. (base) PS C:\WINDOWS\system32> nvcc --version nvcc: NVIDIA (R) Cuda compiler driver Copyright (c) 2005-2023 NVIDIA Corporation Built on Mon_Apr__3_17:36:15_Pacific_Daylight_Time_2023 Cuda compilation tools, release 12.1, V12.1.105 Build cuda_12.1.r12.1/compiler.32688072_0

And when I run `nvidia-smi` I get this:

```
Wed May 10 15:53:07 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 531.14                 Driver Version: 531.14       CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                      TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3090       WDDM | 00000000:25:00.0 Off |                  N/A |
|  0%   38C    P8                6W / 350W|      0MiB / 24576MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce RTX 3060       WDDM | 00000000:26:00.0  On |                  N/A |
|  0%   52C    P8               17W / 170W|    325MiB / 12288MiB |      3%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
```

But when I try to train my model, it only uses the CPU and system RAM, not either GPU or its VRAM.

Am I just fucking something up? Is there actually a method to select which GPU to use that I'm just missing? I've tried updating just about everything, and nothing seems to help.
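
Both cards show up at the driver level above, so the next thing worth checking is whether the installed PyTorch build can see them at all. A minimal diagnostic sketch, assuming a standard PyTorch install (illustrative only, not a so-vits-svc-fork command):

```python
# Diagnostic sketch: what does the installed torch build report about CUDA?
import torch

print(torch.__version__)          # a version ending in "+cpu" usually means a CPU-only wheel
print(torch.version.cuda)         # None on CPU-only builds
print(torch.cuda.is_available())  # effectively what drives Lightning's "GPU available" line
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i))
```

If `torch.cuda.is_available()` prints `False` while `nvidia-smi` works, the usual culprit is the torch wheel itself rather than the drivers.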

34j commented 1 year ago

Need more info (could you paste the output?)

PlateGlassArmour commented 1 year ago

Sure. Here's the output when I try to train. It seems to work fine, other than not detecting either of my GPUs:

```
Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.

Try the new cross-platform PowerShell https://aka.ms/pscore6

PS C:\Users\Usr> cd C:\Users\Usr\Documents\GitHub\so-vits-svc-fork
PS C:\Users\Usr\Documents\GitHub\so-vits-svc-fork> svc train
[23:01:53] WARNING  [23:01:53] C:\Users\Usr\AppData\Local\Programs\Python\Python310\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension: [WinError 127] The specified procedure could not be found  warnings.py:109
                    warn(f"Failed to load image Python extension: {e}")
[23:02:02] INFO     [23:02:02] Using strategy: auto  train.py:88
INFO: GPU available: False, used: False
           INFO     [23:02:02] GPU available: False, used: False  rank_zero.py:48
INFO: TPU available: False, using: 0 TPU cores
           INFO     [23:02:02] TPU available: False, using: 0 TPU cores  rank_zero.py:48
INFO: IPU available: False, using: 0 IPUs
           INFO     [23:02:02] IPU available: False, using: 0 IPUs  rank_zero.py:48
INFO: HPU available: False, using: 0 HPUs
           INFO     [23:02:02] HPU available: False, using: 0 HPUs  rank_zero.py:48
           WARNING  [23:02:02] C:\Users\Usr\AppData\Local\Programs\Python\Python310\lib\site-packages\so_vits_svc_fork\modules\synthesizers.py:81: UserWarning: Unused arguments: {'n_layers_q': 3, 'use_spectral_norm': False}  warnings.py:109
                    warnings.warn(f"Unused arguments: {kwargs}")
           INFO     [23:02:02] Decoder type: hifi-gan  synthesizers.py:100
[23:02:03] WARNING  [23:02:03] C:\Users\Usr\AppData\Local\Programs\Python\Python310\lib\site-packages\so_vits_svc_fork\utils.py:200: UserWarning: Keys not found in checkpoint state dict: ['emb_g.weight']  warnings.py:109
                    warnings.warn(f"Keys not found in checkpoint state dict:" f"{not_in_from}")
           INFO     [23:02:03] Loaded checkpoint 'logs\44k\G_0.pth' (epoch 0)  utils.py:261
           INFO     [23:02:03] Loaded checkpoint 'logs\44k\D_0.pth' (epoch 0)  utils.py:261
┌───┬───────┬──────────────────────────┬────────┐
│   │ Name  │ Type                     │ Params │
├───┼───────┼──────────────────────────┼────────┤
│ 0 │ net_g │ SynthesizerTrn           │ 45.2 M │
│ 1 │ net_d │ MultiPeriodDiscriminator │ 46.7 M │
└───┴───────┴──────────────────────────┴────────┘
Trainable params: 91.9 M
Non-trainable params: 0
Total params: 91.9 M
Total estimated model params size (MB): 367
[23:02:04] WARNING  [23:02:04] C:\Users\Usr\AppData\Local\Programs\Python\Python310\lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:430: PossibleUserWarning: The dataloader, val_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument (try 12 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.  warnings.py:109
                    rank_zero_warn(
           WARNING  [23:02:04] C:\Users\Usr\AppData\Local\Programs\Python\Python310\lib\site-packages\lightning\pytorch\loops\fit_loop.py:280: PossibleUserWarning: The number of training batches (15) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.  warnings.py:109
                    rank_zero_warn(
           INFO     [23:02:04] Setting current epoch to 0  train.py:300
           INFO     [23:02:04] Setting total batch idx to 0  train.py:316
           INFO     [23:02:04] Setting global step to 0  train.py:306
Epoch 0/9999 ---------------------------------------- 0/15 0:00:00 • -:--:-- 0.00it/s v_num: 0
[23:02:07] WARNING  [23:02:07] C:\Users\Usr\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_utils.py:776: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()  warnings.py:109
                    return self.fget.__get__(instance, owner)()
[the same TypedStorage warning repeats three more times at 23:02:07]
[23:03:00] WARNING  [23:03:00] C:\Users\Usr\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\functional.py:641: UserWarning: stft with return_complex=False is deprecated. In a future pytorch release, stft will return complex tensors for all inputs, and return_complex=False will raise an error. Note: you can still call torch.view_as_real on the complex output to recover the old return format. (Triggered internally at ..\aten\src\ATen\native\SpectralOps.cpp:867.)  warnings.py:109
                    return _VF.stft(input, n_fft, hop_length, win_length, window,  # type: ignore[attr-defined]
Epoch 0/9999 ---------------------------------------- 0/15 0:00:00 • -:--:-- 0.00it/s v_num: 0
```
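
The `GPU available: False` line is Lightning reporting that torch sees no usable CUDA device, which matches the diagnostic above: with working drivers, that typically points at a CPU-only torch build rather than at the trainer. Separately, the generic CUDA way to pin a process to one card is the `CUDA_VISIBLE_DEVICES` environment variable (in PowerShell, `$env:CUDA_VISIBLE_DEVICES = "0"` before running `svc train`). A sketch of the same mechanism from Python; this is a general CUDA convention, not a documented so-vits-svc-fork option:

```python
# Sketch: restrict the process to a single GPU via CUDA_VISIBLE_DEVICES.
# Must be set before CUDA is initialized, i.e. before torch is imported.
import os

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # make indices match nvidia-smi ordering
os.environ["CUDA_VISIBLE_DEVICES"] = "0"        # "0" is the RTX 3090 in the nvidia-smi output above

import torch

print(torch.cuda.is_available())  # True only on a CUDA-enabled torch build
print(torch.cuda.device_count())  # 1 once the mask takes effect
```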

34j commented 1 year ago

Duplicate of #372, #556, etc