Closed PlateGlassArmour closed 1 year ago
Need more info (could you paste the output?)
Sure. Here's the output when I try to train. It seems to work just fine, other than not detecting my GPU(s)
```text
Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.

Try the new cross-platform PowerShell https://aka.ms/pscore6

PS C:\Users\Usr> cd C:\Users\Usr\Documents\GitHub\so-vits-svc-fork
PS C:\Users\Usr\Documents\GitHub\so-vits-svc-fork> svc train
[23:01:53] WARNING  C:\Users\Usr\AppData\Local\Programs\Python\Python310\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension: [WinError 127] The specified procedure could not be found  warnings.py:109
                      warn(f"Failed to load image Python extension: {e}")
[23:02:02] INFO     Using strategy: auto                                   train.py:88
           INFO     GPU available: False, used: False                      rank_zero.py:48
           INFO     TPU available: False, using: 0 TPU cores               rank_zero.py:48
           INFO     IPU available: False, using: 0 IPUs                    rank_zero.py:48
           INFO     HPU available: False, using: 0 HPUs                    rank_zero.py:48
           WARNING  C:\Users\Usr\AppData\Local\Programs\Python\Python310\lib\site-packages\so_vits_svc_fork\modules\synthesizers.py:81: UserWarning: Unused arguments: {'n_layers_q': 3, 'use_spectral_norm': False}  warnings.py:109
                      warnings.warn(f"Unused arguments: {kwargs}")
           INFO     Decoder type: hifi-gan                                 synthesizers.py:100
[23:02:03] WARNING  C:\Users\Usr\AppData\Local\Programs\Python\Python310\lib\site-packages\so_vits_svc_fork\utils.py:200: UserWarning: Keys not found in checkpoint state dict: ['emb_g.weight']  warnings.py:109
                      warnings.warn(f"Keys not found in checkpoint state dict:" f"{not_in_from}")
           INFO     Loaded checkpoint 'logs\44k\G_0.pth' (epoch 0)         utils.py:261
           INFO     Loaded checkpoint 'logs\44k\D_0.pth' (epoch 0)         utils.py:261
┌───┬───────┬──────────────────────────┬────────┐
│   │ Name  │ Type                     │ Params │
├───┼───────┼──────────────────────────┼────────┤
│ 0 │ net_g │ SynthesizerTrn           │ 45.2 M │
│ 1 │ net_d │ MultiPeriodDiscriminator │ 46.7 M │
└───┴───────┴──────────────────────────┴────────┘
Trainable params: 91.9 M
Non-trainable params: 0
Total params: 91.9 M
Total estimated model params size (MB): 367
[23:02:04] WARNING  C:\Users\Usr\AppData\Local\Programs\Python\Python310\lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:430: PossibleUserWarning: The dataloader, val_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument (try 12 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.  warnings.py:109
                      rank_zero_warn(
           WARNING  C:\Users\Usr\AppData\Local\Programs\Python\Python310\lib\site-packages\lightning\pytorch\loops\fit_loop.py:280: PossibleUserWarning: The number of training batches (15) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.  warnings.py:109
                      rank_zero_warn(
           INFO     Setting current epoch to 0                             train.py:300
           INFO     Setting total batch idx to 0                           train.py:316
           INFO     Setting global step to 0                               train.py:306
Epoch 0/9999 ---------------------------------------- 0/15 0:00:00 • -:--:-- 0.00it/s v_num: 0
[23:02:07] WARNING  C:\Users\Usr\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_utils.py:776: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()  warnings.py:109
                      return self.fget.__get__(instance, owner)()
[23:03:00] WARNING  C:\Users\Usr\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\functional.py:641: UserWarning: stft with return_complex=False is deprecated. In a future pytorch release, stft will return complex tensors for all inputs, and return_complex=False will raise an error. Note: you can still call torch.view_as_real on the complex output to recover the old return format. (Triggered internally at ..\aten\src\ATen\native\SpectralOps.cpp:867.)  warnings.py:109
                      return _VF.stft(input, n_fft, hop_length, win_length, window,  # type: ignore[attr-defined]
Epoch 0/9999 ---------------------------------------- 0/15 0:00:00 • -:--:-- 0.00it/s v_num: 0
```
Duplicate of #372, #556, etc.
I'm trying to train a model, and everything else seems to be working, but I haven't found any way to select which of my two GPUs to use, and it refuses to use either one for training.
I'm running Windows 10 Pro with an RTX 3060 and an RTX 3090 installed.
When I run `nvcc --version` I get this:
```text
(base) PS C:\WINDOWS\system32> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:36:15_Pacific_Daylight_Time_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0
```
And when I run `nvidia-smi` I get this:
```text
Wed May 10 15:53:07 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 531.14                 Driver Version: 531.14       CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                      TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3090       WDDM | 00000000:25:00.0 Off |                  N/A |
|  0%   38C    P8                6W / 350W|      0MiB / 24576MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce RTX 3060       WDDM | 00000000:26:00.0  On |                  N/A |
|  0%   52C    P8               17W / 170W|    325MiB / 12288MiB |      3%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
```
But when I try to train my model, it only uses CPU/RAM, not my GPU/VRAM (from either GPU).
Am I just messing something up? Is there actually a way to select which GPU to use that I'm missing? I've tried updating just about everything, and nothing seems to help.
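On the GPU-selection half of the question: once a CUDA-enabled torch build is detected at all, the usual way to pin training to one card is `CUDA_VISIBLE_DEVICES`, which must be set before anything initializes CUDA. A sketch, assuming the device indices match the `nvidia-smi` output above (0 = RTX 3090, 1 = RTX 3060):

```python
import os

# Must run before any code touches torch.cuda. Exposes only physical GPU 0
# (the RTX 3090 in the nvidia-smi listing above); torch then sees that single
# card as cuda:0.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# The same thing works from PowerShell before launching training:
#   $env:CUDA_VISIBLE_DEVICES = "0"
#   svc train
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

Setting it in the shell before `svc train` is the more practical route here, since the training entry point imports torch before any user code runs.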