effusiveperiscope / so-vits-svc

Notebook version inexplicably not running #24

Closed mya2152 closed 1 year ago

mya2152 commented 1 year ago

Running on Paperspace Gradient notebooks with CUDA 11.6. All prerequisites installed successfully, all files are present in the expected folders, and samples are present, yet at the final generation step it keeps showing the output below (the weird thing is, it was working just fine last night, and I'm not sure what happened):

```
./logs/44k/G_0.pth
INFO:44k:{'train': {'log_interval': 200, 'eval_interval': 800, 'seed': 1234, 'epochs': 10000, 'learning_rate': 0.0001, 'betas': [0.8, 0.99], 'eps': 1e-09, 'batch_size': 6, 'fp16_run': False, 'lr_decay': 0.999875, 'segment_size': 10240, 'init_lr_ratio': 1, 'warmup_epochs': 0, 'c_mel': 45, 'c_kl': 1.0, 'use_sr': True, 'max_speclen': 512, 'port': '8001', 'keep_ckpts': 3}, 'data': {'training_files': 'filelists/train.txt', 'validation_files': 'filelists/val.txt', 'max_wav_value': 32768.0, 'sampling_rate': 44100, 'filter_length': 2048, 'hop_length': 512, 'win_length': 2048, 'n_mel_channels': 80, 'mel_fmin': 0.0, 'mel_fmax': 22050}, 'model': {'inter_channels': 192, 'hidden_channels': 192, 'filter_channels': 768, 'n_heads': 2, 'n_layers': 6, 'kernel_size': 3, 'p_dropout': 0.1, 'resblock': '1', 'resblock_kernel_sizes': [3, 7, 11], 'resblock_dilation_sizes': [[1, 3, 5], [1, 3, 5], [1, 3, 5]], 'upsample_rates': [8, 8, 2, 2, 2], 'upsample_initial_channel': 512, 'upsample_kernel_sizes': [16, 16, 4, 4, 4], 'n_layers_q': 3, 'use_spectral_norm': False, 'gin_channels': 256, 'ssl_dim': 256, 'n_speakers': 200}, 'spk': {'aimodel': 0}, 'model_dir': './logs/44k', 'reset': False}
DEBUG:tensorflow:Falling back to TensorFlow client; we recommended you install the Cloud TPU client directly with pip install cloud-tpu-client.
DEBUG:h5py._conv:Creating converter from 7 to 5
DEBUG:h5py._conv:Creating converter from 5 to 7
DEBUG:h5py._conv:Creating converter from 7 to 5
DEBUG:h5py._conv:Creating converter from 5 to 7
DEBUG:jaxlib.mlir._mlir_libs:Initializing MLIR with module: _site_initialize_0
DEBUG:jaxlib.mlir._mlir_libs:Registering dialects from initializer <module 'jaxlib.mlir._mlir_libs._site_initialize_0' from '/usr/local/lib/python3.9/dist-packages/jaxlib/mlir/_mlir_libs/_site_initialize_0.so'>
DEBUG:jax._src.path:etils.epath found. Using etils.epath for file I/O.
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.
./logs/44k/G_0.pth
error, emb_g.weight is not in the checkpoint
INFO:44k:emb_g.weight is not in the checkpoint
INFO:44k:Loaded checkpoint './logs/44k/G_0.pth' (iteration 0)
./logs/44k/D_0.pth
INFO:44k:Loaded checkpoint './logs/44k/D_0.pth' (iteration 0)
Process SpawnProcess-1:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/notebooks/so-vits-svc/train.py", line 135, in run
    train_and_evaluate(rank, epoch, hps, [net_g, net_d], [optim_g, optim_d], [scheduler_g, scheduler_d], scaler,
  File "/notebooks/so-vits-svc/train.py", line 157, in train_and_evaluate
    for batch_idx, items in enumerate(train_loader):
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/dataloader.py", line 681, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/dataloader.py", line 1376, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/dataloader.py", line 1402, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.9/dist-packages/torch/_utils.py", line 461, in reraise
    raise exception
AttributeError: Caught AttributeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/notebooks/so-vits-svc/data_utils.py", line 88, in __getitem__
    return self.get_audio(self.audiopaths[index][0])
  File "/notebooks/so-vits-svc/data_utils.py", line 60, in get_audio
    spk = torch.LongTensor([self.spk_map[spk]])
  File "/notebooks/so-vits-svc/sovits_utils.py", line 582, in __getitem__
    return getattr(self, key)
AttributeError: 'HParams' object has no attribute 'johnvocals1'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.9/dist-packages/torch/multiprocessing/spawn.py", line 76, in _wrap
    sys.exit(1)
SystemExit: 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/multiprocessing/process.py", line 318, in _bootstrap
    util._exit_function()
  File "/usr/lib/python3.9/multiprocessing/util.py", line 357, in _exit_function
    p.join()
  File "/usr/lib/python3.9/multiprocessing/process.py", line 149, in join
    res = self._popen.wait(timeout)
  File "/usr/lib/python3.9/multiprocessing/popen_fork.py", line 43, in wait
    return self.poll(os.WNOHANG if timeout == 0.0 else 0)
  File "/usr/lib/python3.9/multiprocessing/popen_fork.py", line 27, in poll
    pid, sts = os.waitpid(self.pid, flag)
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 482) is killed by signal: Terminated.

Traceback (most recent call last):
  File "/notebooks/so-vits-svc/train.py", line 326, in <module>
    main()
  File "/notebooks/so-vits-svc/train.py", line 62, in main
    mp.spawn(run, nprocs=n_gpus, args=(n_gpus, hps,))
  File "/usr/local/lib/python3.9/dist-packages/torch/multiprocessing/spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/usr/local/lib/python3.9/dist-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
    while not context.join():
  File "/usr/local/lib/python3.9/dist-packages/torch/multiprocessing/spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/notebooks/so-vits-svc/train.py", line 135, in run
    train_and_evaluate(rank, epoch, hps, [net_g, net_d], [optim_g, optim_d], [scheduler_g, scheduler_d], scaler,
  File "/notebooks/so-vits-svc/train.py", line 157, in train_and_evaluate
    for batch_idx, items in enumerate(train_loader):
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/dataloader.py", line 681, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/dataloader.py", line 1376, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/dataloader.py", line 1402, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.9/dist-packages/torch/_utils.py", line 461, in reraise
    raise exception
AttributeError: Caught AttributeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/notebooks/so-vits-svc/data_utils.py", line 88, in __getitem__
    return self.get_audio(self.audiopaths[index][0])
  File "/notebooks/so-vits-svc/data_utils.py", line 60, in get_audio
    spk = torch.LongTensor([self.spk_map[spk]])
  File "/notebooks/so-vits-svc/sovits_utils.py", line 582, in __getitem__
    return getattr(self, key)
AttributeError: 'HParams' object has no attribute 'johnvocals1'
```
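The final `AttributeError` is the actual failure; everything after it is just worker shutdown. The speaker map printed at startup (`'spk': {'aimodel': 0}`) registers only a speaker named `aimodel`, but the preprocessed filelists reference audio under a speaker named `johnvocals1`, so the per-item speaker-ID lookup in `data_utils.py` fails inside the DataLoader worker. A minimal, hypothetical sketch of the failing lookup (mirroring the `HParams.__getitem__` fallback to `getattr` visible in the traceback, not the repo's exact code):

```python
# Hypothetical minimal reproduction of the AttributeError above.
# HParams.__getitem__ delegates to getattr, so looking up a speaker
# name that is missing from the config's 'spk' map raises AttributeError.
class HParams:
    def __init__(self, **kwargs):
        for k, v in kwargs.items():
            setattr(self, k, v)

    def __getitem__(self, key):
        # Mirrors sovits_utils.py:582 in the traceback.
        return getattr(self, key)

spk_map = HParams(aimodel=0)  # the config dump shows 'spk': {'aimodel': 0}

print(spk_map["aimodel"])  # 0
try:
    print(spk_map["johnvocals1"])
except AttributeError as e:
    print(e)  # 'HParams' object has no attribute 'johnvocals1'
```

If that reading is right, regenerating `config.json` (or editing its `spk` map) so it matches the speaker folders used during preprocessing should clear the error.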

effusiveperiscope commented 1 year ago

You appear to be loading a checkpoint labeled with zero steps (G_0.pth)?
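For context on the earlier `error, emb_g.weight is not in the checkpoint` line: checkpoint loaders in this family of repos typically copy each matching key from the saved state dict and fall back to the model's freshly initialized tensor for anything missing, emitting exactly that warning. A hedged sketch of that pattern follows; the checkpoint layout `{'model': ..., 'iteration': ...}` and the function name are assumptions, not necessarily this repo's exact `load_checkpoint`:

```python
import torch

def load_checkpoint_tolerant(path: str, model: torch.nn.Module) -> int:
    """Sketch of a tolerant loader: missing keys keep their freshly
    initialized values instead of aborting the run.
    Assumed checkpoint layout: {'model': ..., 'iteration': ...}."""
    checkpoint = torch.load(path, map_location="cpu")
    saved_state = checkpoint["model"]
    new_state = {}
    for key, value in model.state_dict().items():
        if key in saved_state:
            new_state[key] = saved_state[key]
        else:
            # Produces log lines like "error, emb_g.weight is not in the checkpoint"
            print(f"error, {key} is not in the checkpoint")
            new_state[key] = value  # keep the initialized weight
    model.load_state_dict(new_state)
    return checkpoint.get("iteration", 0)
```

That tolerance would explain why the run still reports `Loaded checkpoint './logs/44k/G_0.pth' (iteration 0)` despite the missing key, and why training proceeds far enough to hit the speaker-map error instead of failing at load time.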