Help needed: error after running the training command

zypjeff commented 1 year ago

RuntimeError: expand(torch.FloatTensor{[2, 1025, 475]}, size=[2, 1025]): the number of sizes provided (2) must be greater or equal to the number of dimensions in the tensor (3)

How should I handle this? I then corrected the parameters and trained again, and it now shows the following error:

G:\Bert-VITS2-Integration-Package>%PYTHON% train_ms.py -c ./configs\config.json
INFO:OUTPUT_MODEL:{'train': {'log_interval': 10, 'eval_interval': 100, 'seed': 52, 'epochs': 1000, 'learning_rate': 0.0002, 'betas': [0.8, 0.99], 'eps': 1e-09, 'batch_size': 6, 'fp16_run': False, 'lr_decay': 0.999875, 'segment_size': 8384, 'init_lr_ratio': 1, 'warmup_epochs': 0, 'c_mel': 45, 'c_kl': 1.0}, 'data': {'use_mel_posterior_encoder': False, 'training_files': 'filelists/train.list', 'validation_files': 'filelists/val.list', 'max_wav_value': 32768.0, 'sampling_rate': 44100, 'filter_length': 2048, 'hop_length': 512, 'win_length': 2048, 'n_mel_channels': 128, 'mel_fmin': 0.0, 'mel_fmax': None, 'add_blank': True, 'n_speakers': 1, 'cleaned_text': True, 'spk2id': {'jeff': 0}}, 'model': {'use_spk_conditioned_encoder': True, 'use_noise_scaled_mas': True, 'use_mel_posterior_encoder': False, 'use_duration_discriminator': True, 'inter_channels': 192, 'hidden_channels': 192, 'filter_channels': 768, 'n_heads': 2, 'n_layers': 6, 'kernel_size': 3, 'p_dropout': 0.1, 'resblock': '1', 'resblock_kernel_sizes': [3, 7, 11], 'resblock_dilation_sizes': [[1, 3, 5], [1, 3, 5], [1, 3, 5]], 'upsample_rates': [8, 8, 2, 2, 2], 'upsample_initial_channel': 512, 'upsample_kernel_sizes': [16, 16, 8, 2, 2], 'n_layers_q': 3, 'use_spectral_norm': False, 'gin_channels': 256}, 'model_dir': './logs\./OUTPUT_MODEL', 'cont': False}
WARNING:OUTPUT_MODEL:G:\Bert-VITS2-Integration-Package is not a git repository, therefore hash value comparison will be ignored.
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.
skipped: 7 , total: 852
skipped: 0 , total: 4
Using noise scaled MAS for VITS2
Using duration discriminator for VITS2
256 2 256 2 256 2 256 2 256 2
./logs./OUTPUT_MODEL\DUR_0.pth
error, norm_1.gamma is not in the checkpoint
error, norm_1.beta is not in the checkpoint
error, norm_2.gamma is not in the checkpoint
error, norm_2.beta is not in the checkpoint
error, cond.weight is not in the checkpoint
error, cond.bias is not in the checkpoint
load
INFO:OUTPUT_MODEL:Loaded checkpoint './logs./OUTPUT_MODEL\DUR_0.pth' (iteration 694)
./logs./OUTPUT_MODEL\G_0.pth
error, emb_g.weight is not in the checkpoint
load
INFO:OUTPUT_MODEL:Loaded checkpoint './logs./OUTPUT_MODEL\G_0.pth' (iteration 0)
./logs./OUTPUT_MODEL\D_0.pth
load
INFO:OUTPUT_MODEL:Loaded checkpoint './logs./OUTPUT_MODEL\D_0.pth' (iteration 0)
0it [00:00, ?it/s]G:\Bert-VITS2-Integration-Package\mel_processing.py:78: FutureWarning: Pass sr=44100, n_fft=2048, n_mels=128, fmin=0.0, fmax=None as keyword args. From version 0.10 passing these as positional arguments will result in an error
  mel = librosa_mel_fn(sampling_rate, n_fft, num_mels, fmin, fmax)
[W C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\distributed\c10d\reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator ())
0it [00:04, ?it/s]
Traceback (most recent call last):
  File "train_ms.py", line 402, in <module>
    main()
  File "train_ms.py", line 60, in main
    mp.spawn(run, nprocs=n_gpus, args=(n_gpus, hps,))
  File "G:\Bert-VITS2-Integration-Package\venv\lib\site-packages\torch\multiprocessing\spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "G:\Bert-VITS2-Integration-Package\venv\lib\site-packages\torch\multiprocessing\spawn.py", line 198, in start_processes
    while not context.join():
  File "G:\Bert-VITS2-Integration-Package\venv\lib\site-packages\torch\multiprocessing\spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "G:\Bert-VITS2-Integration-Package\venv\lib\site-packages\torch\multiprocessing\spawn.py", line 69, in _wrap
    fn(i, *args)
  File "G:\Bert-VITS2-Integration-Package\train_ms.py", line 193, in run
    train_and_evaluate(rank, epoch, hps, [net_g, net_d, net_dur_disc], [optim_g, optim_d, optim_dur_disc], [scheduler_g, scheduler_d, scheduler_dur_disc], scaler, [train_loader, eval_loader], logger, [writer, writer_eval])
  File "G:\Bert-VITS2-Integration-Package\train_ms.py", line 286, in train_and_evaluate
    loss_fm = feature_loss(fmap_r, fmap_g)
  File "G:\Bert-VITS2-Integration-Package\losses.py", line 13, in feature_loss
    loss += torch.mean(torch.abs(rl - gl))
RuntimeError: The size of tensor a (8384) must match the size of tensor b (8192) at non-singleton dimension 2

YYuX-1145 commented 1 year ago

Is there a problem with the dataset? Did you resample it?
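One way to check the dataset before training is to scan it for files that are not mono or not at the configured sampling rate. This is a minimal sketch, not part of the package: `find_unexpected_wavs` is a hypothetical helper, and the link between the `expand(...{[2, 1025, 475]}...)` error and two-channel audio is an assumption that nobody in this thread confirms. It uses only the standard-library `wave` module, so it reads plain PCM WAV files and reports anything else as unreadable.

```python
import wave
from pathlib import Path


def find_unexpected_wavs(root, expected_rate=44100, expected_channels=1):
    """Return (path, channels, rate) for each WAV under root that deviates.

    Files the wave module cannot parse (for example non-PCM WAVs) are
    reported with channels and rate set to None.
    """
    problems = []
    for path in sorted(Path(root).rglob("*.wav")):
        try:
            with wave.open(str(path), "rb") as wav:
                channels = wav.getnchannels()
                rate = wav.getframerate()
        except wave.Error:
            problems.append((str(path), None, None))
            continue
        if channels != expected_channels or rate != expected_rate:
            problems.append((str(path), channels, rate))
    return problems
```

Calling `find_unexpected_wavs("path/to/dataset")` (the path is a placeholder) returns an empty list when every file is mono at 44100 Hz, the `sampling_rate` used in the config posted above.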

zypjeff commented 1 year ago

> Is there a problem with the dataset? Did you resample it?

I resampled several times. I still get the same message.

zypjeff commented 1 year ago

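> Is there a problem with the dataset? Did you resample it?

I changed the parameters in the config: "eps": 1e-09, "batch_size": 6, "fp16_run": false, "lr_decay": 0.999875, "segment_size": 8192, "init_lr_ratio": 1, "warmup_epochs": 0, "c_mel": 45, "c_kl": 1.0 }, After setting "segment_size": 8192 so that it matches the other size in the error, training seems to start. Could anyone explain why the two sizes were inconsistent?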
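A likely explanation for the 8384 vs 8192 mismatch, sketched under the assumption that `train_ms.py` follows the usual VITS slicing pattern: the generator receives `segment_size // hop_length` latent frames and upsamples them by the product of `upsample_rates`, while the real audio slice is exactly `segment_size` samples long. With `hop_length` 512, a `segment_size` of 8384 is not a multiple of the hop, so the two lengths differ. `check_segment_size` below is a hypothetical helper that only reproduces this arithmetic with the values from the posted config.

```python
from math import prod


def check_segment_size(segment_size, hop_length, upsample_rates):
    """Compare the real slice length with the assumed generator output length."""
    frames = segment_size // hop_length        # latent frames fed to the generator
    generated = frames * prod(upsample_rates)  # samples after upsampling
    return frames, generated, generated == segment_size


for segment_size in (8384, 8192):
    frames, generated, ok = check_segment_size(segment_size, 512, [8, 8, 2, 2, 2])
    print(f"segment_size={segment_size}: {frames} frames, "
          f"{generated} generated samples, match={ok}")
```

Under this assumption, 8384 gives 16 frames and 8192 generated samples, which are the two sizes named in the RuntimeError, while 8192 gives matching lengths. Any `segment_size` that is a multiple of `hop_length` would avoid the mismatch as long as the product of `upsample_rates` equals `hop_length`.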

smartLanny commented 1 year ago

Same problem here. How did you fix it? RuntimeError: expand(torch.FloatTensor{[2, 1025, 461]}, size=[2, 1025]): the number of sizes provided (2) must be greater or equal to the number of dimensions in the tensor (3)

YYuX-1145 / Bert-VITS2-Integration-package

Help needed: error after running the training command #32
