Closed MarkIzhao closed 2 years ago
G:\Vioce\MockingBird>python synthesizer_train.py tjc G:\Vioce\tjc001\SV2TTS\synthesizer
Arguments:
    run_id: tjc
    syn_dir: G:\Vioce\tjc001\SV2TTS\synthesizer
    models_dir: synthesizer/saved_models/
    save_every: 1000
    backup_every: 25000
    log_every: 200
    force_restart: False
    hparams:
Checkpoint path: synthesizer\saved_models\tjc\tjc.pt
Loading training data from: G:\Vioce\tjc001\SV2TTS\synthesizer\train.txt
Using model: Tacotron
Using device: cuda
Initialising Tacotron Model...
\Loading the json with %s {'sample_rate': 16000, 'n_fft': 800, 'num_mels': 80, 'hop_size': 200, 'win_size': 800, 'fmin': 55, 'min_level_db': -100, 'ref_level_db': 20, 'max_abs_value': 4.0, 'preemphasis': 0.97, 'preemphasize': True, 'tts_embed_dims': 512, 'tts_encoder_dims': 256, 'tts_decoder_dims': 128, 'tts_postnet_dims': 512, 'tts_encoder_K': 5, 'tts_lstm_dims': 1024, 'tts_postnet_K': 5, 'tts_num_highways': 4, 'tts_dropout': 0.5, 'tts_cleaner_names': ['basic_cleaners'], 'tts_stop_threshold': -3.4, 'tts_schedule': [[2, 0.001, 10000, 12], [2, 0.0005, 15000, 12], [2, 0.0002, 20000, 12], [2, 0.0001, 30000, 12], [2, 5e-05, 40000, 12], [2, 1e-05, 60000, 12], [2, 5e-06, 160000, 12], [2, 3e-06, 320000, 12], [2, 1e-06, 640000, 12]], 'tts_clip_grad_norm': 1.0, 'tts_eval_interval': 500, 'tts_eval_num_samples': 1, 'tts_finetune_layers': [], 'max_mel_frames': 900, 'rescale': True, 'rescaling_max': 0.9, 'synthesis_batch_size': 16, 'signal_normalization': True, 'power': 1.5, 'griffin_lim_iters': 60, 'fmax': 7600, 'allow_clipping_in_normalization': True, 'clip_mels_length': True, 'use_lws': False, 'symmetric_mels': True, 'trim_silence': True, 'speaker_embedding_size': 256, 'silence_min_duration_split': 0.4, 'utterance_min_duration': 1.6, 'use_gst': True, 'use_ser_for_gst': True} Trainable Parameters: 32.869M
Loading weights at synthesizer\saved_models\tjc\tjc.pt
Tacotron weights loaded from step 219000
Using inputs from:
    G:\Vioce\tjc001\SV2TTS\synthesizer\train.txt
    G:\Vioce\tjc001\SV2TTS\synthesizer\mels
    G:\Vioce\tjc001\SV2TTS\synthesizer\embeds
Found 872 samples
+----------------+------------+---------------+------------------+
| Steps with r=2 | Batch Size | Learning Rate | Outputs/Step (r) |
+----------------+------------+---------------+------------------+
|   101k Steps   |     12     |     3e-06     |        2         |
+----------------+------------+---------------+------------------+
Traceback (most recent call last):
File "G:\Vioce\MockingBird\synthesizer_train.py", line 37, in
Hi, I ran into the same problem:
assert not step_t.is_cuda, "If capturable=False, state_steps should not be CUDA tensors."
AssertionError: If capturable=False, state_steps should not be CUDA tensors.
Did you manage to solve it?
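Uninstall PyTorch:
pip uninstall torch
Then install PyTorch built for CUDA 11.6.
This resolves "Could not load symbol cublasGetSmCountTarget from cublas64_11.dll. Error code 127".
However, if I close training after it has run for five minutes and then restart it, I still get "AssertionError: If capturable=False, state_steps should not be CUDA tensors." I have not yet found a way to restart training without this error.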
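CPU: AMD R7 5800H, GPU: RTX 3060 Laptop, Windows 11. Pressing Ctrl+C to end the process manually corrupts the model file, which leads to the error "AssertionError: If capturable=False, state_steps should not be CUDA tensors." This is not an isolated case; I am looking for a solution.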
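Start training: python synthesizer_train.py mandarin /SV2TTS/synthesizer
Stop training with CTRL+C or CTRL+Fn+B.
Then start training again: python synthesizer_train.py mandarin /SV2TTS/synthesizer
Error: AssertionError: If capturable=False, state_steps should not be CUDA tensors
Does this project not support pausing or training over multiple sessions, the way DeepFaceLab does? I have tried contacting the author by email, Zhihu, and Bilibili, with no result. I hope the author sees this soon and can answer.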
Windows 11, PyTorch 1.9.0, CUDA 11.1.
I stopped training using CTRL+C and resumed without a problem.
My PyTorch is 1.12.
Which Python version are you using?
I will try switching environments.
3.7.9. I remember some people mentioned they use a newer version of Python. I guess you can just try to downgrade PyTorch, as the readme.md mentions:
PyTorch worked for pytorch, tested in version of 1.9.0(latest in August 2021), with GPU Tesla T4 and GTX 2060
One more thing: if you end up choosing 1.9.0, I suggest using CUDA 11.1 instead of 10.2. I had problems/errors/crashes during training, but they went away after changing CUDA to 11.1.
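Thank you very much. With Python 3.9.13, PyTorch 1.9, and CUDA 11.1, training can indeed be resumed after stopping it with Ctrl+C, with no error. However, the training speed dropped from 1.1 steps/s to 0.8 steps/s.
I will keep trying other versions.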
Windows 11, CPU: R7 5800H, GPU: RTX 3060 Laptop, Python 3.9.13, torch-1.10.2+cu113-cp39-cp39-win_amd64: no error, training resumes normally at 1 step/s.
This issue is confirmed resolved.
Someone else described it this way:
Hi, I am also facing the same issue when I try to load the checkpoint and resume model training on the latest pytorch (1.12).
It seems to be related to a newly introduced parameter (capturable) for the Adam and AdamW optimizers. Currently there are two workarounds:
- Forcing capturable = True after loading the checkpoint (as suggested above): optim.param_groups[0]['capturable'] = True. This seems to slow down the model training by approx. 10% (YMMV depending on the setup).
- Reverting pytorch back to previous versions (I have been using 1.11.0).
I'm wondering whether enforcing capturable = True may incur unwanted side effects.
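The first workaround quoted above can be sketched roughly as follows. This is a minimal illustration, not the project's own fix: the helper name force_capturable is hypothetical, and unlike the quoted one-liner (which touches only param_groups[0]) it sets the flag on every parameter group. A plain stand-in object replaces the real optimizer so the sketch runs without PyTorch or a GPU.

```python
from types import SimpleNamespace


def force_capturable(optimizer, value=True):
    """Set the 'capturable' flag on every param group (hypothetical helper).

    `optimizer` only needs a `param_groups` attribute holding a list of
    dicts, which is the layout torch.optim.Optimizer exposes.
    Returns the number of groups that were updated.
    """
    updated = 0
    for group in optimizer.param_groups:
        group["capturable"] = value
        updated += 1
    return updated


# Stand-in for the real optimizer (assumption: in the training script this
# would be the Adam instance, and the call would come right after the
# optimizer state has been restored from the checkpoint).
fake_optimizer = SimpleNamespace(param_groups=[{"lr": 3e-06}, {"lr": 3e-06}])
count = force_capturable(fake_optimizer)
print(f"updated {count} param group(s)")
```

As the quoted comment notes, this may cost some training speed, so reverting to an earlier PyTorch version remains the alternative.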
I am also worried that capturable=True may bring unwanted side effects, so I am going to roll back to torch 1.11 as well.
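The version reports in this thread (resume fails on PyTorch 1.12; resume works on 1.9.0, 1.10.2 and 1.11.0) could be turned into a small guard at the top of a training script. The sketch below is only that: the function names are hypothetical, and the "failing" set reflects what commenters here reported, not an official compatibility list.

```python
import re

# Major/minor version that commenters in this thread reported as failing
# when resuming from a checkpoint. Assumption: based on this thread only.
REPORTED_FAILING = (1, 12)


def parse_major_minor(version):
    """Extract (major, minor) from a version string such as '1.10.2+cu113'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def reported_to_fail_on_resume(version):
    """True if this thread reported the resume error for that version."""
    return parse_major_minor(version) == REPORTED_FAILING


# In a real script the argument would be torch.__version__.
for v in ["1.9.0", "1.10.2+cu113", "1.11.0", "1.12"]:
    status = "reported failing" if reported_to_fail_on_resume(v) else "reported working"
    print(f"{v}: {status}")
```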
Win11 GPU:3060laptop
Python 3.9.13
+----------------+------------+---------------+------------------+
| Steps with r=2 | Batch Size | Learning Rate | Outputs/Step (r) |
+----------------+------------+---------------+------------------+
|   101k Steps   |     16     |     3e-06     |        2         |
+----------------+------------+---------------+------------------+
Could not load symbol cublasGetSmCountTarget from cublas64_11.dll. Error code 127
Traceback (most recent call last):
  File "G:\AIvioce\MockingBird\synthesizer_train.py", line 37, in
    train(vars(args))
  File "G:\AIvioce\MockingBird\synthesizer\train.py", line 216, in train
    optimizer.step()
  File "C:\Users\Mark\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\optim\optimizer.py", line 109, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Mark\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\Mark\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\optim\adam.py", line 157, in step
    adam(params_with_grad,
  File "C:\Users\Mark\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\optim\adam.py", line 213, in adam
    func(params,
  File "C:\Users\Mark\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\optim\adam.py", line 255, in _single_tensor_adam
    assert not step_t.is_cuda, "If capturable=False, state_steps should not be CUDA tensors."
AssertionError: If capturable=False, state_steps should not be CUDA tensors.