w-okada / voice-changer

Realtime Voice Changer

VC PROCESSING!!!! EXCEPTION!!! CUDA error: device-side assert triggered #226

Closed AiIdol closed 1 year ago

AiIdol commented 1 year ago

D:\sovits\MMVCServerSIO>MMVCServerSIO.exe -p 18888 --https false --content_vec_500 pretrain/checkpoint_best_legacy_500.pt --content_vec_500_onnx pretrain/checkpoint_best_legacy_500.onnx --content_vec_500_onnx_on false --hubert_base pretrain/hubert_base.pt --hubert_base_jp pretrain/rinna_hubert_base_jp.pt --hubert_soft pretrain/hubert/hubert-soft-0d54a1f4.pt --nsf_hifigan pretrain/nsf_hifigan/model
Booting PHASE :main
Starting Voice Changer.
Internal_Port:18888
protocol: HTTP


Open the following URL in your browser.
http://<IP>:<PORT>/
In most cases, opening one of the following URLs will start it.
http://localhost:18888/

did-fail-load
(node:9596) electron: Failed to load URL: http://localhost:18888/ with error: ERR_CONNECTION_REFUSED
(Use voice-changer-native-client --trace-warnings ... to show where the warning was created)
Booting PHASE :main
Booting PHASE :MMVCServerSIO
input: [
{'name': 'Microsoft 声音映射器 - Input', 'index': 0, 'hostapi': 0, 'max_input_channels': 2, 'max_output_channels': 0, 'default_low_input_latency': 0.09, 'default_low_output_latency': 0.09, 'default_high_input_latency': 0.18, 'default_high_output_latency': 0.18, 'default_samplerate': 44100.0},
{'name': '麦克风阵列 (Realtek(R) Audio)', 'index': 1, 'hostapi': 0, 'max_input_channels': 2, 'max_output_channels': 0, 'default_low_input_latency': 0.09, 'default_low_output_latency': 0.09, 'default_high_input_latency': 0.18, 'default_high_output_latency': 0.18, 'default_samplerate': 44100.0},
{'name': '主声音捕获驱动程序', 'index': 4, 'hostapi': 1, 'max_input_channels': 2, 'max_output_channels': 0, 'default_low_input_latency': 0.12, 'default_low_output_latency': 0.0, 'default_high_input_latency': 0.24, 'default_high_output_latency': 0.0, 'default_samplerate': 44100.0},
{'name': '麦克风阵列 (Realtek(R) Audio)', 'index': 5, 'hostapi': 1, 'max_input_channels': 2, 'max_output_channels': 0, 'default_low_input_latency': 0.12, 'default_low_output_latency': 0.0, 'default_high_input_latency': 0.24, 'default_high_output_latency': 0.0, 'default_samplerate': 44100.0},
{'name': '麦克风阵列 (Realtek(R) Audio)', 'index': 9, 'hostapi': 3, 'max_input_channels': 2, 'max_output_channels': 0, 'default_low_input_latency': 0.003, 'default_low_output_latency': 0.0, 'default_high_input_latency': 0.01, 'default_high_output_latency': 0.0, 'default_samplerate': 48000.0},
{'name': '立体声混音 (Realtek HD Audio Stereo input)', 'index': 12, 'hostapi': 4, 'max_input_channels': 2, 'max_output_channels': 0, 'default_low_input_latency': 0.01, 'default_low_output_latency': 0.01, 'default_high_input_latency': 0.04, 'default_high_output_latency': 0.04, 'default_samplerate': 48000.0},
{'name': '麦克风阵列 (Realtek HD Audio Mic input)', 'index': 13, 'hostapi': 4, 'max_input_channels': 2, 'max_output_channels': 0, 'default_low_input_latency': 0.01, 'default_low_output_latency': 0.01, 'default_high_input_latency': 0.04, 'default_high_output_latency': 0.04, 'default_samplerate': 44100.0},
{'name': '电脑扬声器 (Realtek HD Audio output with SST)', 'index': 16, 'hostapi': 4, 'max_input_channels': 2, 'max_output_channels': 0, 'default_low_input_latency': 0.01, 'default_low_output_latency': 0.01, 'default_high_input_latency': 0.04, 'default_high_output_latency': 0.04, 'default_samplerate': 48000.0},
{'name': '电脑扬声器 (Realtek HD Audio 2nd output with SST)', 'index': 19, 'hostapi': 4, 'max_input_channels': 2, 'max_output_channels': 0, 'default_low_input_latency': 0.01, 'default_low_output_latency': 0.01, 'default_high_input_latency': 0.04, 'default_high_output_latency': 0.04, 'default_samplerate': 48000.0},
{'name': 'Mic in at front panel (black) (Mic in at front panel (black))', 'index': 20, 'hostapi': 4, 'max_input_channels': 2, 'max_output_channels': 0, 'default_low_input_latency': 0.01, 'default_low_output_latency': 0.01, 'default_high_input_latency': 0.04, 'default_high_output_latency': 0.04, 'default_samplerate': 44100.0}
]
output: [
{'name': 'Microsoft 声音映射器 - Output', 'index': 2, 'hostapi': 0, 'max_input_channels': 0, 'max_output_channels': 2, 'default_low_input_latency': 0.09, 'default_low_output_latency': 0.09, 'default_high_input_latency': 0.18, 'default_high_output_latency': 0.18, 'default_samplerate': 44100.0},
{'name': '扬声器 (Realtek(R) Audio)', 'index': 3, 'hostapi': 0, 'max_input_channels': 0, 'max_output_channels': 2, 'default_low_input_latency': 0.09, 'default_low_output_latency': 0.09, 'default_high_input_latency': 0.18, 'default_high_output_latency': 0.18, 'default_samplerate': 44100.0},
{'name': '主声音驱动程序', 'index': 6, 'hostapi': 1, 'max_input_channels': 0, 'max_output_channels': 2, 'default_low_input_latency': 0.0, 'default_low_output_latency': 0.12, 'default_high_input_latency': 0.0, 'default_high_output_latency': 0.24, 'default_samplerate': 44100.0},
{'name': '扬声器 (Realtek(R) Audio)', 'index': 7, 'hostapi': 1, 'max_input_channels': 0, 'max_output_channels': 2, 'default_low_input_latency': 0.0, 'default_low_output_latency': 0.12, 'default_high_input_latency': 0.0, 'default_high_output_latency': 0.24, 'default_samplerate': 44100.0},
{'name': '扬声器 (Realtek(R) Audio)', 'index': 8, 'hostapi': 3, 'max_input_channels': 0, 'max_output_channels': 2, 'default_low_input_latency': 0.0, 'default_low_output_latency': 0.003, 'default_high_input_latency': 0.0, 'default_high_output_latency': 0.01, 'default_samplerate': 48000.0},
{'name': 'Output (NVIDIA High Definition Audio)', 'index': 10, 'hostapi': 4, 'max_input_channels': 0, 'max_output_channels': 2, 'default_low_input_latency': 0.01, 'default_low_output_latency': 0.01, 'default_high_input_latency': 0.04, 'default_high_output_latency': 0.04, 'default_samplerate': 44100.0},
{'name': 'Speakers (Nahimic mirroring Wave Speaker)', 'index': 11, 'hostapi': 4, 'max_input_channels': 0, 'max_output_channels': 2, 'default_low_input_latency': 0.01, 'default_low_output_latency': 0.01, 'default_high_input_latency': 0.04, 'default_high_output_latency': 0.04, 'default_samplerate': 44100.0},
{'name': 'Speakers 1 (Realtek HD Audio output with SST)', 'index': 14, 'hostapi': 4, 'max_input_channels': 0, 'max_output_channels': 2, 'default_low_input_latency': 0.01, 'default_low_output_latency': 0.01, 'default_high_input_latency': 0.04, 'default_high_output_latency': 0.04, 'default_samplerate': 48000.0},
{'name': 'Speakers 2 (Realtek HD Audio output with SST)', 'index': 15, 'hostapi': 4, 'max_input_channels': 0, 'max_output_channels': 2, 'default_low_input_latency': 0.01, 'default_low_output_latency': 0.01, 'default_high_input_latency': 0.04, 'default_high_output_latency': 0.04, 'default_samplerate': 44100.0},
{'name': 'Headphones 1 (Realtek HD Audio 2nd output with SST)', 'index': 17, 'hostapi': 4, 'max_input_channels': 0, 'max_output_channels': 2, 'default_low_input_latency': 0.01, 'default_low_output_latency': 0.01, 'default_high_input_latency': 0.04, 'default_high_output_latency': 0.04, 'default_samplerate': 48000.0},
{'name': 'Headphones 2 (Realtek HD Audio 2nd output with SST)', 'index': 18, 'hostapi': 4, 'max_input_channels': 0, 'max_output_channels': 2, 'default_low_input_latency': 0.01, 'default_low_output_latency': 0.01, 'default_high_input_latency': 0.04, 'default_high_output_latency': 0.04, 'default_samplerate': 44100.0},
{'name': 'Speakers (Nahimic Easy Surround)', 'index': 21, 'hostapi': 4, 'max_input_channels': 0, 'max_output_channels': 8, 'default_low_input_latency': 0.01, 'default_low_output_latency': 0.01, 'default_high_input_latency': 0.04, 'default_high_output_latency': 0.04, 'default_samplerate': 48000.0}
]
hostapis (
{'name': 'MME', 'devices': [0, 1, 2, 3], 'default_input_device': 1, 'default_output_device': 3},
{'name': 'Windows DirectSound', 'devices': [4, 5, 6, 7], 'default_input_device': 4, 'default_output_device': 6},
{'name': 'ASIO', 'devices': [], 'default_input_device': -1, 'default_output_device': -1},
{'name': 'Windows WASAPI', 'devices': [8, 9], 'default_input_device': 9, 'default_output_device': 8},
{'name': 'Windows WDM-KS', 'devices': [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], 'default_input_device': 13, 'default_output_device': 14}
)
VoiceChanger Initialized (GPU_NUM:1, mps_enabled:False)
[2023-05-13 22:24:52] connet sid : cdV-c9YjZxri8878AAAB
[2023-05-13 22:24:52] connet sid : _mTo-BAvkWkFCiVuAAAD
[2023-05-13 22:24:56] connet sid : NGmXkIuTLs2EOgmVAAAF
[2023-05-13 22:24:56] connet sid : mP4NsgdandzpThK6AAAH
switch model type 1
DEBUG:matplotlib:matplotlib data path: D:\sovits\MMVCServerSIO\matplotlib\mpl-data
DEBUG:matplotlib:CONFIGDIR=C:\Users\Shinomiya\AppData\Local\Temp\tmpt5yc4_se
DEBUG:matplotlib:interactive is False
DEBUG:matplotlib:platform is win32
DEBUG:matplotlib:CACHEDIR=C:\Users\Shinomiya\AppData\Local\Temp\tmpt5yc4_se
DEBUG:matplotlib.font_manager:font search path [WindowsPath('D:/sovits/MMVCServerSIO/matplotlib/mpl-data/fonts/ttf'), WindowsPath('D:/sovits/MMVCServerSIO/matplotlib/mpl-data/fonts/afm'), WindowsPath('D:/sovits/MMVCServerSIO/matplotlib/mpl-data/fonts/pdfcorefonts')]
INFO:matplotlib.font_manager:generated new fontManager
so-vits-svc40 initialization: VoiceChangerParams(content_vec_500='pretrain/checkpoint_best_legacy_500.pt', content_vec_500_onnx='pretrain/checkpoint_best_legacy_500.onnx', content_vec_500_onnx_on=0, hubert_base='pretrain/hubert_base.pt', hubert_base_jp='pretrain/rinna_hubert_base_jp.pt', hubert_soft='pretrain/hubert/hubert-soft-0d54a1f4.pt', nsf_hifigan='pretrain/nsf_hifigan/model')
[Voice Changer] update configuration: dstId 1
[Voice Changer] update configuration: crossFadeEndRate 1
[Voice Changer] update configuration: crossFadeOverlapSize 1024
[Voice Changer] update configuration: framework PyTorch
[Voice Changer] update configuration: onnxExecutionProvider CPUExecutionProvider
[Voice Changer] update configuration: f0Detector dio
onnxExecutionProvider is not mutable variable or unknown variable!
[Voice Changer] update configuration: f0Factor 1
[Voice Changer] update configuration: serverInputAudioSampleRate 48000
f0Factor is not mutable variable or unknown variable!
[Voice Changer] update configuration: serverOutputAudioSampleRate 48000
[Voice Changer] update configuration: serverInputAudioBufferSize 24576
[Voice Changer] update configuration: serverInputDeviceId -1
[Voice Changer] update configuration: serverOutputAudioBufferSize 24576
[Voice Changer] update configuration: serverOutputDeviceId -1
[Voice Changer] update configuration: serverReadChunkSize 256
[Voice Changer] update configuration: noiseScale 0.3
[Voice Changer] update configuration: tran 10
[Voice Changer] update configuration: extraConvertSize 32768
[Voice Changer] update configuration: modelSamplingRate 48000
[Voice Changer] update configuration: clusterInferRatio 0.1
modelSamplingRate is not mutable variable or unknown variable!
[Voice Changer] update configuration: silenceFront 1
[Voice Changer] update configuration: useDiff 1
silenceFront is not mutable variable or unknown variable!
useDiff is not mutable variable or unknown variable!
[Voice Changer] update configuration: diffAcc 20
diffAcc is not mutable variable or unknown variable!
[Voice Changer] update configuration: diffSpkId 1
diffSpkId is not mutable variable or unknown variable!
[Voice Changer] update configuration: kStep 120
kStep is not mutable variable or unknown variable!
[Voice Changer] update configuration: threshold -45
threshold is not mutable variable or unknown variable!
[Voice Changer] update configuration: inputSampleRate 48000
[2023-05-13 22:24:59] connet sid : Mt9MlH6A1Y3MqEH9AAAJ
[2023-05-13 22:25:00] connet sid : SBgKqwENZ5-_JazGAAAL
switch model type 1
remove modules D:\sovits\MMVCServerSIO\so-vits-svc-40\modules\__init__.py
remove modules.commons D:\sovits\MMVCServerSIO\so-vits-svc-40\modules\commons.py
remove modules.modules D:\sovits\MMVCServerSIO\so-vits-svc-40\modules\modules.py
remove modules.attentions D:\sovits\MMVCServerSIO\so-vits-svc-40\modules\attentions.py
remove hubert D:\sovits\MMVCServerSIO\so-vits-svc-40\hubert\__init__.py
remove hubert.hubert_model D:\sovits\MMVCServerSIO\so-vits-svc-40\hubert\hubert_model.py
remove utils D:\sovits\MMVCServerSIO\so-vits-svc-40\utils.py
remove vdecoder D:\sovits\MMVCServerSIO\so-vits-svc-40\vdecoder\__init__.py
remove vdecoder.hifigan.env D:\sovits\MMVCServerSIO\so-vits-svc-40\vdecoder\hifigan\env.py
remove vdecoder.hifigan.utils D:\sovits\MMVCServerSIO\so-vits-svc-40\vdecoder\hifigan\utils.py
remove vdecoder.hifigan.models D:\sovits\MMVCServerSIO\so-vits-svc-40\vdecoder\hifigan\models.py
remove models D:\sovits\MMVCServerSIO\so-vits-svc-40\models.py
remove cluster D:\sovits\MMVCServerSIO\so-vits-svc-40\cluster\__init__.py
so-vits-svc40 initialization: VoiceChangerParams(content_vec_500='pretrain/checkpoint_best_legacy_500.pt', content_vec_500_onnx='pretrain/checkpoint_best_legacy_500.onnx', content_vec_500_onnx_on=0, hubert_base='pretrain/hubert_base.pt', hubert_base_jp='pretrain/rinna_hubert_base_jp.pt', hubert_soft='pretrain/hubert/hubert-soft-0d54a1f4.pt', nsf_hifigan='pretrain/nsf_hifigan/model')
[Voice Changer] update configuration: dstId 1
[Voice Changer] update configuration: framework PyTorch
[Voice Changer] update configuration: crossFadeEndRate 1
[Voice Changer] update configuration: crossFadeOverlapSize 1024
[Voice Changer] update configuration: onnxExecutionProvider CPUExecutionProvider
onnxExecutionProvider is not mutable variable or unknown variable!
[Voice Changer] update configuration: f0Factor 1
f0Factor is not mutable variable or unknown variable!
[Voice Changer] update configuration: serverInputAudioSampleRate 48000
[Voice Changer] update configuration: f0Detector dio
[Voice Changer] update configuration: serverOutputAudioSampleRate 48000
[Voice Changer] update configuration: serverInputAudioBufferSize 24576
[Voice Changer] update configuration: serverOutputAudioBufferSize 24576
[Voice Changer] update configuration: serverInputDeviceId -1
[Voice Changer] update configuration: serverOutputDeviceId -1
[Voice Changer] update configuration: serverReadChunkSize 256
[Voice Changer] update configuration: tran 10
[Voice Changer] update configuration: extraConvertSize 32768
[Voice Changer] update configuration: noiseScale 0.3
[Voice Changer] update configuration: clusterInferRatio 0.1
[Voice Changer] update configuration: modelSamplingRate 48000
modelSamplingRate is not mutable variable or unknown variable!
[Voice Changer] update configuration: useDiff 1
[Voice Changer] update configuration: silenceFront 1
[Voice Changer] update configuration: diffAcc 20
silenceFront is not mutable variable or unknown variable!
[Voice Changer] update configuration: diffSpkId 1
useDiff is not mutable variable or unknown variable!
[Voice Changer] update configuration: kStep 120
diffAcc is not mutable variable or unknown variable!
[Voice Changer] update configuration: threshold -45
diffSpkId is not mutable variable or unknown variable!
threshold is not mutable variable or unknown variable!
kStep is not mutable variable or unknown variable!
[Voice Changer] update configuration: inputSampleRate 48000
paramDict {'trans': 0, 'files': {'mmvcv13Config': '', 'mmvcv13Model': '', 'mmvcv15Config': '', 'mmvcv15Model': '', 'soVitsSvc40Config': 'config.json', 'soVitsSvc40Model': 'AzumaSeren-v2.pth', 'soVitsSvc40Cluster': 'kmeans_10000.pt', 'soVitsSvc40v2Config': '', 'soVitsSvc40v2Model': '', 'soVitsSvc40v2Cluster': '', 'rvcModel': '', 'rvcIndex': '', 'rvcFeature': '', 'ddspSvcModel': '', 'ddspSvcModelConfig': '', 'ddspSvcDiffusion': '', 'ddspSvcDiffusionConfig': ''}}
load
INFO:root:Loaded checkpoint 'C:\Users\SHINOM~1\AppData\Local\Temp\tmp_xxt_azx\upload_dir\0\AzumaSeren-v2.pth' (iteration 1322)
Generated Strengths: for prev:(1024,), for cur:(1024,)
!!! !!! !!! wav size not multiple of hopsize: 29.3984375
VC PROCESSING!!!! EXCEPTION!!! CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Traceback (most recent call last):
  File "voice_changer\VoiceChanger.py", line 433, in on_request_sola
  File "voice_changer\SoVitsSvc40\SoVitsSvc40.py", line 451, in inference
  File "voice_changer\SoVitsSvc40\SoVitsSvc40.py", line 427, in _pyTorch_inference
  File "D:\sovits\MMVCServerSIO\so-vits-svc-40\models.py", line 417, in infer
    z_p, m_p, logs_p, c_mask = self.enc_p(x, x_mask, f0=f0_to_coarse(f0), noice_scale=noice_scale)
  File "D:\sovits\MMVCServerSIO\so-vits-svc-40\utils.py", line 173, in f0_to_coarse
    f0_mel[f0_mel > 0] = (f0_mel[f0_mel > 0] - f0_mel_min) * (f0_bin - 2) / (f0_mel_max - f0_mel_min) + 1
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
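
For readers following the traceback: the last frame is the line in `f0_to_coarse` that rescales a mel-scale pitch value into a bin index. The sketch below reproduces that rescaling for a single value in plain Python so the arithmetic can be checked without a GPU. It is an illustration under assumptions, not the project's code: the constants (256 bins, 50 to 1100 Hz), the mel formula, the clamping, and the name `f0_to_coarse_scalar` are not shown in the log and are assumed here.

```python
import math

# Assumed constants; the traceback only shows the names f0_bin, f0_mel_min, f0_mel_max.
F0_BIN = 256
F0_MIN_HZ = 50.0
F0_MAX_HZ = 1100.0


def hz_to_mel(f0_hz):
    """Assumed mel conversion: 1127 * ln(1 + f / 700)."""
    return 1127.0 * math.log(1.0 + f0_hz / 700.0)


F0_MEL_MIN = hz_to_mel(F0_MIN_HZ)
F0_MEL_MAX = hz_to_mel(F0_MAX_HZ)


def f0_to_coarse_scalar(f0_hz):
    """Map one f0 value in Hz to a bin in [1, F0_BIN - 1]; 0 Hz (unvoiced) maps to bin 1."""
    # Reject values that a comparison-based mask would silently skip.
    if not math.isfinite(f0_hz) or f0_hz < 0:
        raise ValueError(f"f0 must be finite and non-negative, got {f0_hz!r}")
    mel = hz_to_mel(f0_hz)
    if mel > 0:
        # Same expression as the line quoted in the traceback.
        mel = (mel - F0_MEL_MIN) * (F0_BIN - 2) / (F0_MEL_MAX - F0_MEL_MIN) + 1
    # Assumed clamping step so the result is always a valid bin index.
    mel = min(max(mel, 1.0), float(F0_BIN - 1))
    return int(mel + 0.5)


if __name__ == "__main__":
    for hz in (0.0, 50.0, 220.0, 440.0, 1100.0, 5000.0):
        print(hz, f0_to_coarse_scalar(hz))
```

In the vectorized line from the traceback, a NaN fails every comparison such as `f0_mel > 0`, so mask-based assignments would leave it untouched. Nothing in this thread establishes that this is what happened here; the explicit finiteness check above is only one way to surface such input early.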

w-okada commented 1 year ago

So?

AiIdol commented 1 year ago

How can I fix it?

w-okada commented 1 year ago

Which version did you download? What kind of GPU do you use? Are you using a so-vits-svc 4.0 model, not 4.0v2?

AiIdol commented 1 year ago

Version: v.1.5.2.9a
CPU: Intel i5 12700H
Model: sovits4.0

I could use it with the old version, but since updating to the newest version I can no longer use it.
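
The reply above lists the CPU rather than the GPU that was asked about. As a minimal sketch, assuming an NVIDIA card and that the `nvidia-smi` tool shipped with the driver is on the PATH, the following collects the GPU name, driver version, and memory in a form that can be pasted into a report. The helper name `query_nvidia_gpus` is made up for this example.

```python
import shutil
import subprocess


def query_nvidia_gpus():
    """Return one 'name, driver, memory' string per GPU, or [] if nvidia-smi is unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name,driver_version,memory.total", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        # Covers a non-zero exit status, a timeout, or a failure to start the tool.
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = query_nvidia_gpus()
    print("\n".join(gpus) if gpus else "No NVIDIA GPU reported (nvidia-smi missing or failed).")
```

An empty result only means the tool could not be run; it does not prove that no GPU is present.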

gengzhichen commented 1 year ago

The same issue occurred:
VC PROCESSING!!!! EXCEPTION!!! CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Traceback (most recent call last):
  File "voice_changer\VoiceChanger.py", line 425, in on_request_sola
  File "voice_changer\SoVitsSvc40\SoVitsSvc40.py", line 358, in generate_input
  File "voice_changer\SoVitsSvc40\SoVitsSvc40.py", line 279, in get_unit_f0
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
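
The two reports stop at different frames (`f0_to_coarse` and `get_unit_f0`), which fits the log's own warning that CUDA errors may be reported asynchronously. The log suggests `CUDA_LAUNCH_BLOCKING=1`. One way to apply it, sketched here under the assumption that the server is started from a small Python wrapper, is to set the variable in the child process environment. The function names and the wrapper itself are illustrative and not part of the project.

```python
import os
import subprocess
import sys


def blocking_cuda_env(base_env=None):
    """Copy an environment mapping and set CUDA_LAUNCH_BLOCKING=1 in the copy."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def launch_server(command):
    """Run the given command with synchronous CUDA launches and return its exit code."""
    return subprocess.run(command, env=blocking_cuda_env()).returncode


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Everything after the script name is treated as the server command line.
        sys.exit(launch_server(sys.argv[1:]))
    print("usage: python run_blocking.py <path to server executable> [server options...]")
```

Saved as, say, `run_blocking.py`, it would be invoked with the full path to `MMVCServerSIO.exe` followed by the same options as in the first report. Synchronous launches slow inference down, so this is meant for one debugging run to get a more trustworthy traceback.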

gengzhichen commented 1 year ago

I'm using a 4090 with so-vits-svc-40, not v2.

w-okada commented 1 year ago

In my environment, I am unable to replicate the issue. Please try the latest version.

w-okada commented 1 year ago

No response, so closing.