w-okada / voice-changer

Realtime Voice Changer

[ISSUE]: Only emits audio when passthru enabled #1074

Open gelidoum opened 8 months ago

gelidoum commented 8 months ago

Voice Changer Version

MMVCServerSIO_win_onnxgpu-cuda_v.1.5.3.17b

Operating System

Windows 11

GPU

RTX 3050 Laptop (4 GB)


Model Type

RVC

Issue Description

Audio is only emitted when passthru is enabled, and then I can only hear my own voice; the voice is never converted. I have already watched a lot of YouTube tutorials and none of them worked for me.

Application Screenshot

image

Logs on console

Booting PHASE :__main__
PYTHON:3.10.11 (tags/v3.10.11:7d4cc5a, Apr  5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]
Activating the Voice Changer.

[Voice Changer] download sample catalog. samples_0004_t.json
[Voice Changer] download sample catalog. samples_0004_o.json
[Voice Changer] download sample catalog. samples_0004_d.json
[Voice Changer] model_dir is already exists. skip download samples.
Internal_Port:18888
protocol: HTTP


Please open the following URL in your browser.
http://<IP>:<PORT>/
In many cases, it will launch when you access any of the following URLs.
http://127.0.0.1:18888/

[VCClient] Access http://127.0.0.1:18888/
[VCClient] wait web server...0 http://127.0.0.1:18888/
[Voice Changer] generate new embedder. (no embedder)
[Voice Changer] use torch contentvec
[ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from pretrain/content_vec_500.onnx failed:Protobuf parsing failed.
[Voice Changer] exception! loading embedder PytorchStreamReader failed reading zip archive: failed finding central directory cuda:0
Traceback (most recent call last):
  File "voice_changer\RVC\embedder\EmbedderManager.py", line 47, in loadEmbedder
  File "voice_changer\RVC\embedder\OnnxContentvec.py", line 17, in loadModel
  File "onnxruntime\capi\onnxruntime_inference_collection.py", line 347, in __init__
  File "onnxruntime\capi\onnxruntime_inference_collection.py", line 384, in _create_inference_session
onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from pretrain/content_vec_500.onnx failed:Protobuf parsing failed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "voice_changer\RVC\pipeline\PipelineGenerator.py", line 30, in createPipeline
  File "voice_changer\RVC\embedder\EmbedderManager.py", line 26, in getEmbedder
  File "voice_changer\RVC\embedder\EmbedderManager.py", line 51, in loadEmbedder
  File "voice_changer\RVC\embedder\FairseqHubert.py", line 11, in loadModel
  File "fairseq\checkpoint_utils.py", line 425, in load_model_ensemble_and_task
    state = load_checkpoint_to_cpu(filename, arg_overrides)
  File "fairseq\checkpoint_utils.py", line 315, in load_checkpoint_to_cpu
    state = torch.load(f, map_location=torch.device("cpu"))
  File "torch\serialization.py", line 797, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "torch\serialization.py", line 283, in __init__
    super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
[VCClient] wait web server... done 200
[2024-01-10 23:56:26] connet sid : QSBDKGN5b3nS-huYAAAC
[2024-01-10 23:56:26] connet sid : 2lxQI4l65H9kECI4AAAD
[Voice Changer] update configuration: passThrough true
[Voice Changer] update configuration: passThrough false
[Voice Changer] update configuration: passThrough true
[Voice Changer] update configuration: passThrough false
[Voice Changer] update configuration: recordIO 1
-------------------------- - - - 48000, 48000
[Voice Changer] update configuration: recordIO 0
paramDict {'voiceChangerType': 'RVC', 'slot': 4, 'isSampleMode': False, 'sampleId': None, 'files': [{'name': 'Fernanfloo.pth', 'kind': 'rvcModel', 'dir': ''}, {'name': 'added_IVF315_Flat_nprobe_1_Fernanfloo_v2.index', 'kind': 'rvcIndex', 'dir': ''}], 'params': {}}
RVC:: slotInfo.modelFile Fernanfloo.pth
[Voice Changer] Official Model(pyTorch) : v2
SlotInfo::: RVCModelSlot(slotIndex=-1, voiceChangerType='RVC', name='Fernanfloo', description='', credit='', termsOfUseUrl='', iconFile='', speakers={0: 'target'}, modelFile='Fernanfloo.pth', indexFile='added_IVF315_Flat_nprobe_1_Fernanfloo_v2.index', defaultTune=0, defaultIndexRatio=0, defaultProtect=0.5, isONNX=False, modelType='pyTorchRVCv2', samplingRate=40000, f0=True, embChannels=768, embOutputLayer=12, useFinalProj=False, deprecated=False, embedder='hubert_base', sampleId='', version='v2')
[Voice Changer] update configuration: modelSlotIndex 1704953439004
gin_channels: 256 self.spk_embed_dim: 109
[Voice Changer] generate new embedder. (no embedder)
[Voice Changer] use torch contentvec
[ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from pretrain/content_vec_500.onnx failed:Protobuf parsing failed.
[Voice Changer] exception! loading embedder PytorchStreamReader failed reading zip archive: failed finding central directory cuda:0
Traceback (most recent call last):
  File "voice_changer\RVC\embedder\EmbedderManager.py", line 47, in loadEmbedder
  File "voice_changer\RVC\embedder\OnnxContentvec.py", line 17, in loadModel
  File "onnxruntime\capi\onnxruntime_inference_collection.py", line 347, in __init__
  File "onnxruntime\capi\onnxruntime_inference_collection.py", line 384, in _create_inference_session
onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from pretrain/content_vec_500.onnx failed:Protobuf parsing failed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "voice_changer\RVC\pipeline\PipelineGenerator.py", line 30, in createPipeline File "voice_changer\RVC\embedder\EmbedderManager.py", line 26, in getEmbedder File "voice_changer\RVC\embedder\EmbedderManager.py", line 51, in loadEmbedder File "voice_changer\RVC\embedder\FairseqHubert.py", line 11, in loadModel File "fairseq\checkpoint_utils.py", line 425, in load_model_ensemble_and_task state = load_checkpoint_to_cpu(filename, arg_overrides) File "fairseq\checkpoint_utils.py", line 315, in load_checkpoint_to_cpu state = torch.load(f, map_location=torch.device("cpu")) File "torch\serialization.py", line 797, in load with _open_zipfile_reader(opened_file) as opened_zipfile: File "torch\serialization.py", line 283, in init super().init(torch._C.PyTorchFileReader(name_or_buffer)) RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory [Voice Changer] update configuration: recordIO 1 -------------------------- - - - 48000, 48000 [Voice Changer] update configuration: recordIO 0 [Voice Changer] update configuration: passThrough true [Voice Changer] update configuration: passThrough false [Voice Changer] update configuration: passThrough true [Voice Changer] update configuration: passThrough false [Voice Changer] update configuration: passThrough true [Voice Changer] update configuration: passThrough false [Voice Changer] update configuration: silentThreshold 0.00009 [Voice Changer] update configuration: silentThreshold 0.00008 [Voice Changer] update configuration: silentThreshold 0.00008 [Voice Changer] update configuration: silentThreshold 0.00006 [Voice Changer] update configuration: silentThreshold 0.00005 [Voice Changer] update configuration: silentThreshold 0.00003 [Voice Changer] update configuration: silentThreshold 0.00002 [Voice Changer] update configuration: silentThreshold 0 [Voice Changer] update configuration: silentThreshold 0 [Voice Changer] update configuration: silentThreshold 0 
[Voice Changer] update configuration: silentThreshold 0 [Voice Changer] update configuration: silentThreshold 0 [Voice Changer] update configuration: silentThreshold 0 [Voice Changer] update configuration: silentThreshold 0 [Voice Changer] update configuration: silentThreshold 0 [Voice Changer] update configuration: silentThreshold 0 [Voice Changer] update configuration: silentThreshold 0 [Voice Changer] update configuration: silentThreshold 0 [Voice Changer] update configuration: silentThreshold 0 [Voice Changer] update configuration: silentThreshold 0 [Voice Changer] update configuration: silentThreshold 0 [Voice Changer] update configuration: silentThreshold 0 [Voice Changer] update configuration: silentThreshold 0 [Voice Changer] update configuration: silentThreshold 0 [Voice Changer] update configuration: passThrough true [Voice Changer] update configuration: passThrough false SlotInfo::: RVCModelSlot(slotIndex=4, voiceChangerType='RVC', name='Fernanfloo', description='', credit='', termsOfUseUrl='', iconFile='', speakers={'0': 'target'}, modelFile='Fernanfloo.pth', indexFile='added_IVF315_Flat_nprobe_1_Fernanfloo_v2.index', defaultTune=12, defaultIndexRatio=0, defaultProtect=0.5, isONNX=False, modelType='pyTorchRVCv2', samplingRate=40000, f0=True, embChannels=768, embOutputLayer=12, useFinalProj=False, deprecated=False, embedder='hubert_base', sampleId='', version='v2') SlotInfo::: RVCModelSlot(slotIndex=4, voiceChangerType='RVC', name='Fernanfloo', description='', credit='', termsOfUseUrl='', iconFile='', speakers={'0': 'target'}, modelFile='Fernanfloo.pth', indexFile='added_IVF315_Flat_nprobe_1_Fernanfloo_v2.index', defaultTune=12, defaultIndexRatio=0, defaultProtect=0.5, isONNX=False, modelType='pyTorchRVCv2', samplingRate=40000, f0=True, embChannels=768, embOutputLayer=12, useFinalProj=False, deprecated=False, embedder='hubert_base', sampleId='', version='v2') SlotInfo::: RVCModelSlot(slotIndex=4, voiceChangerType='RVC', name='Fernanfloo', description='', 
credit='', termsOfUseUrl='', iconFile='', speakers={'0': 'target'}, modelFile='Fernanfloo.pth', indexFile='added_IVF315_Flat_nprobe_1_Fernanfloo_v2.index', defaultTune=12, defaultIndexRatio=0, defaultProtect=0.5, isONNX=False, modelType='pyTorchRVCv2', samplingRate=40000, f0=True, embChannels=768, embOutputLayer=12, useFinalProj=False, deprecated=False, embedder='hubert_base', sampleId='', version='v2') SlotInfo::: RVCModelSlot(slotIndex=4, voiceChangerType='RVC', name='Fernanfloo', description='', credit='', termsOfUseUrl='', iconFile='', speakers={}, modelFile='Fernanfloo.pth', indexFile='added_IVF315_Flat_nprobe_1_Fernanfloo_v2.index', defaultTune=12, defaultIndexRatio=0, defaultProtect=0.5, isONNX=False, modelType='pyTorchRVCv2', samplingRate=40000, f0=True, embChannels=768, embOutputLayer=12, useFinalProj=False, deprecated=False, embedder='hubert_base', sampleId='', version='v2') SlotInfo::: RVCModelSlot(slotIndex=4, voiceChangerType='RVC', name='Fernanfloo', description='', credit='', termsOfUseUrl='', iconFile='', speakers={'0': ''}, modelFile='Fernanfloo.pth', indexFile='added_IVF315_Flat_nprobe_1_Fernanfloo_v2.index', defaultTune=12, defaultIndexRatio=0, defaultProtect=0.5, isONNX=False, modelType='pyTorchRVCv2', samplingRate=40000, f0=True, embChannels=768, embOutputLayer=12, useFinalProj=False, deprecated=False, embedder='hubert_base', sampleId='', version='v2') SlotInfo::: RVCModelSlot(slotIndex=4, voiceChangerType='RVC', name='Fernanfloo', description='', credit='', termsOfUseUrl='', iconFile='', speakers={}, modelFile='Fernanfloo.pth', indexFile='added_IVF315_Flat_nprobe_1_Fernanfloo_v2.index', defaultTune=12, defaultIndexRatio=0, defaultProtect=0.5, isONNX=False, modelType='pyTorchRVCv2', samplingRate=40000, f0=True, embChannels=768, embOutputLayer=12, useFinalProj=False, deprecated=False, embedder='hubert_base', sampleId='', version='v2') [Voice Changer] update configuration: modelSlotIndex 1704954180000 [Voice Changer] generate new embedder. 
(no embedder) [Voice Changer] use torch contentvec [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from pretrain/content_vec_500.onnx failed:Protobuf parsing failed. [Voice Changer] exception! loading embedder PytorchStreamReader failed reading zip archive: failed finding central directory cuda:0 Traceback (most recent call last): File "voice_changer\RVC\embedder\EmbedderManager.py", line 47, in loadEmbedder File "voice_changer\RVC\embedder\OnnxContentvec.py", line 17, in loadModel File "onnxruntime\capi\onnxruntime_inference_collection.py", line 347, in init File "onnxruntime\capi\onnxruntime_inference_collection.py", line 384, in _create_inference_session onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from pretrain/content_vec_500.onnx failed:Protobuf parsing failed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "voice_changer\RVC\pipeline\PipelineGenerator.py", line 30, in createPipeline File "voice_changer\RVC\embedder\EmbedderManager.py", line 26, in getEmbedder File "voice_changer\RVC\embedder\EmbedderManager.py", line 51, in loadEmbedder File "voice_changer\RVC\embedder\FairseqHubert.py", line 11, in loadModel File "fairseq\checkpoint_utils.py", line 425, in load_model_ensemble_and_task state = load_checkpoint_to_cpu(filename, arg_overrides) File "fairseq\checkpoint_utils.py", line 315, in load_checkpoint_to_cpu state = torch.load(f, map_location=torch.device("cpu")) File "torch\serialization.py", line 797, in load with _open_zipfile_reader(opened_file) as opened_zipfile: File "torch\serialization.py", line 283, in init super().init(torch._C.PyTorchFileReader(name_or_buffer)) RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory [Voice Changer] update configuration: passThrough true [Voice Changer] update configuration: enableServerAudio 1 [Voice Changer] update configuration: serverOutputDeviceId 11 [Voice Changer] update configuration: serverAudioStated 1 Devices: [Input]: ServerAudioDevice(kind='audioinput', index=1, name='Microphone (Razer Seiren Mini)', hostAPI='MME', maxInputChannels=2, maxOutputChannels=0, default_samplerate=44100.0, available_samplerates=[]) None [Output]: ServerAudioDevice(kind='audiooutput', index=11, name='CABLE Input (VB-Audio Virtual C', hostAPI='MME', maxInputChannels=0, maxOutputChannels=2, default_samplerate=44100.0, available_samplerates=[]) None [Monitor]: None None Sample Rate:

[Input]: 44100 -> True [Output]: 44100 -> True [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 [Voice Changer] update configuration: serverMonitorDeviceId 8 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 serverMonitorDeviceId Changed: -1 -> 8 [Voice Changer] update configuration: passThrough false Devices: [Input]: ServerAudioDevice(kind='audioinput', index=1, name='Microphone (Razer Seiren Mini)', hostAPI='MME', maxInputChannels=2, maxOutputChannels=0, default_samplerate=44100.0, available_samplerates=[]) None [Output]: ServerAudioDevice(kind='audiooutput', index=11, name='CABLE Input (VB-Audio Virtual C', hostAPI='MME', maxInputChannels=0, maxOutputChannels=2, default_samplerate=44100.0, available_samplerates=[]) None [Monitor]: ServerAudioDevice(kind='audiooutput', index=8, name='Auriculares (2- WH-XB910N)', hostAPI='MME', maxInputChannels=0, maxOutputChannels=2, default_samplerate=44100.0, available_samplerates=[]) None [Voice Changer] update configuration: passThrough true Sample Rate:

[Input]: 44100 -> True [Output]: 44100 -> True [Monitor]: 44100 -> True [Voice Changer] update configuration: passThrough false [Voice Changer] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer][ServerDevice][audioOutput_callback] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer] server audio performance [0, 0, 0, 0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer][ServerDevice][audioOutput_callback] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer] server audio performance [0, 0, 0, 0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer][ServerDevice][audioOutput_callback] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer] server audio performance [0, 0, 0, 0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer][ServerDevice][audioOutput_callback] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer] server audio performance [0, 0, 0, 0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer][ServerDevice][audioOutput_callback] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) 
[Voice Changer] server audio performance [0, 0, 0, 0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] update configuration: passThrough true [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] update configuration: passThrough false [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer][ServerDevice][audioOutput_callback] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer] server audio performance [0, 0, 0, 0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer][ServerDevice][audioOutput_callback] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer] server audio performance [0, 0, 0, 0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 
monitor: id:8, sr:44100, ch:2 [Voice Changer] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer][ServerDevice][audioOutput_callback] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer] server audio performance [0, 0, 0, 0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer][ServerDevice][audioOutput_callback] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer] server audio performance [0, 0, 0, 0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer][ServerDevice][audioOutput_callback] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer][ServerDevice][audioOutput_callback] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer] server audio performance [0, 0, 0, 0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] update configuration: passThrough true [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] update configuration: passThrough false [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] ex: could not broadcast 
input array from shape (1024,2) into shape (82790,2) [Voice Changer][ServerDevice][audioOutput_callback] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer] server audio performance [0, 0, 0, 0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer][ServerDevice][audioOutput_callback] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer] server audio performance [0, 0, 0, 0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer][ServerDevice][audioOutput_callback] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer] server audio performance [0, 0, 0, 0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer][ServerDevice][audioOutput_callback] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer] server audio performance [0, 0, 0, 0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer][ServerDevice][audioOutput_callback] ex: could not broadcast input array from shape (1024,2) into shape (82790,2) [Voice Changer] update configuration: passThrough true [Voice Changer] server audio performance [0, 0, 0, 0] status: started:1, model_sr:40000, chunk:704 input : id:1, 
sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] update configuration: serverAudioStated 0 [Voice Changer] server audio performance [0] status: started:0, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 serverAudioStarted Changed: 0 [Voice Changer] update configuration: serverAudioStated 1 Devices: [Input]: ServerAudioDevice(kind='audioinput', index=1, name='Microphone (Razer Seiren Mini)', hostAPI='MME', maxInputChannels=2, maxOutputChannels=0, default_samplerate=44100.0, available_samplerates=[]) None [Output]: ServerAudioDevice(kind='audiooutput', index=11, name='CABLE Input (VB-Audio Virtual C', hostAPI='MME', maxInputChannels=0, maxOutputChannels=2, default_samplerate=44100.0, available_samplerates=[]) None [Monitor]: ServerAudioDevice(kind='audiooutput', index=8, name='Auriculares (2- WH-XB910N)', hostAPI='MME', maxInputChannels=0, maxOutputChannels=2, default_samplerate=44100.0, available_samplerates=[]) None Sample Rate:

[Input]: 44100 -> True [Output]: 44100 -> True [Monitor]: 44100 -> True [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] update configuration: serverOutputDeviceId 8 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:8, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 serverOutputDeviceId Changed: 11 -> 8 Devices: [Input]: 
ServerAudioDevice(kind='audioinput', index=1, name='Microphone (Razer Seiren Mini)', hostAPI='MME', maxInputChannels=2, maxOutputChannels=0, default_samplerate=44100.0, available_samplerates=[]) None [Output]: ServerAudioDevice(kind='audiooutput', index=8, name='Auriculares (2- WH-XB910N)', hostAPI='MME', maxInputChannels=0, maxOutputChannels=2, default_samplerate=44100.0, available_samplerates=[]) None [Monitor]: ServerAudioDevice(kind='audiooutput', index=8, name='Auriculares (2- WH-XB910N)', hostAPI='MME', maxInputChannels=0, maxOutputChannels=2, default_samplerate=44100.0, available_samplerates=[]) None Sample Rate:

[Input]: 44100 -> True [Output]: 44100 -> True [Monitor]: 44100 -> True [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:8, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:8, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:8, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:8, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 [Voice Changer] update configuration: serverOutputDeviceId 11 [Voice Changer] server audio performance [0] status: started:1, model_sr:40000, chunk:704 input : id:1, sr:44100, ch:2 output : id:11, sr:44100, ch:2 monitor: id:8, sr:44100, ch:2 serverOutputDeviceId Changed: 8 -> 11 Devices: [Input]: ServerAudioDevice(kind='audioinput', index=1, name='Microphone (Razer Seiren Mini)', hostAPI='MME', maxInputChannels=2, maxOutputChannels=0, default_samplerate=44100.0, available_samplerates=[]) None [Output]: ServerAudioDevice(kind='audiooutput', index=11, name='CABLE Input (VB-Audio Virtual C', hostAPI='MME', maxInputChannels=0, maxOutputChannels=2, default_samplerate=44100.0, available_samplerates=[]) None [Monitor]: ServerAudioDevice(kind='audiooutput', index=8, name='Auriculares (2- WH-XB910N)', hostAPI='MME', maxInputChannels=0, maxOutputChannels=2, default_samplerate=44100.0, available_samplerates=[]) None [Voice Changer] update configuration: serverAudioStated 0 [Voice Changer] update configuration: enableServerAudio 0 Sample Rate:

[Input]: 44100 -> True
[Output]: 44100 -> True
[Monitor]: 44100 -> True
serverAudioStarted Changed: 0
[Voice Changer] update configuration: passThrough false
[Voice Changer] update configuration: serverReadChunkSize 2
[Voice Changer] update configuration: serverReadChunkSize 448
[Voice Changer] update configuration: serverReadChunkSize 64
[Voice Changer] update configuration: passThrough true
[Voice Changer] update configuration: serverReadChunkSize 320
[Voice Changer] update configuration: passThrough false
[Voice Changer] update configuration: modelSlotIndex 1704955094004
gin_channels: 256 self.spk_embed_dim: 109
[Voice Changer] generate new embedder. (no embedder)
[Voice Changer] use torch contentvec
[ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from pretrain/content_vec_500.onnx failed:Protobuf parsing failed.
[Voice Changer] exception! loading embedder PytorchStreamReader failed reading zip archive: failed finding central directory cuda:0
Traceback (most recent call last):
  File "voice_changer\RVC\embedder\EmbedderManager.py", line 47, in loadEmbedder
  File "voice_changer\RVC\embedder\OnnxContentvec.py", line 17, in loadModel
  File "onnxruntime\capi\onnxruntime_inference_collection.py", line 347, in __init__
  File "onnxruntime\capi\onnxruntime_inference_collection.py", line 384, in _create_inference_session
onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from pretrain/content_vec_500.onnx failed:Protobuf parsing failed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "voice_changer\RVC\pipeline\PipelineGenerator.py", line 30, in createPipeline
  File "voice_changer\RVC\embedder\EmbedderManager.py", line 26, in getEmbedder
  File "voice_changer\RVC\embedder\EmbedderManager.py", line 51, in loadEmbedder
  File "voice_changer\RVC\embedder\FairseqHubert.py", line 11, in loadModel
  File "fairseq\checkpoint_utils.py", line 425, in load_model_ensemble_and_task
    state = load_checkpoint_to_cpu(filename, arg_overrides)
  File "fairseq\checkpoint_utils.py", line 315, in load_checkpoint_to_cpu
    state = torch.load(f, map_location=torch.device("cpu"))
  File "torch\serialization.py", line 797, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "torch\serialization.py", line 283, in __init__
    super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
[Voice Changer] update configuration: modelSlotIndex 1704955098001
[Voice Changer] generate new embedder. (no embedder)
[Voice Changer] use torch contentvec
[ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from pretrain/content_vec_500.onnx failed:Protobuf parsing failed.
[Voice Changer] exception! loading embedder PytorchStreamReader failed reading zip archive: failed finding central directory cuda:0
Traceback (most recent call last):
  File "voice_changer\RVC\embedder\EmbedderManager.py", line 47, in loadEmbedder
  File "voice_changer\RVC\embedder\OnnxContentvec.py", line 17, in loadModel
  File "onnxruntime\capi\onnxruntime_inference_collection.py", line 347, in __init__
  File "onnxruntime\capi\onnxruntime_inference_collection.py", line 384, in _create_inference_session
onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from pretrain/content_vec_500.onnx failed:Protobuf parsing failed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "voice_changer\RVC\pipeline\PipelineGenerator.py", line 30, in createPipeline
  File "voice_changer\RVC\embedder\EmbedderManager.py", line 26, in getEmbedder
  File "voice_changer\RVC\embedder\EmbedderManager.py", line 51, in loadEmbedder
  File "voice_changer\RVC\embedder\FairseqHubert.py", line 11, in loadModel
  File "fairseq\checkpoint_utils.py", line 425, in load_model_ensemble_and_task
    state = load_checkpoint_to_cpu(filename, arg_overrides)
  File "fairseq\checkpoint_utils.py", line 315, in load_checkpoint_to_cpu
    state = torch.load(f, map_location=torch.device("cpu"))
  File "torch\serialization.py", line 797, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "torch\serialization.py", line 283, in __init__
    super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
[Voice Changer] update configuration: passThrough true
[Voice Changer] update configuration: passThrough false
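
A note for readers hitting the same log: both embedder loads fail before any conversion runs. INVALID_PROTOBUF from ONNX Runtime and "failed finding central directory" from PyTorch are errors commonly seen when a model file is incomplete or is not the expected binary, so a damaged download is one plausible cause here (nothing in this thread confirms it). Below is a minimal sketch of a file check, assuming it is run from the application folder. The helper name describe_model_file is hypothetical, and only pretrain/content_vec_500.onnx is a path taken from the log; any other file, such as the hubert_base checkpoint, has to be passed on the command line.

```python
import sys
import zipfile
from pathlib import Path


def describe_model_file(path: Path) -> str:
    """Return a short verdict about one model file (hypothetical helper)."""
    if not path.is_file():
        return "missing"
    size = path.stat().st_size
    if size == 0:
        return "empty (0 bytes)"
    with path.open("rb") as f:
        head = f.read(256)
    # A failed download sometimes leaves a saved web page in place of the
    # binary model.
    text_start = head.lstrip().lower()
    if text_start.startswith(b"<!doctype") or text_start.startswith(b"<html"):
        return f"looks like an HTML page, not a model ({size} bytes)"
    # A file that begins with the zip magic bytes but has no end-of-central-
    # directory record is most likely truncated; zipfile.is_zipfile() looks
    # for that record.
    if head.startswith(b"PK\x03\x04") and not zipfile.is_zipfile(path):
        return f"truncated zip archive ({size} bytes)"
    return f"no obvious damage ({size} bytes)"


if __name__ == "__main__":
    # Only this default path appears in the log; pass other files as arguments.
    targets = sys.argv[1:] or ["pretrain/content_vec_500.onnx"]
    for name in targets:
        print(f"{name}: {describe_model_file(Path(name))}")
```

A "no obvious damage" verdict does not prove a file is valid, since the check only looks at the size, the first bytes, and the zip trailer. Comparing file sizes against the published downloads would be a stronger test.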

ghost commented 8 months ago

I have the same issue, and I filed a report too. We are on the same system (Windows 11).

ghost commented 8 months ago

Same issue on a MacBook Pro M1 running macOS Ventura.

ghost commented 8 months ago
image

Using these settings worked for me.

Source: https://github.com/w-okada/voice-changer/issues/591#issuecomment-1762194075