w-okada / voice-changer

リアルタイムボイスチェンジャー Realtime Voice Changer

[ISSUE for v2]: "pth" file cannot be read. Audio conversion is not performed. #1274

Closed Rakkyotan closed 1 month ago

Rakkyotan commented 1 month ago

Voice Changer Version

vcclient_win_cuda_2.0.40-alpha.zip

Operating System

Windows 11

GPU

NVIDIA GeForce RTX 4070 Ti

CUDA Version

12.5

Read carefully and check the options

Does pre-installed model work?

No

Model Type

MMVCServerSIO_win_onnxgpu-cuda_v.1.5.3.18a.zip

Issue Description

I use this tool regularly. Starting around "vcclient_win_cuda_2.0.27-alpha_1", loading a Ver1 "pth" file ("pyTorchRVC,40000,hubert_base_l9fp") causes an error to appear in "Voice Changer Info". No error occurs with Ver2 files.

"Voice Changer Info":

{
  "detail": {
    "errors": "'RVCInferencerv1F0' object has no attribute 'file'",
    "errors2": "422: {'errors': \"'RVCInferencerv1F0' object has no attribute 'file'\", 'body': ''}"
  }
}
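The payload above nests the same message twice: once as "errors", and once inside "errors2" as an HTTP status followed by a Python-style dict. A minimal sketch of pulling it apart for triage follows. The function name `summarize_error` is hypothetical, and the assumption that "errors2" always has the form `<status>: <dict literal>` is taken only from this one sample, not from the application's documentation.

```python
import ast
import json

# Payload copied verbatim from the "Voice Changer Info" panel in this report.
payload = r"""{ "detail": { "errors": "'RVCInferencerv1F0' object has no attribute 'file'", "errors2": "422: {'errors': \"'RVCInferencerv1F0' object has no attribute 'file'\", 'body': ''}" } }"""


def summarize_error(raw: str) -> dict:
    """Extract the status code and message from the error payload.

    Assumes "errors2" looks like "<status>: <python dict literal>",
    which is inferred from this single sample.
    """
    detail = json.loads(raw)["detail"]
    # Split "422: {...}" at the first ": " into the status and the dict text.
    status_text, _, inner_text = detail["errors2"].partition(": ")
    # The inner part uses Python quoting, so parse it as a literal, not JSON.
    inner = ast.literal_eval(inner_text)
    return {
        "status": int(status_text),
        "message": detail["errors"],
        "same_inner_message": inner["errors"] == detail["errors"],
    }


if __name__ == "__main__":
    print(summarize_error(payload))
```

The 422 status suggests the server rejected the request while handling the Ver1 model, which fits the AttributeError being raised on the `RVCInferencerv1F0` object.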

Despite this error, "vcclient_win_cuda_2.0.27-alpha_1" still converts audio with both Ver1 and Ver2 files.


With "vcclient_mac_2.0.40-alpha", which I downloaded the other day, audio conversion with "pth" files no longer works for either Ver1 or Ver2.

"Log Viewer" (Logs on console)

Audio conversion with "onnx" files works without any problems. I would appreciate your feedback. Thank you. (If Japanese is okay, I will post in Japanese from now on.)

Application Screenshot

No response

Logs on console

2024-07-15 16:49:45,218 - uvicorn.ac - h11_impl - INFO - 127.0.0.1:52851 - "GET /api/voice-changer-manager/information HTTP/1.1" 200 - uvicorn\protocols\http\h11_impl.py - 477

2024-07-15 16:49:45,592 - vcclient - voice_changer - WARNING - data type resampled is short. padded.:(7857,), shape:(8000,) - vcclient_dev\voice_changer\voice_change_manager\voice_changer.py - 484

2024-07-15 16:49:45,593 - vcclient - rvc_pipeline - INFO - noise gate 2.0722931509954373e-06 < -40.0 - vcclient_dev\voice_changer\voice_change_manager\vc_pipelines\rvc_pipeline.py - 175

2024-07-15 16:49:46,219 - vcclient - voice_changer_manage - INFO - Getting voice changer manager information. - vcclient_dev\voice_changer\voice_change_manager\voice_changer_manager.py - 22

2024-07-15 16:49:46,219 - vcclient - vcserver_rest_api_vo - INFO - get_voice_changer_information local_voice_changer_interface_active=False voice_changer_information=VoiceChangerInformation(slot_index=8, pitch_estimator_type='fcpe', gpu_device_index=0, input_sample_rate=48000, output_sample_rate=48000, monitor_sample_rate=48000, vc_input_sample_rate=16000, vc_output_sample_rate=40000, resample_ratio_in=0.3333333333333333, resample_ratio_out=1.2, resample_ratio_monitor=1.2, resample_ratio_pass_through_in_out=1.0, resample_ratio_pass_through_in_monitor=1.0, enable_high_pass_filter=False, high_pass_filter_cutoff=100.0, enable_low_pass_filter=False, low_pass_filter_cutoff=10000.0, chunk_sec=0.5, pipeline_info=RVCPipelineInfo(slot_index=8, input_sample_rate=16000, output_sample_rate=40000, chunk_sec=0.5, slot_info={'slot_index': 8, 'voice_changer_type': 'RVC', 'name': 'mine_kira', 'description': '', 'credit': '', 'terms_of_use_url': '', 'icon_file': None, 'speakers': {}, 'model_file': WindowsPath('mine_kira.pth'), 'index_file': None, 'is_onnx': False, 'inferencer_type': 'pyTorchRVCv2', 'sample_rate': 40000, 'is_f0': True, 'deprecated': False, 'embedder': 'hubert_base_l12', 'pitch_estimator': 'fcpe', 'sample_id': None, 'version': 'v2', 'chunk_sec': 0.5, 'pitch_shift': 14, 'index_ratio': 0.0, 'protect_ratio': 0.5}, embedder_info=EmbedderInfo(embedder_type='contentvec', model_file=WindowsPath('modules/contentvec/contentvec-f.onnx'), device_id=0, candidate_onnx_providers=['CUDAExecutionProvider'], candidate_onnx_provider_options="[{'device_id': 0}]", onnx_providers=['CUDAExecutionProvider', 'CPUExecutionProvider'], onnx_provider_options="{'CUDAExecutionProvider': {'cudnn_conv_algo_search': 'EXHAUSTIVE', 'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0'}, 'CPUExecutionProvider': {}}"), pitch_estimator_info=PitchEstimatorInfo(pitch_estimator_type='fcpe', model_file=None, device_id=0, candidate_onnx_providers=None, candidate_onnx_provider_options=None, onnx_providers=None, onnx_provider_options=None), inferencer_info=RVCInferencerInfo(inferencer_type='pyTorchRVCv2', model_file=WindowsPath('model_dir/8/mine_kira.pth'), device_id=0, candidate_onnx_providers=None, candidate_onnx_provider_options=None, onnx_providers=None, onnx_provider_options=None)), voice_changer_type='RVC', bulk_process_start_flag=True, recording_start_flag=False, monitor_enabled=False) - vcclient_dev\server\vcserver_rest_api_voice_changaer.py - 107

2024-07-15 16:49:46,219 - uvicorn.ac - h11_impl - INFO - 127.0.0.1:52851 - "GET /api/voice-changer-manager/information HTTP/1.1" 200 - uvicorn\protocols\http\h11_impl.py - 477

2024-07-15 16:49:46,702 - vcclient - voice_changer - WARNING - Failed to convert_chunk_bulk_internal:can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first. - vcclient_dev\voice_changer\voice_change_manager\voice_changer.py - 361

2024-07-15 16:49:46,704 - vcclient - voice_changer - WARNING - Failed to convert_chunk_bulk_internal:Traceback (most recent call last):
  File "vcclient_dev\voice_changer\voice_change_manager\voice_changer.py", line 351, in convert_chunk_bulk_internal
  File "vcclient_dev\voice_changer\voice_change_manager\voice_changer.py", line 378, in convert_chunk
  File "vcclient_dev\voice_changer\voice_change_manager\voice_changer.py", line 485, in _convert_chunk
  File "vcclient_dev\voice_changer\voice_change_manager\vc_pipelines\rvc_pipeline.py", line 255, in run
  File "torch\_tensor.py", line 1087, in __array__
    return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
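
The final TypeError in the log is PyTorch's standard message when a tensor that lives on the GPU is handed to NumPy directly. The usual remedy, as the message itself says, is to copy the tensor to host memory before converting. The sketch below illustrates that pattern only; it is not the project's actual code or fix, which does not appear in this thread. So that it runs without PyTorch or a GPU, it uses a hypothetical stand-in class `FakeTensor` that mimics just the behaviour involved. With a real `torch.Tensor`, the body of the hypothetical helper `to_host_array` would be the same chain of calls.

```python
class FakeTensor:
    """Hypothetical stand-in for the small part of torch.Tensor used here."""

    def __init__(self, values, device="cpu"):
        self.values = list(values)
        self.device = device

    def detach(self):
        # A real tensor returns a view detached from the autograd graph.
        return FakeTensor(self.values, self.device)

    def cpu(self):
        # A real tensor copies its data to host memory if it is on a GPU.
        return FakeTensor(self.values, "cpu")

    def numpy(self):
        # Mirrors the failure in the log: only host tensors can be converted.
        if self.device != "cpu":
            raise TypeError(
                f"can't convert {self.device} device type tensor to numpy. "
                "Use Tensor.cpu() to copy the tensor to host memory first."
            )
        return list(self.values)  # a real tensor returns a numpy.ndarray


def to_host_array(tensor):
    """Convert a tensor to an array regardless of which device holds it."""
    return tensor.detach().cpu().numpy()


if __name__ == "__main__":
    gpu_audio = FakeTensor([0.1, 0.2, 0.3], device="cuda:0")
    try:
        gpu_audio.numpy()  # reproduces the error from the log
    except TypeError as exc:
        print("direct conversion failed:", exc)
    print("via host copy:", to_host_array(gpu_audio))
```

Because the "onnx" path in this report works while the "pth" path fails, the missing host copy is presumably somewhere in the PyTorch inferencer path, though the log alone does not show where.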

Kuuko-fokkusugaru commented 1 month ago

I have the same issue, but the included ONNX models work (though poorly, with a robotic sound). I have attached my log file: logs.txt

Rakkyotan commented 1 month ago

Thank you for posting. The included "onnx" models also worked for me, as did a file converted from "pth" to "onnx".

w-okada commented 1 month ago

I cannot reproduce this. Please use v.2.0.44 alpha, which includes quality improvements.

Mikey-Mikey commented 1 month ago

Yeah, I'm able to use .pth models in 2.0.44 alpha. I think it might be fixed in this version.

Rakkyotan commented 1 month ago

Thank you very much. Due to various circumstances, I cannot confirm it right now. I'll try it on Monday.

w-okada commented 1 month ago

Please try 2.0.45 alpha.

w-okada commented 1 month ago

No response, so closing.