When performing real-time conversion with an RVC model, if the source voice is too low-pitched, the converted audio comes out high-pitched.
(Confirmed with the sample Tsukuyomi-chan.)
One way to suppress this would be to adjust the pitch of the source audio in real time by some means and feed the result to VC Client as input. Alternatively, could it be suppressed by changing f0_min in, for example, "RMVPOnnxEPitchExtractor.py" from its current value of 50 to a smaller value?
If so, would it be difficult to implement a mechanism that lets this value be changed from the GUI or similar?
Issue Type
Feature Request, Question
vc client version number
MMVCServerSIO_win_onnxgpu-cuda_v.1.5.3.13
OS
windows11
GPU
RTX4090
Clear setting
no
Sample model
no
Input chunk num
yes
Wait for a while
The GUI successfully launched.
read tutorial
yes
Extract files to a new folder.
no
Voice Changer type
RVC
Model type
pyTorch RVC f0
Situation
When performing real-time conversion with an RVC model, if the source voice is too low-pitched, the converted audio comes out high-pitched. (Confirmed with the sample Tsukuyomi-chan.)
One way to suppress this would be to adjust the pitch of the source audio in real time by some means and feed the result to VC Client as input. Alternatively, could it be suppressed by changing f0_min in, for example, "RMVPOnnxEPitchExtractor.py" from its current value of 50 to a smaller value? If so, would it be difficult to implement a mechanism that lets this value be changed from the GUI or similar?
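To make the f0_min question concrete, here is a minimal sketch of how a lower bound like f0_min can affect very low voices. It assumes the mel-scale coarse quantization commonly seen in RVC code (f0 mapped to bins 1..255 between f0_min and f0_max); I have not checked that "RMVPOnnxEPitchExtractor.py" does exactly this. The function names and the f0_max default of 1100 are my own assumptions for illustration.

```python
import math


def hz_to_mel(f0_hz: float) -> float:
    """Convert a frequency in Hz to the mel scale."""
    return 1127.0 * math.log(1.0 + f0_hz / 700.0)


def coarse_f0_bin(f0_hz: float, f0_min: float = 50.0, f0_max: float = 1100.0) -> int:
    """Map an f0 value in Hz to a coarse pitch bin in 1..255.

    Assumption: unvoiced frames (f0 <= 0) and anything at or below f0_min
    collapse into bin 1. This is a hypothetical helper, not VC Client code.
    """
    if f0_hz <= 0:
        return 1
    mel_min = hz_to_mel(f0_min)
    mel_max = hz_to_mel(f0_max)
    # Scale linearly in mel space so that f0_min -> 1 and f0_max -> 255.
    scaled = (hz_to_mel(f0_hz) - mel_min) * 254.0 / (mel_max - mel_min) + 1.0
    # Clamp out-of-range values into the valid bin range before rounding.
    return int(round(min(max(scaled, 1.0), 255.0)))


if __name__ == "__main__":
    for f0 in (40.0, 50.0, 60.0):
        print(f0, coarse_f0_bin(f0, f0_min=50.0), coarse_f0_bin(f0, f0_min=30.0))
```

Under this assumed mapping, 40 Hz and 50 Hz fall into the same bin when f0_min is 50, but into different bins when f0_min is 30. Whether that clamping is what actually makes the converted voice high-pitched is not established here; the pitch extractor itself may also mis-detect very low voices.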
application window capture
No response
logs on terminal
none