This is a fork of w-okada voice changer that performs real-time voice conversion using various voice conversion algorithms.
[!IMPORTANT] This version works only with Retrieval-based Voice Conversion (RVC).
The fork aims to improve overall performance on every backend while introducing new features and improving the user experience.
The following videos demonstrate how the voice changer works and performs with AMD graphics cards (including integrated GPU!):
And this one demonstrates how the voice changer works and performs on a laptop with an Nvidia GeForce GTX 1650:
[!IMPORTANT] Minimum requirement means that you will be able to run ONLY the voice changer. In most cases, running voice conversion and gaming at the same time will not provide a satisfying experience on minimum-requirement hardware.
RAM: at least 6GB.
Disk space: at least 6GB of free disk space. For fast model loading, SSD is recommended.
Minimum requirement: Intel Core i5-4690K or AMD FX-6300.
Recommended requirement: Intel Core i5-10400F or AMD Ryzen 5 1600X.
Minimum VRAM required: 2GB (in FP32 mode), ~1GB (in FP16 mode, if supported).
Minimum requirement:
[!NOTE] It is also possible to use Nvidia GeForce GTX 700 series GPUs. However, they can be used only with the DirectML version.
[!WARNING] The voice changer does not perform well with integrated Intel GPUs. This is a known issue that may be addressed in the future. You may proceed at your own risk and report issues or successful usage.
Recommended requirement:
A dedicated graphics card Nvidia GeForce RTX 20 Series or later, or AMD Radeon RX 6000 series or later, or Intel Arc A500 series or later.
When changing Chunk, Extra or Crossfade size settings, you must switch device to CPU then back to your GPU. Otherwise, performance issues can be observed.
Only rmvpe_onnx, fcpe_onnx, crepe_tiny_onnx and crepe_full_onnx are available in the F0 Det. list.
When using a laptop with both an integrated GPU and a dedicated GPU, severely degraded performance (up to 50% reduction) can be observed when running the voice changer on the built-in display.
Slightly degraded performance (up to 25% reduction) can be observed with multi-GPU setups.
AMD Radeon RX 7000 series may be unable to achieve low latency (below 256ms).
rest protocol.
[If not installed] Download and install VAC Lite by Muzychenko.
Navigate to the releases section.
Open Task Manager > Performance.
Click CPU, check and note the processor model on the right. An example: AMD Ryzen 7 5800H with Radeon Graphics.
Check and note graphics card models under GPU. An example:
GPU 0: AMD Radeon RX 6600M.
GPU 1: AMD Radeon(TM) Graphics.
[!TIP] For AMD users, the recommended driver version is 24.6.1 or later.
Download the voice-changer-windows-amd64-dml.zip ZIP file.
Right-click the ZIP file. In the opened action menu select 7-Zip > Extract to "voice-changer-windows-amd64-dml\".
Make sure your Nvidia driver version is 528.33 or later. Click here to learn how to check your driver version.
Download the voice-changer-windows-amd64-cuda.zip.001 and voice-changer-windows-amd64-cuda.zip.002 ZIP files and place them in the same folder.
Right-click the voice-changer-windows-amd64-cuda.zip.001 ZIP file. In the opened action menu select 7-Zip > Extract to "voice-changer-windows-amd64-cuda\". This unpacks both files; there is no need to unpack them separately.
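A single extract step handles both files because numbered parts such as .001 and .002 are consecutive pieces of one archive. The sketch below illustrates that idea under the assumption that the parts are plain byte-wise splits (as 7-Zip volumes typically are); `join_parts` is a hypothetical helper, not part of the voice changer, and 7-Zip remains the documented way to unpack.

```python
import shutil
from pathlib import Path


def join_parts(first_part: Path, output: Path) -> list[Path]:
    """Concatenate name.001, name.002, ... into a single file.

    Assumption: the parts are raw, consecutive byte ranges of one archive.
    """
    base_name = first_part.with_suffix("").name  # drops the trailing ".001"
    pattern = base_name + ".[0-9][0-9][0-9]"
    # Sorting by name restores the original order (.001, .002, ...).
    parts = sorted(first_part.parent.glob(pattern))
    with output.open("wb") as out:
        for part in parts:
            with part.open("rb") as src:
                shutil.copyfileobj(src, out)
    return parts
```

The joined file can then be opened with any ZIP tool. If a part is missing from the folder, the result is incomplete, which is why both files must sit side by side before extraction.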
The following examples demonstrate the unpacking process:
Open the extracted folder (voice-changer-windows-amd64-dml or voice-changer-windows-amd64-cuda) > MMVCServerSIO.
Run MMVCServerSIO.exe.
When running the voice changer for the first time, it will start downloading necessary files. Do not close the window until the download finishes.
Once the download is finished, the voice changer will open the user interface using your default web browser.
[!IMPORTANT] macOS support is experimental.
Download the voice-changer-macos-arm64-cpu.tar.gz file.
Double-click the file. The voice changer will unpack and the MMVCServerSIO folder will appear.
[!NOTE] The voice changer works best if your Intel-based machine has AMD graphics. If your machine has only Intel integrated graphics, only the CPU will be utilized.
Download the voice-changer-macos-amd64-cpu.tar.gz file.
Double-click the file. The voice changer will unpack and the MMVCServerSIO folder will appear.
[!WARNING] Currently, this step is mandatory. Otherwise, the voice changer will fail to start with an error related to Python.framework being damaged. This may be improved in the future.
Open Terminal.
Run the following command:
xattr -dr com.apple.quarantine <Path to extracted MMVCServerSIO folder>
For example, if you extracted the voice changer to your desktop, the command may look as follows:
xattr -dr com.apple.quarantine ~/Desktop/MMVCServerSIO
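If the extracted folder lives under a path that contains spaces, the command above needs quoting. The following is a minimal sketch that builds a correctly quoted command line; `build_quarantine_command` is a hypothetical helper name, and the only real command involved is the `xattr` invocation shown above.

```python
import os
import shlex


def build_quarantine_command(folder: str) -> str:
    """Return a shell-safe xattr command for the given folder.

    The tilde is expanded first because a quoted "~" would no longer be
    expanded by the shell.
    """
    expanded = os.path.expanduser(folder)
    return shlex.join(["xattr", "-dr", "com.apple.quarantine", expanded])


if __name__ == "__main__":
    # Prints the command to paste into Terminal; it does not run it.
    print(build_quarantine_command("~/Desktop/MMVCServerSIO"))
```

Printing instead of executing keeps the sketch harmless on other systems; the output can be pasted into Terminal as-is.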
Open the extracted MMVCServerSIO folder.
Double-click MMVCServerSIO to run the voice changer.
Refer to corresponding Colab or Kaggle notebooks in this repository and follow their instructions.
[!TIP] When any issue with the voice changer occurs, check the command line window (the one that opens during the start) for errors.
Either the remote files have changed or your files were corrupted. The error will show which files are affected above the error:
[WeightDownloader] 'pretrain/content_vec_500.onnx failed to pass hash verification check. Got 1931e237626b80d65ae44cbacd4a5197, expected ab288ca5b540a4a15909a40edf875d1e'
[WeightDownloader] 'pretrain/rmvpe.onnx failed to pass hash verification check. Got 65030149d579a65f15aa7e85769c32f1, expected b6979bf69503f8ec48c135000028a7b0'
Find and delete the mentioned files from the voice changer folder and restart the voice changer. Deleted files will be re-downloaded.
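To confirm which files are affected before deleting anything, the digests can be recomputed locally. This is a rough sketch, assuming the 32-character values in the log are MD5 digests and that paths such as pretrain/rmvpe.onnx are relative to the voice changer folder; `md5_of_file` and `find_corrupted` are illustrative names, not functions of the voice changer.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, block_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it block by block."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def find_corrupted(root: Path, expected: dict[str, str]) -> list[str]:
    """Return relative paths that exist under root but fail the digest check."""
    bad = []
    for relative, want in expected.items():
        target = root / relative
        if target.is_file() and md5_of_file(target) != want:
            bad.append(relative)
    return bad


if __name__ == "__main__":
    # Expected digests copied from the example log above.
    expected = {
        "pretrain/content_vec_500.onnx": "ab288ca5b540a4a15909a40edf875d1e",
        "pretrain/rmvpe.onnx": "b6979bf69503f8ec48c135000028a7b0",
    }
    # Assumption: the script is run from the voice changer folder.
    print(find_corrupted(Path("."), expected))
```

Files that are missing are skipped rather than reported, since the voice changer downloads them again on the next start anyway.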
Make sure that you have given the permission to access the microphone.
If you are using Mozilla Firefox ESR, there may be an issue with audio devices. Use another web browser (preferably Chrome or a Chromium-based one).
Make sure you have selected correct input and output audio devices.
Make sure your input device is not muted. Check the microphone volume in the system settings or hardware switch on your headset (usually a button, if present).
In the voice changer, make sure passthru is not on (indicated by blinking red color). Click it to switch it off (indicated by solid green color).
Make sure you are using VAC by Muzychenko (indicated by the Line 1 audio device name).
In Windows Sound Control Panel, make sure that the sample rate of your microphone matches the sample rate of the virtual cable.
The following example shows the configuration of the virtual cable and the microphone:
If nothing helped, in Task Manager > Details, find the "audiodg.exe" process and do the following:
Right-click "audiodg.exe" > Set priority > High.
Right-click "audiodg.exe" > Set affinity. Uncheck every option, then only select CPU 2.
If you changed chunk when voice conversion was on, click Stop then Start again.
Make sure the perf time is smaller than Chunk. Increase Chunk or reduce Extra and Crossfade size.
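The rule above can be read as a simple budget: each chunk must be converted before the next one arrives. The sketch below expresses that check, assuming the perf time and Chunk values shown in the interface are both in milliseconds; `conversion_headroom` is a hypothetical helper for illustration only.

```python
def conversion_headroom(perf_ms: float, chunk_ms: float) -> float:
    """Return the fraction of the chunk duration left after processing.

    Positive: conversion finishes before the next chunk arrives.
    Zero or negative: conversion is likely unable to keep up in real time.
    """
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    return 1.0 - perf_ms / chunk_ms


if __name__ == "__main__":
    # Illustrative numbers, not measurements.
    print(conversion_headroom(perf_ms=50.0, chunk_ms=100.0))
```

When the result approaches zero, increasing Chunk enlarges the budget, while reducing Extra or Crossfade size lowers the work done per chunk.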
At the moment, the fork does not accept any code contributions. However, feel free to report any issues you encounter during usage.
[If not installed] Download and install Python 3.10.
[If not installed] Download and install git.
Open a command line.
Verify your Python version by running the following command:
python --version
Python 3.10.8
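The same check can be done from inside Python, which is handy when several interpreters are installed. A minimal sketch, assuming any 3.10.x patch release is acceptable (the example output above shows 3.10.8); `is_supported_python` is an illustrative name.

```python
import sys


def is_supported_python(version_info=sys.version_info) -> bool:
    """Return True when the major and minor version are exactly 3.10."""
    return tuple(version_info[:2]) == (3, 10)


if __name__ == "__main__":
    status = "OK" if is_supported_python() else "unexpected version"
    print(f"Python {sys.version.split()[0]}: {status}")
```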
Clone the repository.
Navigate to the server folder.
[If not set up] Set up virtual environment with the following command:
python -m venv venv
Activate virtual environment using one of the following commands:
For Windows:
.\venv\Scripts\activate.ps1
For Linux/macOS:
source ./venv/bin/activate
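Before installing the requirements, it may help to confirm that activation worked, so the packages do not end up in the system-wide Python. The sketch below relies on the standard `sys.prefix` and `sys.base_prefix` attributes, which differ inside an environment created with `python -m venv`; the function name is illustrative.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```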
Install the requirements using one of the following commands:
For AMD/Intel/CPU (Windows only):
pip install -r requirements-common.txt -r requirements-dml.txt
For Nvidia (any OS):
pip install -r requirements-common.txt -r requirements-cuda.txt
For AMD ROCm (Linux only):
pip install -r requirements-common.txt -r requirements-rocm.txt
For CPU (Linux/macOS only):
pip install -r requirements-common.txt -r requirements-cpu.txt
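The choice of requirements files above can be summarized as a small lookup. This sketch mirrors only the combinations listed here; the backend keys (`dml`, `cuda`, `rocm`, `cpu`) are taken from the file names, the system names follow what `platform.system()` returns, and `requirement_files` and `pip_command` are hypothetical helpers rather than part of the repository.

```python
import platform
import sys

# Systems on which each backend is listed above; None means any OS.
SUPPORTED_SYSTEMS = {
    "dml": {"Windows"},
    "cuda": None,
    "rocm": {"Linux"},
    "cpu": {"Linux", "Darwin"},
}


def requirement_files(backend: str, system: str) -> list[str]:
    """Return the requirements files for a backend on the given system."""
    if backend not in SUPPORTED_SYSTEMS:
        raise ValueError(f"unknown backend: {backend}")
    allowed = SUPPORTED_SYSTEMS[backend]
    if allowed is not None and system not in allowed:
        raise ValueError(f"{backend} is not listed for {system}")
    return ["requirements-common.txt", f"requirements-{backend}.txt"]


def pip_command(backend: str, system: str) -> list[str]:
    """Build the pip invocation as an argument list for the current interpreter."""
    command = [sys.executable, "-m", "pip", "install"]
    for name in requirement_files(backend, system):
        command += ["-r", name]
    return command


if __name__ == "__main__":
    # Shows the CUDA command for the current OS without running it.
    print(" ".join(pip_command("cuda", platform.system())))
```

Using `sys.executable -m pip` ties the install to the interpreter that is currently running, which is the virtual environment's interpreter once it has been activated.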
Run the server by executing main.py.
python ./main.py
This will run the server with default settings. Note that it will not open the web browser by default; copy the address from the command line instead.
[If not installed] Install pyinstaller with the following command:
pip install --upgrade pip wheel setuptools pyinstaller
Run the following command to build an executable:
pyinstaller --clean -y --dist ./dist --workpath /tmp MMVCServerSIO.spec
This will output the resulting executable in the dist folder.