
High-performance RVC inference, intended for running multiple instances in memory at once. Includes the latest pitch estimator, RMVPE, is Python 3.8-3.11 compatible and pip installable, and adds memory and performance improvements to the pipeline and model usage.
MIT License

RVC Inference

English | 中文简体 | 日本語 | 한국어 | Français | Türkçe

Translations provided by GPT-4.

This project is a lightweight, fast, and memory-efficient API that runs v1/v2 RVC models. It is intended for use in production environments and for compatibility with existing codebases, making it easy to integrate RVC as a stage in a pipeline or workflow. Installation is quick via pip and should be compatible with Linux/Windows/Mac and the latest Python versions.

Install

If you are using Python 3.11+, install the fairseq fork first, since the fairseq release on PyPI is not yet compatible with 3.11 (the build takes a minute):

pip install https://github.com/One-sixth/fairseq/archive/main.zip
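
If you are unsure which interpreter will run the install, a quick version check (just a sanity check, not part of the library) tells you whether the fork is needed:

import sys

#the fork is only needed on Python 3.11 and newer; older versions can use fairseq from PyPI.
print('Needs the fairseq fork:', sys.version_info >= (3, 11))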

Pip install the package as shown below; all dependencies will be installed automatically.

pip uninstall inferrvc
pip install https://github.com/CircuitCM/RVC-inference/raw/main/dist/inferrvc-1.0-py3-none-any.whl --no-cache-dir

By default, PyPI installs the CPU build of PyTorch. For GPU support on NVIDIA or AMD, visit https://pytorch.org/get-started/locally/ and pip install the GPU builds of torch and torchaudio before installing this library.
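
After installing torch, you can confirm the GPU build is active before installing this library; a minimal sanity check (optional, uses only torch itself):

import torch

#prints the installed build and whether a GPU is usable; if this shows False on a GPU machine,
#reinstall torch/torchaudio from pytorch.org before installing this library.
print(torch.__version__, torch.cuda.is_available())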

Support should be available for Python 3.8-3.12, but only 3.11 has been tested. If there are any problems with installation or compatibility, please open an issue and I'll push out fixes. PRs with fixes and improvements are welcome.

Usage

First set the optional environment variables:

import os
os.environ['RVC_MODELDIR']='path/to/rvc_model_dir' #where the model .pth files are stored.
os.environ['RVC_INDEXDIR']='path/to/rvc_index_dir' #where the model .index files are stored.
#the audio output frequency, default is 44100.
os.environ['RVC_OUTPUTFREQ']='44100'
#whether the returned audio tensor should block until fully loaded; this can be ignored,
#but if you run inference inside a larger torch pipeline, setting it to False improves performance a little.
os.environ['RVC_RETURNBLOCKING']='True'

Notes on environment variables:

Load models:

from inferrvc import RVC

whis = RVC('Whis.pth', index='added_IVF1972_Flat_nprobe_1_Whis_v2')
obama = RVC(model='obama')

print(whis.name)
print('Paths', whis.model_path, whis.index_path)
print(obama.name)
print('Paths', obama.model_path, obama.index_path)

Output:

Model: Whis, Index: added_IVF1972_Flat_nprobe_1_Whis_v2
Paths Z:\Models\RVC\Models\Whis.pth Z:\Models\RVC\Indexes\added_IVF1972_Flat_nprobe_1_Whis_v2.index
Model: obama, Index: obama
Paths Z:\Models\RVC\Models\obama.pth Z:\Models\RVC\Indexes\obama.index

Run inference:

from inferrvc import load_torchaudio
import soundfile as sf

aud, sr = load_torchaudio('path/to/audio.wav')

#f0_up_key shifts the pitch up or down; output_volume=RVC.MATCH_ORIGINAL matches the original volume.
paudio1 = whis(aud, f0_up_key=6, output_device='cpu', output_volume=RVC.MATCH_ORIGINAL, index_rate=.75)
paudio2 = obama(aud, 5, output_device='cpu', output_volume=RVC.MATCH_ORIGINAL, index_rate=.9)

#write at 44100 Hz to match the RVC_OUTPUTFREQ default.
sf.write('path/to/audio_whis.wav', paudio1, 44100)
sf.write('path/to/audio_obama.wav', paudio2, 44100)
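
Because several models can stay loaded in memory at once, the same pattern extends to converting a batch of files with multiple voices in one pass. A minimal sketch using only the calls shown above (file paths and parameter values are placeholders):

import soundfile as sf
from inferrvc import RVC, load_torchaudio

files = ['path/to/clip1.wav', 'path/to/clip2.wav']  #placeholder inputs
voices = {'whis': whis, 'obama': obama}             #the instances loaded earlier

for path in files:
    aud, sr = load_torchaudio(path)
    for name, voice in voices.items():
        out = voice(aud, f0_up_key=0, output_device='cpu', output_volume=RVC.MATCH_ORIGINAL, index_rate=.75)
        sf.write(f'{path[:-4]}_{name}.wav', out, 44100)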

Whis example.
Obama example.

Changes from the original repo:

Todos: