PlayVoice / whisper-vits-svc

Core Engine of Singing Voice Conversion & Singing Voice Clone
https://huggingface.co/spaces/maxmax20160403/sovits5.0
MIT License
2.55k stars, 914 forks

Post-Inferred WAV files produced with blank spots #162

Closed ThatJeffGuy closed 6 months ago

ThatJeffGuy commented 6 months ago

Hello friends!

I'm running the latest version of the repo and all associated packages. Everything is working fine, and my model at 1500 epochs does output decent samples. However, every audio file has blank spots in it.

The blank spots are always in the same place for a given song. For example, if I make my model sing Rap God by Eminem, the blank spot sadly falls over the really fast section of the song. Up until then, it works wonderfully. See the attached input and output WAV files in Audacity.

No matter how many times I re-export (re-run the inference script), the blank spot in the WAV file is always in the pictured position. If I change songs, every output file gets a blank spot as well. Its position and length vary from song to song, but re-exporting the same song always reproduces the same position and length, weirdly.
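To pin down where the blank spots sit without eyeballing Audacity, something like the following rough sketch could list the silent stretches in an output file. It assumes the output is 16-bit mono PCM (not confirmed in this thread), and the names `read_mono_pcm16` and `find_silent_spans`, as well as the threshold and minimum duration, are made up for illustration and are not part of the repo.

```python
import array
import sys
import wave


def read_mono_pcm16(path):
    """Read a 16-bit mono PCM WAV file and return (samples, sample_rate)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM (assumption of this sketch)")
        rate = wav.getframerate()
        samples = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return samples, rate


def find_silent_spans(samples, sample_rate, threshold=50, min_seconds=0.5):
    """Return (start_s, end_s) pairs for runs where every |sample| <= threshold
    and the run lasts at least min_seconds. Both defaults are arbitrary guesses."""
    min_len = int(min_seconds * sample_rate)
    spans = []
    run_start = None
    for i, value in enumerate(samples):
        if abs(value) <= threshold:
            if run_start is None:
                run_start = i  # a quiet run begins here
        else:
            if run_start is not None and i - run_start >= min_len:
                spans.append((run_start / sample_rate, i / sample_rate))
            run_start = None
    # Handle a quiet run that extends to the end of the file.
    if run_start is not None and len(samples) - run_start >= min_len:
        spans.append((run_start / sample_rate, len(samples) / sample_rate))
    return spans
```

Running `find_silent_spans(*read_mono_pcm16(path))` on two re-exports of the same song and comparing the returned spans would be one way to confirm that the gap really lands in the same place each time.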

The pitch CSV files appear fine on review, without any blank spots.

I downsampled the input to mono 44100 Hz, then preprocessed it using the commands provided in the README, with some small changes to paths. It runs without error, just some user warnings about deprecated commands :P

Audacity_QXMe0YoLXc

I have uploaded my model and files to HF: https://huggingface.co/ScottishHaze/PayMoneyWubby

The command I'm running is:

python svc_inference.py --config configs/base.yaml --model ./chkpt/paymoneywubby/paymoneywubby_0160.pt --spk ./data_svc/singer/wubby.spk.npy --wave ./wubbyfiles/iwish.wav --shift 0 --enable-retrieval --retrieval-ratio 0.5 --n-retrieval-vectors 3 --hubert-index-path ./data_svc/indexes/wubby/hubert.index --whisper-index-path ./data_svc/indexes/wubby/whisper.index

The paths obviously change for the different files.

I did switch to a different checkpoint, and it produced the same blank spots in the output files.

ThatJeffGuy commented 6 months ago

Another screenshot - I re-exported these moments before typing this comment, 5 times per file, just to be sure. (Screenshot: Audacity_YpTlxSb61F)

ThatJeffGuy commented 6 months ago

Changing --n-retrieval-vectors from 3 to 2 fixed this issue.
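For anyone scripting their exports, here is a small sketch that rebuilds the command from the first post with the retrieval-vector count as a parameter, so the working value is easy to swap in. The function name `build_inference_command` is hypothetical; the flags and paths are copied from the command quoted above and would need adjusting for other setups.

```python
import shlex


def build_inference_command(wave_path, n_retrieval_vectors=2):
    """Assemble the svc_inference.py call from this thread as an argument list.
    Only --wave and --n-retrieval-vectors are parameterised in this sketch."""
    return [
        "python", "svc_inference.py",
        "--config", "configs/base.yaml",
        "--model", "./chkpt/paymoneywubby/paymoneywubby_0160.pt",
        "--spk", "./data_svc/singer/wubby.spk.npy",
        "--wave", wave_path,
        "--shift", "0",
        "--enable-retrieval",
        "--retrieval-ratio", "0.5",
        "--n-retrieval-vectors", str(n_retrieval_vectors),  # 2 worked here, 3 left gaps
        "--hubert-index-path", "./data_svc/indexes/wubby/hubert.index",
        "--whisper-index-path", "./data_svc/indexes/wubby/whisper.index",
    ]


# Print the command as a single shell-safe line for copy and paste.
print(shlex.join(build_inference_command("./wubbyfiles/iwish.wav")))
```

The list form can also be passed straight to `subprocess.run(...)` from the repo root if you want to batch several songs.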