svc-develop-team / so-vits-svc

SoftVC VITS Singing Voice Conversion
GNU Affero General Public License v3.0
25.26k stars 4.74k forks

Fixed the error occurring under the only_diffusion path #307

Closed (by hxdnshx, 1 year ago)

hxdnshx commented 1 year ago

Python Version: Python 3.8.9

System: Windows 10 21H2 19044 x64

The following errors occur on the only_diffusion path because self.dtype and self.hps_ms are never assigned there:

(venv) H:\so-vits-svc>python inference_main.py -m "logs/44k/G_28000.pth" -c "configs/config.json" -n "moon_2.wav" -t 0 -eh -f0p crepe -s "zecil" -od -dc "configs/diffusion.yaml" -dm "logs/44k/diffusion/model_28000.pt" 
 [Loading] logs/44k/diffusion/model_28000.pt
Loaded diffusion model, sampler is dpm-solver++, speedup: 10 
load model(s) from pretrain/checkpoint_best_legacy_500.pt
| Load HifiGAN:  pretrain/nsf_hifigan/model
Removing weight norm...
#=====segment start, 8.98s======
Traceback (most recent call last):
  File "inference_main.py", line 155, in <module>
    main()
  File "inference_main.py", line 140, in main
    audio = svc_model.slice_inference(**kwarg)
  File "H:\so-vits-svc\inference\infer_tool.py", line 460, in slice_inference
    out_audio, out_sr, out_frame = self.infer(spk, tran, raw_path,
  File "H:\so-vits-svc\inference\infer_tool.py", line 280, in infer
    c = c.to(self.dtype)
AttributeError: 'Svc' object has no attribute 'dtype'

H:\so-vits-svc\venv\Scripts\python.exe D:/Jetbrain/apps/PyCharm-P/ch-0/231.8109.197/plugins/python/helpers/pydev/pydevd.py --multiprocess --qt-support=auto --client 127.0.0.1 --port 37269 --file H:\so-vits-svc\inference_main.py -m logs/44k/G_28000.pth -c configs/config.json -n moon_2.wav -t 0 -eh -f0p crepe -s zecil -od -dc configs/diffusion.yaml -dm logs/44k/diffusion/model_28000.pt 
Connected to pydev debugger (build 231.8109.197)
 [Loading] logs/44k/diffusion/model_28000.pt
Loaded diffusion model, sampler is dpm-solver++, speedup: 10 
load model(s) from pretrain/checkpoint_best_legacy_500.pt
| Load HifiGAN:  pretrain/nsf_hifigan/model
Removing weight norm...
#=====segment start, 8.98s======
sample time step: 100%|██████████| 100/100 [00:01<00:00, 50.18it/s]
| Load HifiGAN:  pretrain/nsf_hifigan/model
Removing weight norm...
Traceback (most recent call last):
  File "H:\so-vits-svc\inference\infer_tool.py", line 333, in infer
    self.hps_ms.data.hop_length, 
AttributeError: 'Svc' object has no attribute 'hps_ms'

Note that these attributes are set only in load_model and in other code paths that only_diffusion never reaches, so they are left uninitialized:

https://github.com/svc-develop-team/so-vits-svc/blob/847e71c3d7e22c3bf27ef741130888016789859c/inference/infer_tool.py#L188-L201

https://github.com/svc-develop-team/so-vits-svc/blob/847e71c3d7e22c3bf27ef741130888016789859c/inference/infer_tool.py#L137-L144

I therefore added the missing definitions, modeled on the following code:

https://github.com/svc-develop-team/so-vits-svc/blob/847e71c3d7e22c3bf27ef741130888016789859c/diffusion/solver.py#L103-L115
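To illustrate the failure mode and the kind of fix described above, here is a minimal sketch. It is not the project's code: BuggySvc, FixedSvc, hop_size, diffusion_hop_size, and the placeholder values are all hypothetical stand-ins, and the real attributes (self.dtype, self.hps_ms) would hold a torch dtype and a config object rather than plain values.

```python
class BuggySvc:
    """Hypothetical stand-in: attributes are set only inside load_model()."""

    def __init__(self, only_diffusion):
        self.only_diffusion = only_diffusion
        if not only_diffusion:
            self.load_model()  # skipped entirely when only_diffusion is True

    def load_model(self):
        # Placeholder values; the real code derives these from the model config.
        self.dtype = "float32"
        self.hop_size = 512

    def infer(self):
        # Reads attributes that may never have been assigned.
        return (self.dtype, self.hop_size)


class FixedSvc(BuggySvc):
    """Same class, but the diffusion-only path supplies the missing attributes."""

    def __init__(self, only_diffusion, diffusion_hop_size=512):
        super().__init__(only_diffusion)
        if only_diffusion:
            # Assumption: equivalent values can be taken from the diffusion config.
            self.dtype = "float32"
            self.hop_size = diffusion_hop_size


try:
    BuggySvc(only_diffusion=True).infer()
except AttributeError as err:
    print("buggy path:", err)

print("fixed path:", FixedSvc(only_diffusion=True).infer())
```

An alternative design would assign defaults for every attribute at the top of the constructor, before any branching, so that no path can leave them undefined.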

One issue remains: combining diffusion-only mode with the --enhance option still fails, with the following error:

H:\so-vits-svc\venv\Scripts\python.exe D:/Jetbrain/apps/PyCharm-P/ch-0/231.8109.197/plugins/python/helpers/pydev/pydevd.py --multiprocess --qt-support=auto --client 127.0.0.1 --port 38314 --file H:\so-vits-svc\inference_main.py -m logs/44k/G_28000.pth -c configs/config.json -n moon_2.wav -t 0 -f0p crepe -s zecil -od -dc configs/diffusion.yaml -dm logs/44k/diffusion/model_28000.pt -eh 
Connected to pydev debugger (build 231.8109.197)
 [Loading] logs/44k/diffusion/model_28000.pt
Loaded diffusion model, sampler is dpm-solver++, speedup: 10 
load model(s) from pretrain/checkpoint_best_legacy_500.pt
| Load HifiGAN:  pretrain/nsf_hifigan/model
Removing weight norm...
#=====segment start, 8.98s======
sample time step: 100%|██████████| 100/100 [00:02<00:00, 49.10it/s]
| Load HifiGAN:  pretrain/nsf_hifigan/model
Removing weight norm...
Traceback (most recent call last):
  File "H:\so-vits-svc\inference\infer_tool.py", line 329, in infer
    audio, _ = self.enhancer.enhance(
  File "H:\so-vits-svc\modules\enhancer.py", line 60, in enhance
    f0_res = np.interp(time_frame, time_org, f0_np, left=f0_np[0], right=f0_np[-1])
  File "<__array_function__ internals>", line 180, in interp
  File "H:\so-vits-svc\venv\lib\site-packages\numpy\lib\function_base.py", line 1570, in interp
    return interp_func(x, xp, fp, left, right)
ValueError: object too deep for desired array

Process finished with exit code 1
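For reference, np.interp requires one-dimensional xp and fp arguments, and one common cause of "object too deep for desired array" is passing an f0 array that still carries a batch or channel axis, for example shape (1, N). Whether that is what happens in enhancer.py on this path has not been confirmed here. The sketch below only demonstrates that general cause and a possible workaround; interp_f0 and the sample values are hypothetical.

```python
import numpy as np


def interp_f0(time_frame, time_org, f0):
    """Resample f0 onto time_frame after flattening it to one dimension.

    Hypothetical helper: assumes f0 holds exactly one value per entry of
    time_org, possibly wrapped in extra singleton axes such as (1, N).
    """
    f0_1d = np.asarray(f0, dtype=np.float64).reshape(-1)
    return np.interp(time_frame, time_org, f0_1d,
                     left=f0_1d[0], right=f0_1d[-1])


time_org = np.arange(4) * 0.01      # original frame times
time_frame = np.arange(8) * 0.005   # target frame times
f0_2d = np.array([[100.0, 110.0, 120.0, 130.0]])  # shape (1, 4), not (4,)

try:
    np.interp(time_frame, time_org, f0_2d)  # 2-D fp is rejected
except ValueError as err:
    print("raw 2-D input:", err)

f0_res = interp_f0(time_frame, time_org, f0_2d)
print(f0_res.shape)
```

If the extra axis is the cause, flattening f0 before the call to enhance (or inside it) would be one way to avoid the error; checking f0.shape at that point in a debugger would confirm or rule this out.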