```
Traceback (most recent call last):
  File "/home/cross/Documents/so-vits-svc/inference_main.py", line 155, in <module>
    main()
  File "/home/cross/Documents/so-vits-svc/inference_main.py", line 140, in main
    audio = svc_model.slice_inference(**kwarg)
  File "/home/cross/Documents/so-vits-svc/inference/infer_tool.py", line 470, in slice_inference
    out_audio, out_sr, out_frame = self.infer(spk, tran, raw_path,
  File "/home/cross/Documents/so-vits-svc/inference/infer_tool.py", line 297, in infer
    audio,f0 = self.net_g_ms.infer(c, f0=f0, g=sid, uv=uv, predict_f0=auto_predict_f0, noice_scale=noice_scale,vol=vol)
  File "/home/cross/Documents/so-vits-svc/sov/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/cross/Documents/so-vits-svc/models.py", line 483, in infer
    z_p, m_p, logs_p, c_mask = self.enc_p(x, x_mask, f0=f0_to_coarse(f0), noice_scale=noice_scale)
  File "/home/cross/Documents/so-vits-svc/sov/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/cross/Documents/so-vits-svc/models.py", line 117, in forward
    x = self.enc_(x * x_mask, x_mask)
  File "/home/cross/Documents/so-vits-svc/sov/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/cross/Documents/so-vits-svc/modules/attentions.py", line 83, in forward
    y = self.attn_layers[i](x, x, attn_mask)
  File "/home/cross/Documents/so-vits-svc/sov/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/cross/Documents/so-vits-svc/modules/attentions.py", line 187, in forward
    x, self.attn = self.attention(q, k, v, mask=attn_mask)
  File "/home/cross/Documents/so-vits-svc/modules/attentions.py", line 219, in attention
    relative_weights = self._absolute_position_to_relative_position(p_attn)
  File "/home/cross/Documents/so-vits-svc/modules/attentions.py", line 285, in _absolute_position_to_relative_position
    x_flat = F.pad(x_flat, commons.convert_pad_shape([[0, 0], [0, 0], [length, 0]]))
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.60 GiB (GPU 0; 23.65 GiB total capacity; 18.68 GiB already allocated; 1.17 GiB free; 20.56 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
OS version
Arch Linux
GPU
RTX 4090
Python version
Python 3.10.12
PyTorch version
Version: 2.0.1+cu118
Branch of sovits
4.0(Default)
Dataset source (Used to judge the dataset quality)
UVR Processed
Where the problem occurs or what command you executed
```
python inference_main.py -m "logs/44k/G_400.pth" -c "configs/config.json" -n "hellovocals.wav" -t -6 -s "bailey" --f0_predictor fcpe --shallow_diffusion --k_step 1000
```
Problem description
CUDA runs out of memory during inference: the relative-position attention step (`modules/attentions.py:285`) tries to allocate 2.60 GiB while only 1.17 GiB of the RTX 4090's 23.65 GiB is free, with 18.68 GiB already allocated and 20.56 GiB reserved by PyTorch.
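Since the error message reports reserved memory (20.56 GiB) well above allocated memory (18.68 GiB), it itself suggests trying `max_split_size_mb` to reduce allocator fragmentation. A minimal sketch of that experiment, set before the first CUDA allocation (the 512 MiB value is an assumption to tune, not a recommendation from the project):

```python
import os

# Cap the size of cached blocks the CUDA caching allocator is allowed to split.
# This can reduce fragmentation when reserved memory >> allocated memory.
# Must be set before PyTorch initializes CUDA (i.e., before the first CUDA op).
# 512 MiB is an assumed starting point; smaller values trade speed for headroom.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"
```

Equivalently, from the shell: `PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512 python inference_main.py …`. If that is not enough, feeding shorter audio slices to the model should also help, since the attention padding where the crash occurs grows with sequence length.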
Log
Screenshot so-vits-svc and logs/44k folders and paste here
Supplementary description
No response