winddori2002 / TriAAN-VC

TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion
MIT License
129 stars 12 forks

Memory issue/bug #18

Closed asusdisciple closed 8 months ago

asusdisciple commented 8 months ago

May I ask if you have any experience with strange memory behaviour? I tried to run inference on a V100 with 32 GB of memory, but the model tries to allocate more than 25 GB, which does not make sense if you used a single RTX 3090. By the way, the GPU memory is completely free according to nvidia-smi before I start convert.py. Here is my error log:

[Config]
data_path: ./base_data
wav_path: ./vctk/wav48_silence_trimmed
txt_path: ./vctk/txt
spk_info_path: ./vctk/speaker-info.txt
converted_path: ./checkpoints/converted_None_uttr
vocoder_path: ./vocoder
cpc_path: ./cpc
n_uttr: None
setting: 
    sampling_rate: 16000
    top_db: 60
    n_mels: 80
    n_fft: 400
    n_shift: 160
    win_length: 400
    window: hann
    fmin: 80
    fmax: 7600
    s2s_portion: 0.1
    eval_spks: 10
    n_frames: 128
model: 
    encoder: 
        c_in: 256
        c_h: 512
        c_out: 4
        num_layer: 6
    decoder: 
        c_in: 4
        c_h: 512
        c_out: 80
        num_layer: 6
train: 
    epoch: 500
    batch_size: 64
    lr: 1e-4
    loss: l1
    eval_every: 100
    save_epoch: 100
    siam: True
    cpc: True
test: 
    threshold: 0.6895345449450861
_name: Config
config: ./config/base.yaml
device: cuda:0
sample_path: ./samples
src_name: 
    - csmd024.wav
trg_name: 
    - One.wav
checkpoint: ./checkpoints
model_name: model-cpc-split.pth
seed: 1234
ex_name: TriAAN-VC
Traceback (most recent call last):
  File "/raid/YX/TriAAN-VC/convert.py", line 153, in <module>
    main(cfg)
  File "/raid/YX/TriAAN-VC/convert.py", line 119, in main
    output = model(src_feat, src_lf0, trg_feat)
  File "/home/YX/.virtualenvs/TriAAN-VC/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/YX/.virtualenvs/TriAAN-VC/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/raid/YX/TriAAN-VC/model/model.py", line 184, in forward
    trg, trg_skips = self.spk_encoder(trg)  # target: spk
  File "/home/YX/.virtualenvs/TriAAN-VC/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/YX/.virtualenvs/TriAAN-VC/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/raid/YX/TriAAN-VC/model/model.py", line 43, in forward
    x = block(x)
  File "/home/YX/.virtualenvs/TriAAN-VC/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/YX/.virtualenvs/TriAAN-VC/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/YX/.virtualenvs/TriAAN-VC/venv/lib/python3.10/site-packages/torch/nn/modules/container.py", line 215, in forward
    input = module(input)
  File "/home/YX/.virtualenvs/TriAAN-VC/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/YX/.virtualenvs/TriAAN-VC/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/raid/YX/TriAAN-VC/model/attention.py", line 55, in forward
    attn = self.softmax(attn)              
  File "/home/YX/.virtualenvs/TriAAN-VC/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/YX/.virtualenvs/TriAAN-VC/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/YX/.virtualenvs/TriAAN-VC/venv/lib/python3.10/site-packages/torch/nn/modules/activation.py", line 1514, in forward
    return F.softmax(input, self.dim, _stacklevel=5)
  File "/home/YX/.virtualenvs/TriAAN-VC/venv/lib/python3.10/site-packages/torch/nn/functional.py", line 1856, in softmax
    ret = input.softmax(dim)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 25.84 GiB. GPU 0 has a total capacty of 31.74 GiB of which 1.81 GiB is free. Including non-PyTorch memory, this process has 29.91 GiB memory in use. Of the allocated memory 27.82 GiB is allocated by PyTorch, and 1.71 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
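For context, a 25.84 GiB allocation inside a softmax is consistent with attention scores growing quadratically in the number of mel frames. A back-of-the-envelope check, assuming a single T x T float32 score matrix dominates (a simplification of whatever model/attention.py actually builds):

```python
import math

# Assumption: one T x T float32 attention score matrix accounts for the
# allocation; real layouts (batch, heads) differ, so this is only an
# order-of-magnitude estimate.
bytes_per_float = 4
alloc_bytes = 25.84 * 1024**3          # the 25.84 GiB from the traceback

# Frame count T for which a T x T float32 matrix reaches that size
T = int(math.sqrt(alloc_bytes / bytes_per_float))

# With sampling_rate=16000 and n_shift=160 (from the config above),
# there are 100 mel frames per second of audio.
seconds = T / 100
print(f"~{T} frames, ~{seconds:.0f} s of audio")
```

That works out to tens of thousands of frames, i.e. many minutes of audio, which matches the "long sample" diagnosis in the replies.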
winddori2002 commented 8 months ago

Hi. In my case, inference with convert.py takes about 3 GB of memory (with the samples I used), and training with the original settings takes about 17 GB.

However, with different samples it is possible, e.g. when the target or source sample is quite long.

If the target speech is quite long, you can handle the conversion differently, e.g. by processing the speech in parts.

Thanks!
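One way to read "processing the speech in parts" is to split a long utterance into fixed windows and convert per window, so the attention matrices stay small. A minimal sketch (hypothetical helper, not part of the repo; the window size 128 mirrors n_frames in the config):

```python
def chunk_spans(total_frames: int, n_frames: int = 128):
    """Return (start, end) index pairs covering `total_frames` in
    fixed-size windows; the last window may be shorter."""
    return [(start, min(start + n_frames, total_frames))
            for start in range(0, total_frames, n_frames)]

# Each span can then be used to slice a mel spectrogram along its time
# axis, e.g. trg_feat[..., start:end], before calling the model.
print(chunk_spans(300))  # [(0, 128), (128, 256), (256, 300)]
```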

asusdisciple commented 8 months ago

Indeed, it was the long sample. This solved my problem, thanks a lot!