RVC-Project / Retrieval-based-Voice-Conversion-WebUI

Easily train a good VC model with voice data <= 10 mins!
MIT License

Inference fails, stuck at "Loading rmvpe model,assets/rmvpe/rmvpe.pt" #2369

Open lonrencn opened 2 weeks ago

lonrencn commented 2 weeks ago

Ubuntu 24.04, 4 GPUs (2080 Ti, 22 GB), 2680 V4, 128 GB, version 2406. All required configuration files are left at their defaults.

Several issues: 1. The one-click training launcher only completes the first step; the remaining steps have to be started manually.

2. In step 3 (training), using 3-4 GPUs raises an error; it only runs with one or two GPUs. The message shown is: /mnt/disk2t/RVC/infer/modules/train/train.py:429: FutureWarning: torch.cuda.amp.autocast(args...) is deprecated. Please use torch.amp.autocast('cuda', args...) instead.

3. During inference:

    2024-11-03 20:15:01 | INFO | infer.modules.vc.modules | Get sid: boy1.pth
    2024-11-03 20:15:01 | INFO | infer.modules.vc.modules | Loading: assets/weights/boy1.pth
    2024-11-03 20:15:02 | INFO | infer.modules.vc.modules | Select index: logs/liu/added_IVF370_Flat_nprobe_1_boy1_v2.index
    2024-11-03 20:15:14 | INFO | fairseq.tasks.hubert_pretraining | current directory is /mnt//RVC
    2024-11-03 20:15:14 | INFO | fairseq.tasks.hubert_pretraining | HubertPretrainingTask Config {'_name': 'hubert_pretraining', 'data': 'metadata', 'fine_tuning': False, 'labels': ['km'], 'label_dir': 'label', 'label_rate': 50.0, 'sample_rate': 16000, 'normalize': False, 'enable_padding': False, 'max_keep_size': None, 'max_sample_size': 250000, 'min_sample_size': 32000, 'single_target': False, 'random_crop': True, 'pad_audio': False}
    2024-11-03 20:15:14 | INFO | fairseq.models.hubert.hubert | HubertModel Config: {'_name': 'hubert', 'label_rate': 50.0, 'extractor_mode': default, 'encoder_layers': 12, 'encoder_embed_dim': 768, 'encoder_ffn_embed_dim': 3072, 'encoder_attention_heads': 12, 'activation_fn': gelu, 'layer_type': transformer, 'dropout': 0.1, 'attention_dropout': 0.1, 'activation_dropout': 0.0, 'encoder_layerdrop': 0.05, 'dropout_input': 0.1, 'dropout_features': 0.1, 'final_dim': 256, 'untie_final_proj': True, 'layer_norm_first': False, 'conv_feature_layers': '[(512,10,5)] + [(512,3,2)] * 4 + [(512,2,2)] * 2', 'conv_bias': False, 'logit_temp': 0.1, 'target_glu': False, 'feature_grad_mult': 0.1, 'mask_length': 10, 'mask_prob': 0.8, 'mask_selection': static, 'mask_other': 0.0, 'no_mask_overlap': False, 'mask_min_space': 1, 'mask_channel_length': 10, 'mask_channel_prob': 0.0, 'mask_channel_selection': static, 'mask_channel_other': 0.0, 'no_mask_channel_overlap': False, 'mask_channel_min_space': 1, 'conv_pos': 128, 'conv_pos_groups': 16, 'latent_temp': [2.0, 0.5, 0.999995], 'skip_masked': False, 'skip_nomask': False, 'checkpoint_activations': False, 'required_seq_len_multiple': 2, 'depthwise_conv_kernel_size': 31, 'attn_type': '', 'pos_enc_type': 'abs', 'fp16': False}
    2024-11-03 20:15:20 | INFO | infer.modules.vc.pipeline | Loading rmvpe model,assets/rmvpe/rmvpe.pt

It hangs here and does not move on.

On inspection, it is stuck in codedataset.py: the loop below runs 8 times and then stops.

        for t in opt_ts:
            print('Reached here!!')
            t = t // self.window * self.window
            if if_f0 == 1:
                audio_opt.append(
                    self.vc(
                        model,
                        net_g,
                        sid,
                        audio_pad[s : t + self.t_pad2 + self.window],
                        pitch[:, s // self.window : (t + self.t_pad2) // self.window],
                        pitchf[:, s // self.window : (t + self.t_pad2) // self.window],
                        times,
                        index,
                        big_npy,
                        index_rate,
                        version,
                        protect,
                    )[self.t_pad_tgt : -self.t_pad_tgt]
                )
            else:
                audio_opt.append(
                    self.vc(
                        model,
                        net_g,
                        sid,
                        audio_pad[s : t + self.t_pad2 + self.window],
                        None,
                        None,
                        times,
                        index,
                        big_npy,
                        index_rate,
                        version,
                        protect,
                    )[self.t_pad_tgt : -self.t_pad_tgt]
                )
            s = t
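
A bare print shows that the loop was entered, but not whether an iteration is slow or truly hung. A minimal sketch of timing each iteration instead, using only the standard library. The names `timed_iterations` and `process_chunk` and the sample split points are hypothetical; they stand in for the `self.vc(...)` call and `opt_ts` above and are not RVC code:

```python
import time
from typing import Callable, Iterable, List, Tuple


def timed_iterations(
    split_points: Iterable[int],
    window: int,
    process_chunk: Callable[[int, int], None],
) -> List[Tuple[int, int, float]]:
    """Run process_chunk(start, end) per segment and record how long each took."""
    timings = []
    start = 0
    for t in split_points:
        end = t // window * window  # same window alignment as the loop above
        began = time.perf_counter()
        process_chunk(start, end)
        elapsed = time.perf_counter() - began
        # flush=True so the line appears even if a later iteration blocks
        print(f"segment [{start}:{end}] took {elapsed:.3f}s", flush=True)
        timings.append((start, end, elapsed))
        start = end
    return timings


# Hypothetical stand-in for self.vc(...): it only sleeps briefly.
segments = timed_iterations(
    [330, 650, 1000], window=160, process_chunk=lambda s, e: time.sleep(0.01)
)
```

If the last line printed belongs to some segment and nothing follows, the hang is inside the next call. If every segment prints, the loop itself finished and the problem lies after it.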
ninthseason commented 1 week ago

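I ran into issues 1 and 3 as well, with exactly the same symptoms. I am also on Ubuntu 24.04.

I am using Python 3.8 in a conda environment, with the following package versions: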

Package                   Version
------------------------- ------------
absl-py                   2.1.0
aiofiles                  24.1.0
aiohappyeyeballs          2.4.3
aiohttp                   3.10.10
aiosignal                 1.3.1
altair                    5.4.1
antlr4-python3-runtime    4.8
anyio                     4.5.2
aria2                     0.0.1b0
async-timeout             4.0.3
attrs                     24.2.0
audioread                 3.0.1
av                        12.3.0
bitarray                  3.0.0
cachetools                5.5.0
certifi                   2022.12.7
cffi                      1.17.1
charset-normalizer        2.1.1
click                     8.1.7
colorama                  0.4.6
coloredlogs               15.0.1
contourpy                 1.1.1
cycler                    0.12.1
Cython                    3.0.11
decorator                 5.1.1
einops                    0.8.0
exceptiongroup            1.2.2
fairseq                   0.12.2
faiss-cpu                 1.7.3
fastapi                   0.88.0
ffmpeg-python             0.2.0
ffmpy                     0.3.1
filelock                  3.13.1
flatbuffers               24.3.25
fonttools                 4.54.1
frozenlist                1.5.0
fsspec                    2024.2.0
future                    1.0.0
google-auth               2.36.0
google-auth-oauthlib      1.0.0
gradio                    3.34.0
gradio_client             1.3.0
grpcio                    1.67.1
h11                       0.14.0
httpcore                  1.0.6
httpx                     0.27.2
huggingface-hub           0.26.2
humanfriendly             10.0
hydra-core                1.0.7
idna                      3.4
importlib_metadata        8.5.0
importlib_resources       6.4.5
Jinja2                    3.1.3
joblib                    1.4.2
json5                     0.9.25
jsonschema                4.23.0
jsonschema-specifications 2023.12.1
kiwisolver                1.4.7
librosa                   0.9.1
linkify-it-py             2.0.3
llvmlite                  0.39.0
local-attention           1.9.15
lxml                      5.3.0
Markdown                  3.7
markdown-it-py            2.2.0
MarkupSafe                2.1.5
matplotlib                3.7.5
matplotlib-inline         0.1.7
mdit-py-plugins           0.3.3
mdurl                     0.1.2
mpmath                    1.3.0
multidict                 6.1.0
narwhals                  1.13.3
networkx                  3.0
numba                     0.56.4
numpy                     1.23.5
oauthlib                  3.2.2
omegaconf                 2.0.6
onnxruntime-gpu           1.19.2
orjson                    3.10.11
packaging                 24.2
pandas                    2.0.3
pillow                    10.2.0
pip                       23.0
pkgutil_resolve_name      1.3.10
platformdirs              4.3.6
pooch                     1.8.2
portalocker               2.10.1
praat-parselmouth         0.4.5
propcache                 0.2.0
protobuf                  5.28.3
pyasn1                    0.6.1
pyasn1_modules            0.4.1
pycparser                 2.22
pydantic                  1.10.19
pydub                     0.25.1
Pygments                  2.18.0
pyparsing                 3.1.4
python-dateutil           2.9.0.post0
python-dotenv             1.0.1
python-multipart          0.0.17
pytz                      2024.2
pyworld                   0.3.2
PyYAML                    6.0.2
referencing               0.35.1
regex                     2024.11.6
requests                  2.28.1
requests-oauthlib         2.0.0
resampy                   0.4.3
rpds-py                   0.20.1
rsa                       4.9
sacrebleu                 2.4.3
scikit-learn              1.3.2
scipy                     1.10.1
semantic-version          2.10.0
setuptools                75.3.0
six                       1.16.0
sniffio                   1.3.1
soundfile                 0.12.1
starlette                 0.22.0
sympy                     1.13.1
tabulate                  0.9.0
tensorboard               2.14.0
tensorboard-data-server   0.7.2
tensorboardX              2.6.2.2
threadpoolctl             3.5.0
torch                     2.1.2+cu118
torchaudio                2.1.2+cu118
torchcrepe                0.0.20
torchfcpe                 0.0.4
torchvision               0.16.2+cu118
tornado                   6.4.1
tqdm                      4.67.0
traitlets                 5.14.3
triton                    2.1.0
typing_extensions         4.12.2
tzdata                    2024.2
uc-micro-py               1.0.3
urllib3                   1.26.13
uvicorn                   0.32.0
websockets                12.0
Werkzeug                  3.0.6
wheel                     0.45.0
yarl                      1.15.2
zipp                      3.20.2
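
Since the replies below trace the problem to gradio and its HTTP client, those packages are the relevant part of this list. A minimal sketch for printing only those versions, assuming Python 3.8 or newer (where `importlib.metadata` is in the standard library). The helper name `installed_versions` is hypothetical:

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(
    ["gradio", "gradio_client", "httpx", "torch"]
).items():
    print(f"{name:15} {version or 'not installed'}")
```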
ninthseason commented 1 week ago

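While debugging I found some leads on issue 3. In my case at least, the "hang" the original poster describes only looks like a hang; nothing is actually stuck. The model runs to completion and the audio output is generated normally, but for an unknown reason the gradio frontend shows ERROR before inference finishes.

In other words, inference actually finishes normally; the result just never shows up on the browser page.

Specifically, the code runs successfully up to https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/blob/3548b4f1a55336629955c0d51deeb24b6de9c46e/infer/modules/vc/modules.py#L217-L221, where tgt_sr is the sample rate and audio_opt is the audio data. This return should be what hands the two results back to gradio.

If you set a breakpoint there and, once execution reaches it, run the following in the debug console: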

from scipy.io import wavfile
wavfile.write('output_audio.wav', tgt_sr, audio_opt)

you can manually save the model output to output_audio.wav in the root directory and confirm that the output is generated normally.
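
If scipy cannot be imported in the debug console, the standard-library `wave` module can write an equivalent file. A minimal sketch, assuming `audio_opt` holds mono 16-bit PCM samples (the thread does not state its dtype, so check before relying on this). `write_pcm16_wav` is a hypothetical helper and the sample values are made up:

```python
import struct
import wave
from typing import Sequence


def write_pcm16_wav(path: str, sample_rate: int, samples: Sequence[int]) -> None:
    """Write mono 16-bit PCM samples (ints in [-32768, 32767]) to a WAV file."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        # "<h" is a little-endian signed 16-bit integer, the WAV PCM layout
        wav_file.writeframes(struct.pack(f"<{len(samples)}h", *samples))


# Made-up samples; with a NumPy int16 array, pass audio_opt.tolist() instead.
write_pcm16_wav("output_audio_check.wav", 40000, [0, 1000, -1000, 32767, -32768])
```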

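So the root of this problem appears to be that the gradio frontend shows ERROR early for some reason, so the result cannot be displayed once inference finishes.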

ninthseason commented 1 week ago

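Looking more closely, the ERROR shows up exactly 5 seconds into every run, so this is probably gradio's 5-second timeout being triggered. The corresponding gradio issue: https://github.com/gradio-app/gradio/issues/5143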
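
This fits the earlier observation that the work completes even though the page shows ERROR: a caller-side timeout stops the caller from waiting, not the work itself. A minimal standard-library sketch of that pattern, with durations shortened to fractions of a second. `slow_inference` is a hypothetical stand-in, and nothing here calls gradio or httpx:

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Tuple


def slow_inference(duration: float) -> str:
    """Hypothetical stand-in for an inference call taking `duration` seconds."""
    time.sleep(duration)
    return "audio ready"


def call_with_timeout(duration: float, timeout: float) -> Tuple[str, str]:
    """Return (what the caller saw, what the worker eventually produced)."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_inference, duration)
        try:
            seen = future.result(timeout=timeout)
        except FutureTimeout:
            seen = "ERROR"  # the caller gives up, as the page does
        # The worker was not cancelled; waiting again yields its real result.
        produced = future.result()
    return seen, produced


outcome = call_with_timeout(duration=0.3, timeout=0.05)
print(outcome)
```

Raising the timeout above the duration makes both values agree, which is the effect the fixes in the following replies aim for.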

wyxgoishin commented 3 days ago

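Replying to ninthseason: "Looking more closely, the ERROR shows up exactly 5 seconds into every run, so this is probably gradio's 5-second timeout being triggered. The corresponding gradio issue: gradio-app/gradio#5143"

You can set the timeout manually: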

    if config.iscolab:
        app.queue(concurrency_count=511, max_size=1022).launch(share=True)
    else:
        app.queue(concurrency_count=511, max_size=1022).launch(
            server_name="0.0.0.0",
            inbrowser=not config.noautoopen,
            server_port=config.listen_port,
            quiet=True,
            app_kwargs={"timeout": 30}
        )
webbglass commented 2 days ago

Changing the httpx timeout works: https://github.com/gradio-app/gradio/issues/5143#issuecomment-2208338833

Or you can manually edit the file:

  1. Retrieve the path of _config.py via python -c "import httpx; import os; print(os.path.join(os.path.dirname(httpx.__file__), '_config.py'))"
  2. In _config.py, change the line DEFAULT_TIMEOUT_CONFIG = Timeout(timeout=5.0) to DEFAULT_TIMEOUT_CONFIG = Timeout(timeout=60.0)
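
The manual edit can also be scripted. A minimal sketch that rewrites the `DEFAULT_TIMEOUT_CONFIG` line in a string of source text, assuming the line has the form `DEFAULT_TIMEOUT_CONFIG = Timeout(timeout=<number>)`. `patch_default_timeout` is a hypothetical helper and the sample text is made up, not the real file. Any edit inside site-packages is lost when httpx is reinstalled or upgraded:

```python
import re

# Matches the assignment at the start of a line and captures the text around the number.
_TIMEOUT_LINE = re.compile(
    r"^(DEFAULT_TIMEOUT_CONFIG\s*=\s*Timeout\(timeout=)[0-9.]+(\))",
    flags=re.MULTILINE,
)


def patch_default_timeout(source: str, seconds: float) -> str:
    """Return `source` with the default timeout replaced; raise if not found once."""
    patched, count = _TIMEOUT_LINE.subn(
        lambda m: f"{m.group(1)}{seconds}{m.group(2)}", source
    )
    if count != 1:
        raise ValueError(
            f"expected exactly one DEFAULT_TIMEOUT_CONFIG line, found {count}"
        )
    return patched


# Made-up sample text standing in for the contents of _config.py.
sample = "# sample text, not the real file\nDEFAULT_TIMEOUT_CONFIG = Timeout(timeout=5.0)\n"
print(patch_default_timeout(sample, 60.0))
```

To apply it, read the file whose path step 1 prints, pass its text through the function, and write the result back after saving a backup copy.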
mozz85 commented 1 day ago

#2355 mentions that the gradio version pinned in this project's requirements.txt is fairly buggy; you could try the gradio version recommended there.