blaisewf / rvc-cli

🚀 RVC + UVR = A perfect set of tools for voice cloning, easily and free!
https://rvc-cli.pages.dev/

Crash when inference #74

Closed juangea closed 2 weeks ago

juangea commented 2 weeks ago

So I tried this command:

.\rvc_cli.py infer --input_path X:/BS_soft/Unify_Voice/output/test_1.wav --output_path X:/BS_soft/Unify_Voice/output/test_1_cli.wav --pth_path X:/BS_soft/rvc_pipeline/rvc-cli/voices/weights/Juan_02.pth --index_path X:/BS_soft/rvc_pipeline/rvc-cli/voices/index/Juan_02_IVF340_Flat_nprobe_1_Juan_02_v2.index

And it crashes right away. I don't know why. Here is the log:

X:\BS_soft\rvc_pipeline\rvc-cli\env\lib\site-packages\transformers\tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
2024-08-26 11:33:37 | INFO | faiss.loader | Loading faiss with AVX2 support.
2024-08-26 11:33:37 | INFO | faiss.loader | Could not load library with AVX2 support due to:
ModuleNotFoundError("No module named 'faiss.swigfaiss_avx2'")
2024-08-26 11:33:37 | INFO | faiss.loader | Loading faiss.
2024-08-26 11:33:37 | INFO | faiss.loader | Successfully loaded faiss.
Converting audio 'X:/BS_soft/Unify_Voice/output/test_1.wav'...
An error occurred during audio conversion: Could not infer task type from {'_name': 'contentvec_pretraining', 'data': 'metadata', 'fine_tuning': False, 'labels': ['km'], 'label_dir': 'label', 'label_rate': 50, 'sample_rate': 16000, 'normalize': False, 'enable_padding': False, 'max_keep_size': None, 'max_sample_size': 250000, 'min_sample_size': 32000, 'single_target': False, 'random_crop': True, 'crop': True, 'pad_audio': False, 'spk2info': 'spk2info.dict'}. Available argparse tasks: dict_keys(['audio_pretraining', 'audio_finetuning', 'cross_lingual_lm', 'denoising', 'speech_to_text', 'text_to_speech', 'frm_text_to_speech', 'hubert_pretraining', 'language_modeling', 'legacy_masked_lm', 'masked_lm', 'multilingual_denoising', 'multilingual_language_modeling', 'multilingual_masked_lm', 'speech_unit_modeling', 'translation', 'multilingual_translation', 'online_backtranslation', 'semisupervised_translation', 'sentence_prediction', 'sentence_prediction_adapters', 'sentence_ranking', 'simul_speech_to_text', 'simul_text_to_text', 'speech_to_speech', 'translation_from_pretrained_bart', 'translation_from_pretrained_xlm', 'translation_lev', 'translation_multi_simple_epoch', 'dummy_lm', 'dummy_masked_lm', 'dummy_mt']). Available hydra tasks: dict_keys(['audio_pretraining', 'audio_finetuning', 'hubert_pretraining', 'language_modeling', 'masked_lm', 'multilingual_language_modeling', 'speech_unit_modeling', 'translation', 'sentence_prediction', 'sentence_prediction_adapters', 'simul_text_to_text', 'translation_from_pretrained_xlm', 'translation_lev', 'dummy_lm', 'dummy_masked_lm'])
Traceback (most recent call last):
  File "X:\BS_soft\rvc_pipeline\rvc-cli\rvc\infer\infer.py", line 189, in convert_audio
    self.load_hubert(embedder_model, embedder_model_custom)
  File "X:\BS_soft\rvc_pipeline\rvc-cli\rvc\infer\infer.py", line 58, in load_hubert
    models, _, _ = load_embedding(embedder_model, embedder_model_custom)
  File "X:\BS_soft\rvc_pipeline\rvc-cli\rvc\lib\utils.py", line 69, in load_embedding
    models = checkpoint_utils.load_model_ensemble_and_task(
  File "X:\BS_soft\rvc_pipeline\rvc-cli\env\lib\site-packages\fairseq\checkpoint_utils.py", line 436, in load_model_ensemble_and_task
    task = tasks.setup_task(cfg.task)
  File "X:\BS_soft\rvc_pipeline\rvc-cli\env\lib\site-packages\fairseq\tasks\__init__.py", line 42, in setup_task
    assert (
AssertionError: Could not infer task type from {'_name': 'contentvec_pretraining', 'data': 'metadata', 'fine_tuning': False, 'labels': ['km'], 'label_dir': 'label', 'label_rate': 50, 'sample_rate': 16000, 'normalize': False, 'enable_padding': False, 'max_keep_size': None, 'max_sample_size': 250000, 'min_sample_size': 32000, 'single_target': False, 'random_crop': True, 'crop': True, 'pad_audio': False, 'spk2info': 'spk2info.dict'}. Available argparse tasks: dict_keys(['audio_pretraining', 'audio_finetuning', 'cross_lingual_lm', 'denoising', 'speech_to_text', 'text_to_speech', 'frm_text_to_speech', 'hubert_pretraining', 'language_modeling', 'legacy_masked_lm', 'masked_lm', 'multilingual_denoising', 'multilingual_language_modeling', 'multilingual_masked_lm', 'speech_unit_modeling', 'translation', 'multilingual_translation', 'online_backtranslation', 'semisupervised_translation', 'sentence_prediction', 'sentence_prediction_adapters', 'sentence_ranking', 'simul_speech_to_text', 'simul_text_to_text', 'speech_to_speech', 'translation_from_pretrained_bart', 'translation_from_pretrained_xlm', 'translation_lev', 'translation_multi_simple_epoch', 'dummy_lm', 'dummy_masked_lm', 'dummy_mt']). Available hydra tasks: dict_keys(['audio_pretraining', 'audio_finetuning', 'hubert_pretraining', 'language_modeling', 'masked_lm', 'multilingual_language_modeling', 'speech_unit_modeling', 'translation', 'sentence_prediction', 'sentence_prediction_adapters', 'simul_text_to_text', 'translation_from_pretrained_xlm', 'translation_lev', 'dummy_lm', 'dummy_masked_lm'])
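
The assertion in the traceback above reports that the checkpoint's task name, `contentvec_pretraining`, is not in either list of available tasks. A minimal sketch of that kind of lookup follows, using only names copied from the log. `is_task_registered` and `HYDRA_TASKS` are hypothetical names for illustration, not part of fairseq or rvc-cli, and this is not fairseq's actual implementation:

```python
# Hypothetical illustration of a task-name lookup; not fairseq's real code.
# Task names copied from the "Available hydra tasks" list in the log above.
HYDRA_TASKS = {
    "audio_pretraining", "audio_finetuning", "hubert_pretraining",
    "language_modeling", "masked_lm", "multilingual_language_modeling",
    "speech_unit_modeling", "translation", "sentence_prediction",
    "sentence_prediction_adapters", "simul_text_to_text",
    "translation_from_pretrained_xlm", "translation_lev",
    "dummy_lm", "dummy_masked_lm",
}


def is_task_registered(task_cfg: dict, registry: set) -> bool:
    """Return True if the task named by the config's '_name' key is in the registry."""
    return task_cfg.get("_name") in registry


# Subset of the task config shown in the error message.
checkpoint_task = {"_name": "contentvec_pretraining", "sample_rate": 16000}
print(is_task_registered(checkpoint_task, HYDRA_TASKS))
```

The sketch only shows why the lookup fails for this task name. It does not say what is needed for the task to be registered in this environment.
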

juangea commented 2 weeks ago

The antivirus is deleting the file, saying it's:

https://www.superantispyware.com/malwarefiles/x86_64-w64-mingw32-gcc-ranlib.exe.html

It won't let me run this on my system. Is there some other solution?

blaisewf commented 2 weeks ago

disable your antivirus