JarodMica / ai-voice-cloning

GNU General Public License v3.0

Installed tts 2.0 but got this error #24

Closed: Aliasthump closed this 5 months ago

Aliasthump commented 6 months ago

Windows - NVIDIA 3050

"Possible latent mismatch: click the "(Re)Compute Voice Latents" button and then try again. Error: Workspace can't be allocated, no enough memory."

Here's the whole thing.

```
C:\ai-voice-cloning-v2_0\ai-voice-cloning>set PYTHONUTF8=1

C:\ai-voice-cloning-v2_0\ai-voice-cloning>runtime\python.exe .\src\main.py
2024-01-24 12:40:24 | INFO | rvc.configs.config | Found GPU NVIDIA GeForce RTX 3050 Ti Laptop GPU
Whisper detected
Traceback (most recent call last):
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\src\utils.py", line 98, in <module>
    from vall_e.emb.qnt import encode as valle_quantize
ModuleNotFoundError: No module named 'vall_e'

Traceback (most recent call last):
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\src\utils.py", line 118, in <module>
    import bark
ModuleNotFoundError: No module named 'bark'
```
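(The two ModuleNotFoundError tracebacks above are non-fatal: the VALL-E and Bark backends are optional, and startup continues to the web UI below. A minimal sketch of the optional-import guard that produces this kind of output, assuming the structure rather than quoting src/utils.py:)

```python
# A sketch of an optional-backend import guard, assuming the structure of
# src/utils.py rather than quoting it: a missing backend prints its
# traceback and is disabled, but startup continues.
try:
    from vall_e.emb.qnt import encode as valle_quantize  # noqa: F401
    VALLE_ENABLED = True
except ModuleNotFoundError:
    import traceback
    traceback.print_exc()  # prints "No module named 'vall_e'" as above
    VALLE_ENABLED = False
```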

```
[textbox, textbox, radio, textbox, dropdown, audio, number, slider, number, slider, slider, slider, radio, slider, slider, slider, slider, slider, slider, slider, checkboxgroup, checkbox, checkbox]
[dropdown, slider, dropdown, slider, slider, slider, slider, slider]
Running on local URL: http://127.0.0.1:7861

To create a public link, set share=True in launch().
Loading TorToiSe... (AR: ./training/CoolGuy/finetune/models/26_gpt.pth, diffusion: ./models/tortoise/diffusion_decoder.pth, vocoder: bigvgan_24khz_100band)
Hardware acceleration found: cuda
use_deepspeed api_debug True
C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\torch\nn\utils\weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Loading tokenizer JSON: ./modules/tortoise-tts/tortoise/data/tokenizer.json
Loaded tokenizer
Loading autoregressive model: ./training/CoolGuy/finetune/models/26_gpt.pth
[2024-01-24 12:50:02,448] [INFO] [logging.py:96:log_dist] [Rank -1] DeepSpeed info: version=0.8.3+6eca037c, git-hash=6eca037c, git-branch=HEAD
[2024-01-24 12:50:02,448] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter mp_size is deprecated use tensor_parallel.tp_size instead
[2024-01-24 12:50:02,448] [INFO] [logging.py:96:log_dist] [Rank -1] quantize_bits = 8 mlp_extra_grouping = False, quantize_groups = 1
WARNING! Setting BLOOMLayerPolicy._orig_layer_class to None due to Exception: module 'transformers.models' has no attribute 'bloom'
[2024-01-24 12:50:02,476] [INFO] [logging.py:96:log_dist] [Rank -1] DeepSpeed-Inference config: {'layer_id': 0, 'hidden_size': 1024, 'intermediate_size': 4096, 'heads': 16, 'num_hidden_layers': -1, 'fp16': False, 'pre_layer_norm': True, 'local_rank': -1, 'stochastic_mode': False, 'epsilon': 1e-05, 'mp_size': 1, 'q_int8': False, 'scale_attention': True, 'triangular_masking': True, 'local_attention': False, 'window_size': 1, 'rotary_dim': -1, 'rotate_half': False, 'rotate_every_two': True, 'return_tuple': True, 'mlp_after_attn': True, 'mlp_act_func_type': <ActivationFuncType.GELU: 1>, 'specialized_mode': False, 'training_mp_size': 1, 'bigscience_bloom': False, 'max_out_tokens': 1024, 'scale_attn_by_inverse_layer_idx': False, 'enable_qkv_quantization': False, 'use_mup': False, 'return_single_tuple': False}
Loaded autoregressive model
Loaded diffusion model
Loading vocoder model: bigvgan_24khz_100band
Loading vocoder model: bigvgan_24khz_100band.pth
Removing weight norm...
Loaded vocoder model
Loaded TTS, ready for generation.
2024-01-24 13:40:01 | INFO | httpx | HTTP Request: POST http://127.0.0.1:7861/api/predict "HTTP/1.1 200 OK"
2024-01-24 13:40:01 | INFO | httpx | HTTP Request: POST http://127.0.0.1:7861/api/predict "HTTP/1.1 200 OK"
2024-01-24 13:40:01 | INFO | httpx | HTTP Request: POST http://127.0.0.1:7861/reset "HTTP/1.1 200 OK"
2024-01-24 13:40:01 | INFO | httpx | HTTP Request: POST http://127.0.0.1:7861/reset "HTTP/1.1 200 OK"
2024-01-24 13:40:23 | INFO | httpx | HTTP Request: POST http://127.0.0.1:7861/api/predict "HTTP/1.1 200 OK"
2024-01-24 13:40:23 | INFO | httpx | HTTP Request: POST http://127.0.0.1:7861/reset "HTTP/1.1 200 OK"
2024-01-24 13:40:51 | INFO | httpx | HTTP Request: POST http://127.0.0.1:7861/api/predict "HTTP/1.1 200 OK"
[1/1] Generating line: This is a test to see if my voice is back.
```
```
Loading voice: CoolGuy with model c3c14d84
Loading voice: CoolGuy
2024-01-24 13:40:51 | INFO | httpx | HTTP Request: POST http://127.0.0.1:7861/reset "HTTP/1.1 200 OK"
Reading from latent: ./voices/CoolGuy//cond_latents_c3c14d84.pth
{'temperature': 0.2, 'top_p': 0.8, 'diffusion_temperature': 1.0, 'length_penalty': 1.0, 'repetition_penalty': 2.0, 'cond_free_k': 2.0, 'num_autoregressive_samples': 2, 'sample_batch_size': 1, 'diffusion_iterations': 30, 'voice_samples': None, 'conditioning_latents': (tensor([[-1.7025, 0.4967, 0.6810, ..., -3.8491, -1.0170, 0.4766]]), tensor([[-0.9459, -1.1040, -0.7465, ..., -0.0278, -0.0548, 0.2184]])), 'use_deterministic_seed': None, 'return_deterministic_state': True, 'k': 1, 'diffusion_sampler': 'DDIM', 'breathing_room': 8, 'half_p': False, 'cond_free': True, 'cvvp_amount': 0, 'autoregressive_model': './training/CoolGuy/finetune/models/26_gpt.pth', 'diffusion_model': './models/tortoise/diffusion_decoder.pth', 'tokenizer_json': './modules/tortoise-tts/tortoise/data/tokenizer.json'}
Requested: 104890368 Free: 0 Total: 4294443008
Traceback (most recent call last):
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\src\utils.py", line 1235, in generate_tortoise
    gen, additionals = tts.tts(cut_text, **settings)
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\src\tortoise\api.py", line 746, in tts
    codes = self.autoregressive.inference_speech(auto_conditioning, text_tokens,
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\src\tortoise\models\autoregressive.py", line 560, in inference_speech
    gen = self.inference_model.generate(inputs, bos_token_id=self.start_mel_token, pad_token_id=self.stop_mel_token, eos_token_id=self.stop_mel_token,
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\transformers\generation_utils.py", line 1310, in generate
    return self.sample(
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\transformers\generation_utils.py", line 1926, in sample
    outputs = self(
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\src\tortoise\models\autoregressive.py", line 150, in forward
    transformer_outputs = self.transformer(
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\transformers\models\gpt2\modeling_gpt2.py", line 889, in forward
    outputs = block(
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\deepspeed\model_implementations\transformers\ds_transformer.py", line 114, in forward
    self.allocate_workspace(self.config.hidden_size, self.config.heads,
RuntimeError: Workspace can't be allocated, no enough memory.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\gradio\routes.py", line 394, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\gradio\blocks.py", line 1075, in process_api
    result = await self.call_function(
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\gradio\blocks.py", line 884, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\runtime\lib\site-packages\gradio\helpers.py", line 587, in tracked_fn
    response = fn(*args)
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\src\webui.py", line 129, in generate_proxy
    raise e
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\src\webui.py", line 123, in generate_proxy
    sample, outputs, stats = generate(**kwargs)
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\src\utils.py", line 364, in generate
    return generate_tortoise(**kwargs)
  File "C:\ai-voice-cloning-v2_0\ai-voice-cloning\src\utils.py", line 1238, in generate_tortoise
    raise RuntimeError(f'Possible latent mismatch: click the "(Re)Compute Voice Latents" button and then try again. Error: {e}')
RuntimeError: Possible latent mismatch: click the "(Re)Compute Voice Latents" button and then try again. Error: Workspace can't be allocated, no enough memory.
2024-01-24 13:40:51 | INFO | httpx | HTTP Request: POST http://127.0.0.1:7861/api/predict "HTTP/1.1 500 Internal Server Error"
2024-01-24 13:40:51 | INFO | httpx | HTTP Request: POST http://127.0.0.1:7861/reset "HTTP/1.1 200 OK"
```
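(Note how the two tracebacks chain: the real failure is DeepSpeed's workspace allocation in ds_transformer.py, and generate_tortoise at src/utils.py line 1238 catches it and re-raises with the latent-mismatch hint, so any error inside tts.tts(), memory-related or otherwise, surfaces under that message. A simplified sketch of that wrapper pattern, with names taken from the traceback and everything else assumed:)

```python
# Simplified sketch of the re-raise visible at src/utils.py line 1238
# (names from the traceback, bodies assumed): any failure inside
# tts.tts(), including the DeepSpeed OOM above, is rewrapped with the
# "(Re)Compute Voice Latents" hint.
def generate_tortoise(tts, cut_text, settings):
    try:
        gen, additionals = tts.tts(cut_text, **settings)
    except Exception as e:
        raise RuntimeError(
            'Possible latent mismatch: click the "(Re)Compute Voice Latents" '
            f'button and then try again. Error: {e}'
        )
    return gen, additionals
```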

JarodMica commented 6 months ago

Error: Workspace can't be allocated, no enough memory

Is it the 4GB or 8GB version of the 3050? You're running out of memory during inference here. There's an option to enable "Low VRAM" in the Settings tab; can you turn that on and see if you can generate?
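For reference, the `Requested: 104890368 Free: 0 Total: 4294443008` line in the log decodes to a roughly 100 MB DeepSpeed workspace request on a card with about 4 GB of total VRAM and none free. A generic PyTorch snippet (not part of this repo) that you can run with runtime\python.exe to confirm the variant and how much VRAM is actually free:

```python
# Generic PyTorch VRAM check (not part of ai-voice-cloning): prints the
# GPU name plus free/total memory, matching the Free/Total bytes in the log.
import torch

if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info(0)  # both values in bytes
    name = torch.cuda.get_device_name(0)
    print(f"{name}: {free / 2**20:.0f} MiB free of {total / 2**20:.0f} MiB")
else:
    print("CUDA not available")
```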

Aliasthump commented 6 months ago

I’m sure it’s the memory. I’ll try your fix.


JarodMica commented 5 months ago

If this issue is not resolved, feel free to reopen it. Closing for now.