Stability-AI / stable-audio-tools

Generative models for conditional audio generation
MIT License

Init audio conditioning doesn't work with --model-half #80

Open IIPedro opened 4 months ago

IIPedro commented 4 months ago

When starting the Gradio interface on Windows with --model-half, both init audio and audio inpainting fail with a tensor dtype error (FloatTensor vs. HalfTensor). I followed the installation guidelines in the repo README, and everything else works perfectly fine!

Here's the command used to start the interface:

python run_gradio.py --ckpt-path model.ckpt --model-config model_config.json --model-half

Here's the error log:

```
Traceback (most recent call last):
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\gradio\queueing.py", line 521, in process_events
    response = await route_utils.call_process_api(
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\gradio\route_utils.py", line 276, in call_process_api
    output = await app.get_blocks().process_api(
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\gradio\blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\gradio\blocks.py", line 1513, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\anyio\_backends\_asyncio.py", line 2177, in run_sync_in_worker_thread
    return await future
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\anyio\_backends\_asyncio.py", line 859, in run
    result = context.run(func, *args)
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\gradio\utils.py", line 832, in wrapper
    response = f(*args, **kwargs)
  File "C:\Users\User\Documents\AI\stable-audio-tools\stable_audio_tools\interface\gradio.py", line 167, in generate_cond
    audio = generate_diffusion_cond(
  File "C:\Users\User\Documents\AI\stable-audio-tools\stable_audio_tools\inference\generation.py", line 179, in generate_diffusion_cond
    init_audio = model.pretransform.encode(init_audio)
  File "C:\Users\User\Documents\AI\stable-audio-tools\stable_audio_tools\models\pretransforms.py", line 56, in encode
    encoded = self.model.encode_audio(x, chunked=self.chunked, iterate_batch=self.iterate_batch, **kwargs)
  File "C:\Users\User\Documents\AI\stable-audio-tools\stable_audio_tools\models\autoencoders.py", line 445, in encode_audio
    return self.encode(audio, **kwargs)
  File "C:\Users\User\Documents\AI\stable-audio-tools\stable_audio_tools\models\autoencoders.py", line 302, in encode
    latents.append(self.encoder(audio[i:i+1]))
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\User\Documents\AI\stable-audio-tools\stable_audio_tools\models\autoencoders.py", line 147, in forward
    return self.layers(x)
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\container.py", line 215, in forward
    input = module(input)
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1568, in _call_impl
    result = forward_call(*args, **kwargs)
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\conv.py", line 310, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\conv.py", line 306, in _conv_forward
    return F.conv1d(input, weight, bias, self.stride,
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same
```

ghost commented 3 months ago

Getting the same error here on Windows 10

chickenbox commented 6 days ago

I managed to make it work by making this code change:

[Screenshot attachment: Screenshot 2024-09-30 115237]
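The screenshot isn't transcribed above, but a change in that spirit would cast the init audio to the pretransform's dtype before it is encoded. A minimal sketch, assuming the patch sits near the model.pretransform.encode(init_audio) call in stable_audio_tools/inference/generation.py shown in the traceback (names follow that traceback; this is not necessarily the exact change in the screenshot):

```python
# Sketch: cast the init audio to the (possibly half-precision) pretransform's dtype
# so the conv layers inside encode() see matching input and weight types.
pretransform_dtype = next(model.pretransform.parameters()).dtype
init_audio = init_audio.to(dtype=pretransform_dtype)

init_audio = model.pretransform.encode(init_audio)
```

Wrapping the encode call in torch.autocast("cuda") might also sidestep the mismatch without editing the library, since autocast downcasts conv inputs automatically.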