rsxdalv / tts-generation-webui

TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS)
https://rsxdalv.github.io/tts-generation-webui/
MIT License

IndexError: index 4 is out of range when Multi-band Diffusion is checked #275

Closed: mykeehu closed this issue 4 months ago

mykeehu commented 4 months ago

On the MusicGen+AudioGen tab, if I check the Use Multi-band Diffusion checkbox, I get this error when generating:

Generating: ''' a romantic pop walz instrumental in 90's music, melody. 3/4 100bpm 320kbps 48khz Stereo '''
Parameters:
text : a romantic pop walz instrumental in 90's music, melody. 3/4 100bpm 320kbps 48khz Stereo
melody : None
model : facebook/musicgen-stereo-melody
duration : 10
topk : 500
topp : 0
temperature : 1
cfg_coef : 3
seed : 1533664423
use_multi_band_diffusion : True
Generated in 11.091 seconds
Traceback (most recent call last):
  File "I:\tts-generation-webui\installer_files\env\lib\site-packages\gradio\queueing.py", line 407, in call_prediction
    output = await route_utils.call_process_api(
  File "I:\tts-generation-webui\installer_files\env\lib\site-packages\gradio\route_utils.py", line 226, in call_process_api
    output = await app.get_blocks().process_api(
  File "I:\tts-generation-webui\installer_files\env\lib\site-packages\gradio\blocks.py", line 1550, in process_api
    result = await self.call_function(
  File "I:\tts-generation-webui\installer_files\env\lib\site-packages\gradio\blocks.py", line 1185, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "I:\tts-generation-webui\installer_files\env\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "I:\tts-generation-webui\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "I:\tts-generation-webui\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "I:\tts-generation-webui\installer_files\env\lib\site-packages\gradio\utils.py", line 661, in wrapper
    response = f(*args, **kwargs)
  File "I:\tts-generation-webui\tts-generation-webui\src\musicgen\musicgen_tab.py", line 210, in generate
    wav_diffusion = mbd.tokens_to_wav(tokens)
  File "I:\tts-generation-webui\installer_files\env\lib\site-packages\audiocraft\models\multibanddiffusion.py", line 188, in tokens_to_wav
    wav_encodec = self.codec_model.decode(tokens)
  File "I:\tts-generation-webui\installer_files\env\lib\site-packages\audiocraft\models\encodec.py", line 251, in decode
    emb = self.decode_latent(codes)
  File "I:\tts-generation-webui\installer_files\env\lib\site-packages\audiocraft\models\encodec.py", line 259, in decode_latent
    return self.quantizer.decode(codes)
  File "I:\tts-generation-webui\installer_files\env\lib\site-packages\audiocraft\quantization\vq.py", line 102, in decode
    quantized = self.vq.decode(codes)
  File "I:\tts-generation-webui\installer_files\env\lib\site-packages\audiocraft\quantization\core_vq.py", line 402, in decode
    layer = self.layers[i]
  File "I:\tts-generation-webui\installer_files\env\lib\site-packages\torch\nn\modules\container.py", line 295, in __getitem__
    return self._modules[self._get_abs_string_index(idx)]
  File "I:\tts-generation-webui\installer_files\env\lib\site-packages\torch\nn\modules\container.py", line 285, in _get_abs_string_index
    raise IndexError('index {} is out of range'.format(idx))
IndexError: index 4 is out of range

What is the problem?

rsxdalv commented 4 months ago

Hi, thank you for the report!

I have not seen this before, so I will need to do some research. Does it work for you without Multi-band Diffusion (MBD)?

I'm worried that MBD + Stereo might be broken at the moment.

rsxdalv commented 4 months ago

OK, I have confirmed that MBD will not work with stereo models in the current configuration without a patch from me. Related reports: https://github.com/facebookresearch/audiocraft/issues/401 and https://github.com/replicate/cog-musicgen/issues/9
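For readers following along, here is a minimal sketch of what the traceback suggests is happening. It rests on assumptions that this thread does not confirm: that the stereo model emits 8 interleaved codebooks (left and right alternating) while the MBD codec holds only 4 quantizer layers, so the lookup fails at index 4. The class `ToyResidualDecoder` and the helpers `split_interleaved` and `decode_stereo` are hypothetical stand-ins written in plain Python; they are not audiocraft code and not necessarily what the patch does.

```python
# Toy stand-in for a residual vector quantizer decoder. NOT audiocraft code.
class ToyResidualDecoder:
    def __init__(self, n_layers):
        # One lookup "layer" per codebook the decoder supports.
        # Layer i simply scales a token by (i + 1) so results are easy to check.
        self.layers = [lambda code, scale=i + 1: code * scale
                       for i in range(n_layers)]

    def decode(self, codes):
        # codes: list of K codebook streams, each a list of T integer tokens.
        n_steps = len(codes[0])
        out = [0] * n_steps
        for i, stream in enumerate(codes):
            layer = self.layers[i]  # raises IndexError when K > n_layers
            for t, code in enumerate(stream):
                out[t] += layer(code)
        return out


def split_interleaved(codes):
    """Split codebooks assumed to be ordered [L0, R0, L1, R1, ...]."""
    return codes[0::2], codes[1::2]


def decode_stereo(decoder, codes):
    """Decode each channel separately, then return [left_wav, right_wav]."""
    left, right = split_interleaved(codes)
    return [decoder.decode(left), decoder.decode(right)]


decoder = ToyResidualDecoder(n_layers=4)
stereo_codes = [[k] * 3 for k in range(8)]  # 8 codebooks, 3 time steps

try:
    decoder.decode(stereo_codes)  # 8 streams into a 4-layer decoder
except IndexError as err:
    print("direct decode failed:", err)

print(decode_stereo(decoder, stereo_codes))
```

Decoding all 8 streams at once fails on the fifth stream (index 4), which matches the index reported above, although the toy raises Python's list error message rather than the torch one. Splitting the streams by channel keeps each call within the 4 layers the decoder has.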

rsxdalv commented 4 months ago

I created a patch on the branch https://github.com/rsxdalv/tts-generation-webui/tree/mbd-stereo. I have not been able to test it yet, but if you would like to try it, let me know. Once I have tested it, I will include it in the repo.

rsxdalv commented 4 months ago

I have tested and finalized the code, so it should just work now. Please let me know if you experience any more problems.

mykeehu commented 4 months ago

Works perfectly, thanks for the quick fix!