huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

mps failure with tts: IndexError: tuple index out of range in pytorch_utils.py #33786

Open ajkessel opened 1 week ago

ajkessel commented 1 week ago

System Info

Who can help?

No response

Information

Tasks

Reproduction

I'm not sure if this is a transformers bug, a coqui-ai bug, or just a lack of MPS support for what I'm trying to do.

Same result whether PYTORCH_ENABLE_MPS_FALLBACK is set or not.

Python code:

from TTS.api import TTS

tts = TTS(model_name='multi-dataset/xtts_v2/en', progress_bar=True).to('mps')
tts.tts_to_file(
    text="The quick brown fox jumped over the lazy dog.",
    speaker='Annmarie Nele',
    language='en',
    file_path='out.wav'
)

result:

  File "/Users/adam/dev/.venv/lib/python3.11/site-packages/TTS/api.py", line 334, in tts_to_file
    wav = self.tts(
          ^^^^^^^^^
  File "/Users/adam/dev/.venv/lib/python3.11/site-packages/TTS/api.py", line 276, in tts
    wav = self.synthesizer.tts(
          ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/adam/dev/.venv/lib/python3.11/site-packages/TTS/utils/synthesizer.py", line 386, in tts
    outputs = self.tts_model.synthesize(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/adam/dev/.venv/lib/python3.11/site-packages/TTS/tts/models/xtts.py", line 412, in synthesize
    return self.inference(text, language, gpt_cond_latent, speaker_embedding, **settings)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/adam/dev/.venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/adam/dev/.venv/lib/python3.11/site-packages/TTS/tts/models/xtts.py", line 541, in inference
    gpt_codes = self.gpt.generate(
                ^^^^^^^^^^^^^^^^^^
  File "/Users/adam/dev/.venv/lib/python3.11/site-packages/TTS/tts/layers/xtts/gpt.py", line 590, in generate
    gen = self.gpt_inference.generate(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/adam/dev/.venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/adam/dev/.venv/lib/python3.11/site-packages/transformers/generation/utils.py", line 1829, in generate
    self._prepare_special_tokens(generation_config, kwargs_has_attention_mask, device=device)
  File "/Users/adam/dev/.venv/lib/python3.11/site-packages/transformers/generation/utils.py", line 1678, in _prepare_special_tokens
    and isin_mps_friendly(elements=eos_token_tensor, test_elements=pad_token_tensor).any()
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/adam/dev/.venv/lib/python3.11/site-packages/transformers/pytorch_utils.py", line 325, in isin_mps_friendly
    return elements.tile(test_elements.shape[0], 1).eq(test_elements.unsqueeze(1)).sum(dim=0).bool().squeeze()
                         ~~~~~~~~~~~~~~~~~~~^^^
IndexError: tuple index out of range
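
The failing expression indexes test_elements.shape[0], which only exists if test_elements has at least one dimension. A minimal sketch of the failure, assuming the pad/eos token reaches isin_mps_friendly as a 0-dim (scalar) tensor:

import torch

token = torch.tensor(0)  # hypothetical scalar token id: a 0-dim tensor
print(token.shape)       # torch.Size([]): the shape tuple is empty
token.shape[0]           # IndexError: tuple index out of range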

I've also reported this as issue 3998 on coqui-ai.

Expected behavior

Successful execution.

Swastik-Swarup-Dash commented 1 week ago

Hey @ajkessel, I think this can work:

import torch
from TTS.api import TTS
from transformers import pytorch_utils

def patched_isin_mps_friendly(elements, test_elements):
    # A 0-dim (scalar) tensor has an empty shape tuple, so promote it to
    # 1-dim before the shape[0] lookup that was raising the IndexError.
    if test_elements.ndim == 0:
        test_elements = test_elements.unsqueeze(0)
    return elements.tile(test_elements.shape[0], 1).eq(test_elements.unsqueeze(1)).sum(dim=0).bool().squeeze()

# Monkey-patch the broken helper before generation runs.
pytorch_utils.isin_mps_friendly = patched_isin_mps_friendly

tts = TTS(model_name='multi-dataset/xtts_v2/en', progress_bar=True).to('mps')
tts.tts_to_file(
    text="The quick brown fox jumped over the lazy dog.",
    speaker='Annmarie Nele',
    language='en',
    file_path='out.wav'
)
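
One caveat with this kind of monkey-patch: in the traceback, generation/utils.py calls isin_mps_friendly unqualified, which suggests it was imported by name (from ..pytorch_utils import isin_mps_friendly). If so, rebinding pytorch_utils.isin_mps_friendly may not reach the reference the generation code already holds, and you may also need to patch that binding:

from transformers.generation import utils as generation_utils

# Also rebind the name that generation/utils.py imported directly.
generation_utils.isin_mps_friendly = patched_isin_mps_friendly
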
markuskreitzer commented 1 week ago

It seems like everything I'm running on my MacBook Pro M1 with the transformers lib is broken now. I'm using Python 3.10. This patch fixes it! Thanks!!!

markuskreitzer commented 1 week ago

@ajkessel This seems to be broken for me on any of the official examples I've used for Llama and Qwen inference models.

ajkessel commented 1 week ago

I tried @Swastik-Swarup-Dash's workaround and got this error:

NotImplementedError: The operator 'aten::upsample_linear1d.out' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.

To the extent it's relevant:

Model Name: iMac
Model Identifier: iMac20,2
Processor Name: 10-Core Intel Core i9
Processor Speed: 3.6 GHz
Number of Processors: 1
Total Number of Cores: 10
L2 Cache (per Core): 256 KB
L3 Cache: 20 MB
Hyper-Threading Technology: Enabled
Memory: 32 GB

At least with this workaround, setting PYTORCH_ENABLE_MPS_FALLBACK=1 does avoid the exception; it just looks like it's not using the GPU at all then.

zachmayer commented 1 week ago

I'm seeing the same issue with MPS inference. CPU inference works fine.

@Swastik-Swarup-Dash — maybe you could make a pull request with your patch!

Swastik-Swarup-Dash commented 1 week ago

@zachmayer Let me give it a try.

Swastik-Swarup-Dash commented 1 week ago

@ajkessel You can try this:

export PYTORCH_ENABLE_MPS_FALLBACK=1

Alternatively, set the environment variable within your script using the os module (note: this should happen before torch, or anything that imports torch, is loaded):

import os

# Must be set before torch is imported for the CPU fallback to take effect.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

Then run the TTS model:

from TTS.api import TTS

tts = TTS(model_name='multi-dataset/xtts_v2/en', progress_bar=True).to('mps')
tts.tts_to_file(
    text="The quick brown fox jumped over the lazy dog.",
    speaker='Annmarie Nele',
    language='en',
    file_path="output.wav"
)

Maybe this can work. Also make sure your macOS version is up to date; MPS support is only available in macOS 12.3 and later. Or, if none of that works, switch to CUDA:

tts = TTS(model_name='multi-dataset/xtts_v2/en', progress_bar=True).to('cuda')
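
A more portable variant is to pick the device at runtime instead of hard-coding it; a minimal sketch, falling back from CUDA to MPS to CPU:

import torch
from TTS.api import TTS

# Choose the best available backend at runtime.
if torch.cuda.is_available():
    device = 'cuda'
elif torch.backends.mps.is_available():
    device = 'mps'
else:
    device = 'cpu'

tts = TTS(model_name='multi-dataset/xtts_v2/en', progress_bar=True).to(device)
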
LysandreJik commented 6 days ago

cc @ArthurZucker

ajkessel commented 5 days ago

With transformers==4.45.1 and tortoise-tts, I get the same IndexError: tuple index out of range error.

With transformers==4.31.0 (the version requested by tortoise-tts), instead I get RuntimeError: MPS backend out of memory.

Attached: transformers-4.31.0-error.txt, transformers-4.45.1-error.txt
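
For the out-of-memory case, PyTorch's MPS OOM message typically suggests lifting the allocator's high-watermark cap as a stopgap (with a warning that it may cause system instability):

export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0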

ArthurZucker commented 4 days ago

cc @eustlb. It seems like self.stop_mel_token is None (not an MPS issue).

ajkessel commented 4 days ago

For what it's worth, all of this same code works fine for me on a Windows box with CUDA (both in Linux (WSL) and native Windows). So even if it's not an MPS issue, it seems to be Mac-specific.

pistudios commented 2 days ago

(Quoting @Swastik-Swarup-Dash's patched_isin_mps_friendly monkey-patch from above.)

You're a lifesaver! I've been struggling for the past few days with a Florence-2 workflow on MPS that suddenly stopped working. I encountered the same error, and by using the method you provided to patch pytorch_utils.isin_mps_friendly, I was able to solve it! Thank you so much!