kunci115 opened 2 months ago
Should I run

export PYTORCH_ENABLE_MPS_FALLBACK=1

first? I get the following traceback:
Traceback (most recent call last):
  File "/Users/leaves/LLM/mini-omni-mac-support/server.py", line 79, in <module>
    fire.Fire(serve)
  File "/Users/leaves/.pyenv/versions/mini-omni/lib/python3.10/site-packages/fire/core.py", line 143, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/Users/leaves/.pyenv/versions/mini-omni/lib/python3.10/site-packages/fire/core.py", line 477, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/Users/leaves/.pyenv/versions/mini-omni/lib/python3.10/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/Users/leaves/LLM/mini-omni-mac-support/server.py", line 72, in serve
    OmniChatServer(ip, port=port, run_app=True, device=device)
  File "/Users/leaves/LLM/mini-omni-mac-support/server.py", line 19, in __init__
    self.client.warm_up()
  File "/Users/leaves/LLM/mini-omni-mac-support/inference.py", line 461, in warm_up
    for _ in self.run_AT_batch_stream(sample):
  File "/Users/leaves/.pyenv/versions/mini-omni/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/Users/leaves/LLM/mini-omni-mac-support/inference.py", line 486, in run_AT_batch_stream
    audio_feature, input_ids = get_input_ids_whisper_ATBatch(mel, leng, self.whispermodel, self.device)
  File "/Users/leaves/LLM/mini-omni-mac-support/inference.py", line 112, in get_input_ids_whisper_ATBatch
    audio_feature = whispermodel.embed_audio(mel)[0][:leng]
  File "/Users/leaves/LLM/mini-omni-mac-support/inference.py", line 380, in embed_audio
    generated_ids = whispermodel.generate(
  File "/Users/leaves/.pyenv/versions/mini-omni/lib/python3.10/site-packages/transformers/models/whisper/generation_whisper.py", line 671, in generate
    ) = self.generate_with_fallback(
  File "/Users/leaves/.pyenv/versions/mini-omni/lib/python3.10/site-packages/transformers/models/whisper/generation_whisper.py", line 832, in generate_with_fallback
    seek_outputs = super().generate(
  File "/Users/leaves/.pyenv/versions/mini-omni/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/leaves/.pyenv/versions/mini-omni/lib/python3.10/site-packages/transformers/generation/utils.py", line 1713, in generate
    self._prepare_special_tokens(generation_config, kwargs_has_attention_mask, device=device)
  File "/Users/leaves/.pyenv/versions/mini-omni/lib/python3.10/site-packages/transformers/generation/utils.py", line 1562, in _prepare_special_tokens
    and torch.isin(elements=eos_token_tensor, test_elements=pad_token_tensor).any()
NotImplementedError: The operator 'aten::isin.Tensor_Tensor_out' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on pytorch/pytorch#77764. As a temporary fix, you can set the environment variable PYTORCH_ENABLE_MPS_FALLBACK=1 to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
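For reference, a minimal sketch of the workaround the error message suggests. The variable has to be set before torch brings up the MPS backend, so either export it in the shell before launching server.py, or set it at the very top of the script as below. The token values here are made up just to exercise the failing op, assuming a torch build where aten::isin has no MPS kernel:

```python
import os

# Must be set before the first `import torch`, otherwise the MPS
# backend may already be initialized without the fallback.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch

if torch.backends.mps.is_available():
    # The exact op from the traceback: aten::isin.Tensor_Tensor_out.
    # Without an MPS kernel this raises NotImplementedError; with the
    # fallback enabled it silently runs on the CPU instead.
    eos_token_tensor = torch.tensor([2], device="mps")
    pad_token_tensor = torch.tensor([0], device="mps")
    print(torch.isin(elements=eos_token_tensor, test_elements=pad_token_tensor).any())
```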
No, you don't need to. Just pip install the requirements and run server.py.
I just tried this, but I still get the very choppy/stuttering audio that I described here: https://github.com/gpt-omni/mini-omni/issues/6
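One thing worth ruling out: since the CPU fallback makes parts of generation slower, the stream may simply fall behind real time, which sounds like stutter on playback. A crude smell test is to compare raw throughput on cpu vs mps on the same machine. The matrix sizes below are arbitrary, not mini-omni's actual workload, so treat the numbers as a rough indicator only:

```python
import time
import torch

def probe(device: str, n: int = 50) -> float:
    """Time n matmuls on `device`; sizes are arbitrary, not the real model load."""
    a = torch.randn(1024, 1024, device=device)
    b = torch.randn(1024, 1024, device=device)
    for _ in range(5):  # warm-up so compilation/caching doesn't skew the timing
        a @ b
    if device == "mps":
        torch.mps.synchronize()  # wait for queued kernels before starting the clock
    t0 = time.perf_counter()
    for _ in range(n):
        a @ b
    if device == "mps":
        torch.mps.synchronize()
    return time.perf_counter() - t0

print(f"cpu: {probe('cpu'):.3f}s")
if torch.backends.mps.is_available():
    print(f"mps: {probe('mps'):.3f}s")
```

If cpu is far slower than mps here, a fallback-heavy path falling behind the audio stream is a plausible cause of the choppiness.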
Is there any update on this cool project?