Closed aboutmydreams closed 1 week ago
@aboutmydreams see if simply replacing this in pyproject.toml will work:
"bitsandbytes>0.37.0; platform_machine != 'arm64' and platform_system != 'Darwin'"
@SWivid Yes, this is also possible, but you also need to use this .apple_silicon.env and add "python-dotenv" to pyproject.toml. I have updated the code and documentation for this part.
but you also need to use this .apple_silicon.env and add "python-dotenv" to pyproject.toml.
Maybe you could explain a bit more about the other parts of the modifications? I'm not that familiar with the Apple silicon dev environment, and I'm not quite convinced about introducing this many new files for this kind of usage.
e.g. what issue did you run into that https://github.com/SWivid/F5-TTS/blob/0f80f25c5fc95aed21a560bec22fed9d237948bf/src/f5_tts/infer/utils_infer.py#L36-L37 does not cover, such that a toml env and extra installation steps are needed?
Download Vocos from huggingface charactr/vocos-mel-24khz
vocab : /Users/apple/coding/learn/F5-TTS/src/f5_tts/infer/examples/vocab.txt
token : custom
model : /Users/apple/.cache/huggingface/hub/models--SWivid--F5-TTS/snapshots/4dcc16f297f2ff98a17b3726b16f5de5a5e45672/F5TTS_Base/model_1200000.safetensors
Starting app...
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
ref_text Hi all, I come form china
gen_text 0 Hi all, good luck
Building prefix dict from the default dictionary ...
Loading model from cache /var/folders/6g/588r34dn1t381b5kfk3282r40000gn/T/jieba.cache
Loading model cost 0.379 seconds.
Prefix dict has been built successfully.
Traceback (most recent call last):
File "/Users/apple/coding/learn/F5-TTS/.venv/lib/python3.12/site-packages/gradio/queueing.py", line 536, in process_events
response = await route_utils.call_process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/apple/coding/learn/F5-TTS/.venv/lib/python3.12/site-packages/gradio/route_utils.py", line 322, in call_process_api
output = await app.get_blocks().process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/apple/coding/learn/F5-TTS/.venv/lib/python3.12/site-packages/gradio/blocks.py", line 1935, in process_api
result = await self.call_function(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/apple/coding/learn/F5-TTS/.venv/lib/python3.12/site-packages/gradio/blocks.py", line 1520, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/apple/coding/learn/F5-TTS/.venv/lib/python3.12/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/apple/coding/learn/F5-TTS/.venv/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "/Users/apple/coding/learn/F5-TTS/.venv/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 943, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/apple/coding/learn/F5-TTS/.venv/lib/python3.12/site-packages/gradio/utils.py", line 826, in wrapper
response = f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/Users/apple/coding/learn/F5-TTS/src/f5_tts/infer/infer_gradio.py", line 217, in basic_tts
audio_out, spectrogram_path, ref_text_out = infer(
^^^^^^
File "/Users/apple/coding/learn/F5-TTS/src/f5_tts/infer/infer_gradio.py", line 136, in infer
final_wave, final_sample_rate, combined_spectrogram = infer_process(
^^^^^^^^^^^^^^
File "/Users/apple/coding/learn/F5-TTS/src/f5_tts/infer/utils_infer.py", line 366, in infer_process
return infer_batch_process(
^^^^^^^^^^^^^^^^^^^^
File "/Users/apple/coding/learn/F5-TTS/src/f5_tts/infer/utils_infer.py", line 451, in infer_batch_process
generated_wave = vocoder.decode(generated_mel_spec)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/apple/coding/learn/F5-TTS/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/apple/coding/learn/F5-TTS/.venv/lib/python3.12/site-packages/vocos/pretrained.py", line 113, in decode
audio_output = self.head(x)
^^^^^^^^^^^^
File "/Users/apple/coding/learn/F5-TTS/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/apple/coding/learn/F5-TTS/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/apple/coding/learn/F5-TTS/.venv/lib/python3.12/site-packages/vocos/heads.py", line 68, in forward
audio = self.istft(S)
^^^^^^^^^^^^^
File "/Users/apple/coding/learn/F5-TTS/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/apple/coding/learn/F5-TTS/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/apple/coding/learn/F5-TTS/.venv/lib/python3.12/site-packages/vocos/spectral_ops.py", line 46, in forward
return torch.istft(spec, self.n_fft, self.hop_length, self.win_length, self.window, center=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NotImplementedError: The operator 'aten::unfold_backward' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
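As the error message itself suggests, the workaround is the `PYTORCH_ENABLE_MPS_FALLBACK` variable, but it is only read when PyTorch initializes. A minimal sketch of the required ordering (the trailing torch lines are commented out and only illustrate where the import would go):

```python
import os

# PYTORCH_ENABLE_MPS_FALLBACK is read when PyTorch initializes, so it
# must be in the environment before the first `import torch` anywhere
# in the process.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

# Only import torch after the flag is set, e.g.:
# import torch
# torch.istft(...) will then fall back to CPU for MPS ops that are
# not yet implemented, such as aten::unfold_backward.
```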
Option 1
# from dotenv import load_dotenv
# import os
# load_dotenv()
# if os.getenv("PYTORCH_ENABLE_MPS_FALLBACK") == "1":
#     print("You are using the version optimized for Apple silicon.")
Option 2
import os
import torch

device = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
if device == "mps":
    print("You are using the version optimized for Apple silicon.")
    os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"
I found that with Option 2 the message is printed, but os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1" does not take effect. I think this is what happens when we use the f5-tts_infer-gradio script.
In Option 2, although the message prints successfully, os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] does not really take effect. This is because changes to os.environ only affect the current Python process and its subprocesses, and cannot dynamically change the behavior of already-loaded libraries (such as PyTorch).
PyTorch reads environment variables when it is first loaded, so the timing of modifying them matters: setting an environment variable after PyTorch has been imported may have no effect on its runtime configuration.
@aboutmydreams understood, thanks~
Then would it make sense to just put os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1" at the very top of utils_infer.py, since infer_cli and infer_gradio do not contain import torch themselves (which is intended, as all functions are organized in utils_infer.py)?
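A sketch of what the top of utils_infer.py could look like under this proposal; the ordering is the point, and the commented-out imports are only placeholders for the module's real ones:

```python
# Hypothetical first lines of utils_infer.py: set the fallback flag
# before torch is imported, so every entry point that imports this
# module (infer_cli, infer_gradio, ...) gets it automatically.
import os

# setdefault (rather than plain assignment) leaves the variable alone
# if the user has already set it in their shell.
os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")

# import torch             # torch import comes only after the flag
# from vocos import Vocos  # ...followed by the module's other imports
```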
I'm not sure whether the PR version of the env setup will persist for speech_edit.py or api.py and such when launching a new CLI terminal. If it persists, I think it's a good way; otherwise, is putting it somewhere like .bashrc, as on Linux, possible on a Mac?
Hi @SWivid, here's my understanding:
Placing os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1" in utils_infer.py: this seems like a good approach, since both infer_cli and infer_gradio rely on utils_infer.py for all PyTorch-related functionality. It ensures the environment variable is set early enough, before PyTorch is loaded.
Regarding the persistence of environment variables: dynamically setting os.environ in the code only affects the current Python process and any child processes. It won't persist across new CLI sessions or other independent scripts like speech_edit.py or api.py.
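That process/child scoping can be checked directly; a small self-contained demonstration:

```python
import os
import subprocess
import sys

# A change to os.environ is visible to this process and to child
# processes it spawns, but not to independently launched terminals.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

# Spawn a fresh Python interpreter and read the variable from it.
child = subprocess.run(
    [sys.executable, "-c",
     "import os; print(os.environ.get('PYTORCH_ENABLE_MPS_FALLBACK'))"],
    capture_output=True,
    text=True,
)
print(child.stdout.strip())  # the spawned child inherits the variable
```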
To make the variable globally available on macOS, users can add the line below to their .zshrc (or .bashrc, depending on their shell):
export PYTORCH_ENABLE_MPS_FALLBACK=1
After saving the file, they should run source ~/.zshrc to apply the change. This way, any Python script launched in a new terminal will have access to the environment variable.
Let me know if this addresses your concerns!
Hi @aboutmydreams, yes, fully agreed. See if commit cb8ce3306d70dfbee0e7d2423cc7f06e1c2b9c60 works. Thanks again for this PR; we are not against helpful contributions at all, just trying to keep things clear and transparent to users so the project stays simple to use~