SlyRebula opened this issue 1 month ago (Open)
Hi, we are trying to reproduce your issue.
Hi @SlyRebula, we tried several times but couldn't reproduce your problem; our inference time was around 11s:
(bark) arda@arda-arc01:~/zijie/bark$ python ./synthesize_speech.py --repo-id-or-model-path /mnt/disk1/models/bark-small --text 'IPEX-LLM is a library for running large language model on Intel XPU with very low latency.'
/home/arda/miniforge3/envs/bark/lib/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: '' If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
warn(
2024-08-01 14:28:44,913 - INFO - intel_extension_for_pytorch auto imported
/home/arda/miniforge3/envs/bark/lib/python3.11/site-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
2024-08-01 14:28:45,742 - INFO - Converting the current model to sym_int4 format......
/home/arda/miniforge3/envs/bark/lib/python3.11/site-packages/huggingface_hub/file_download.py:1150: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:10000 for open-end generation.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:10000 for open-end generation.
Inference time: 11.526452779769897 s
You may want to check your environment and make sure no other process is running. Feel free to reach out if you still have problems.
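The `attention_mask` / `pad_token_id` warning in the log above is harmless for a single prompt, but the mask the model asks for is easy to build by hand. A minimal pure-Python sketch of the 0/1 convention `generate()` expects (the helper name and the sample token ids are ours for illustration; only the pad id 10000 comes from the log):

```python
PAD_TOKEN_ID = 10000  # the eos_token_id the log falls back to for padding

def build_attention_mask(input_ids, pad_token_id=PAD_TOKEN_ID):
    """Return a 0/1 mask: 1 for real tokens, 0 for padding positions.

    This is the convention you follow when passing `attention_mask`
    explicitly instead of letting the library guess it from the pad id.
    """
    return [[0 if tok == pad_token_id else 1 for tok in row]
            for row in input_ids]

# Two prompts of unequal length, right-padded with the pad id:
batch = [[101, 2054, 2003], [101, 2129, PAD_TOKEN_ID]]
print(build_attention_mask(batch))  # → [[1, 1, 1], [1, 1, 0]]
```

Passing such a mask (as a tensor) alongside the input ids silences the warning and makes batched generation deterministic with respect to padding.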
Hello, I am attempting to do text-to-speech with Bark on an Intel Arc A770, but it takes around 60 seconds to generate audio. Is that normal? Is there a way to make it faster, e.g. a few seconds? https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models/Model/bark
(phytia2) C:\phytia\Phytia>python ./synthesize_speech.py --text "IPEX-LLM is a library for running large language model on Intel XPU with very low latency."
C:\Users\SlyRebula\miniconda3\envs\phytia2\Lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension: 'Could not find module 'C:\Users\SlyRebula\miniconda3\envs\Phytia2\Lib\site-packages\torchvision\image.pyd' (or one of its dependencies). Try using the full path with constructor syntax.' If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
warn(
2024-07-31 13:47:18,476 - INFO - intel_extension_for_pytorch auto imported
C:\Users\SlyRebula\miniconda3\envs\phytia2\Lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
C:\Users\SlyRebula\miniconda3\envs\phytia2\Lib\site-packages\torch\nn\utils\weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
2024-07-31 13:47:22,731 - INFO - Converting the current model to sym_int4 format......
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:10000 for open-end generation.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:10000 for open-end generation.
Inference time: 54.660537242889404 s
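Both runs end with an `Inference time: ... s` line, which is just a wall-clock measurement around the synthesis call; wrapping the call yourself makes it easy to compare runs while ruling out other factors. A minimal sketch (the `timed` helper is ours, not part of the example script):

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn, print wall-clock time in the same format as the script."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"Inference time: {elapsed} s")
    return result, elapsed

# Usage with a stand-in for the real model.generate(...) call:
audio, seconds = timed(lambda: "generated audio")
```

Note that the first call after loading a model often includes one-time warm-up cost (kernel compilation, weight conversion), so it is worth timing a second run before comparing numbers.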