Is there an existing issue / discussion for this?
[X] I have searched the existing issues / discussions
Is there an existing answer for this in FAQ?
[X] I have searched FAQ
Current Behavior
Entering any input in the WebUI (e.g. "Hello!") raises the error below at runtime, and no output is produced. The full log follows.
(Qwen) fd@fd:~/makeover/Qwen-VL$ python web_demo_mm.py --checkpoint-path ~/.cache/huggingface/hub/models--Qwen--Qwen-VL-Chat/snapshots/f57cfbd358cb56b710d963669ad1bcfb44cdcdd8/
2024-06-03 02:46:51,324 - modelscope - INFO - PyTorch version 2.3.0 Found.
2024-06-03 02:46:51,324 - modelscope - INFO - Loading ast index from /home/fd/.cache/modelscope/ast_indexer
2024-06-03 02:46:51,335 - modelscope - INFO - Loading done! Current index file version is 1.14.0, with md5 44b8eafcb244e1a7318e2afee9a18c75 and a total number of 976 components indexed
The model is automatically converting to bf16 for faster inference. If you want to disable the automatic precision, please manually add bf16/fp16/fp32=True to "AutoModelForCausalLM.from_pretrained".
Loading checkpoint shards: 100%|████████████████████████████████| 10/10 [00:02<00:00, 3.94it/s]
Running on local URL: http://127.0.0.1:8000
To create a public link, set `share=True` in `launch()`.
User: Hello!
Traceback (most recent call last):
File "/home/fd/.miniconda/envs/Qwen/lib/python3.11/site-packages/gradio/queueing.py", line 521, in process_events
response = await route_utils.call_process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fd/.miniconda/envs/Qwen/lib/python3.11/site-packages/gradio/route_utils.py", line 276, in call_process_api
output = await app.get_blocks().process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fd/.miniconda/envs/Qwen/lib/python3.11/site-packages/gradio/blocks.py", line 1945, in process_api
result = await self.call_function(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fd/.miniconda/envs/Qwen/lib/python3.11/site-packages/gradio/blocks.py", line 1525, in call_function
prediction = await utils.async_iteration(iterator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fd/.miniconda/envs/Qwen/lib/python3.11/site-packages/gradio/utils.py", line 655, in async_iteration
return await iterator.__anext__()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fd/.miniconda/envs/Qwen/lib/python3.11/site-packages/gradio/utils.py", line 648, in __anext__
return await anyio.to_thread.run_sync(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fd/.miniconda/envs/Qwen/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fd/.miniconda/envs/Qwen/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "/home/fd/.miniconda/envs/Qwen/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 851, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fd/.miniconda/envs/Qwen/lib/python3.11/site-packages/gradio/utils.py", line 631, in run_sync_iterator_async
return next(iterator)
^^^^^^^^^^^^^^
File "/home/fd/.miniconda/envs/Qwen/lib/python3.11/site-packages/gradio/utils.py", line 814, in gen_wrapper
response = next(iterator)
^^^^^^^^^^^^^^
File "/home/fd/makeover/Qwen-VL/web_demo_mm.py", line 131, in predict
for response in model.chat_stream(tokenizer, message, history=history):
File "/home/fd/.cache/huggingface/modules/transformers_modules/modeling_qwen.py", line 1021, in stream_generator
for token in self.generate_stream(
^^^^^^^^^^^^^^^^^^^^^
File "/home/fd/.miniconda/envs/Qwen/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/fd/.miniconda/envs/Qwen/lib/python3.11/site-packages/transformers_stream_generator/main.py", line 208, in generate
] = self._prepare_attention_mask_for_generation(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fd/.miniconda/envs/Qwen/lib/python3.11/site-packages/transformers/generation/utils.py", line 473, in _prepare_attention_mask_for_generation
torch.isin(elements=inputs, test_elements=pad_token_id).any()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: isin() received an invalid combination of arguments - got (test_elements=int, elements=Tensor, ), but expected one of:
* (Tensor elements, Tensor test_elements, *, bool assume_unique, bool invert, Tensor out)
* (Number element, Tensor test_elements, *, bool assume_unique, bool invert, Tensor out)
* (Tensor elements, Number test_element, *, bool assume_unique, bool invert, Tensor out)
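The failure can be reproduced outside the WebUI. The following is a minimal sketch, assuming only that PyTorch is installed; the tensor contents and the `pad_token_id` value are made up for illustration and are not the real Qwen-VL token IDs. Judging from the overloads listed in the error message, the scalar overload names its second parameter `test_element` (singular), so a plain `int` passed under the keyword `test_elements` matches none of them, while a tensor passed under the same keyword does.

```python
import torch

# Illustrative values only; not the real Qwen-VL token IDs.
inputs = torch.tensor([[1, 2, 3, 0]])
pad_token_id = 0


def isin_with_keywords(test_elements):
    """Call torch.isin the way the traceback shows: keyword arguments only."""
    return torch.isin(elements=inputs, test_elements=test_elements).any()


# A plain int under the `test_elements` keyword is rejected.
try:
    isin_with_keywords(pad_token_id)
    int_call_failed = False
except TypeError as exc:
    int_call_failed = True
    print("int pad_token_id:", str(exc).splitlines()[0])

# Wrapping the same ID in a tensor matches the (Tensor, Tensor) overload.
tensor_result = isin_with_keywords(torch.tensor([pad_token_id]))
print("tensor pad_token_id:", bool(tensor_result))
```

This suggests the error comes from the type of `pad_token_id` reaching `_prepare_attention_mask_for_generation`, not from the user input itself.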
Expected Behavior
The WebUI responds normally without raising this error.
Steps To Reproduce
Use conda to install the environment. The environment.yaml file I put together is listed below:
Run web_demo_mm.py and enter any input.
Environment
The other installed packages are listed below:
Preliminary Solution
Adding the following code to web_demo_mm.py works around the problem for now. If the community considers this approach acceptable, I am happy to open a pull request.
 def _load_model_tokenizer(args):
     tokenizer = AutoTokenizer.from_pretrained(
         args.checkpoint_path, trust_remote_code=True, resume_download=True, revision='master',
     )
     if args.cpu_only:
         device_map = "cpu"
     else:
         device_map = "cuda"
     model = AutoModelForCausalLM.from_pretrained(
         args.checkpoint_path,
         device_map=device_map,
         trust_remote_code=True,
         resume_download=True,
         revision='master',
     ).eval()
     model.generation_config = GenerationConfig.from_pretrained(
         args.checkpoint_path, trust_remote_code=True, resume_download=True, revision='master',
     )
+    if model.generation_config.pad_token_id is not None:
+        model.generation_config.pad_token_id = torch.tensor(
+            [model.generation_config.pad_token_id], device=model.device
+        )
+    if model.generation_config.eos_token_id is not None:
+        model.generation_config.eos_token_id = torch.tensor(
+            [model.generation_config.eos_token_id], device=model.device
+        )
     return model, tokenizer
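One caveat with the patch above: in a transformers generation config, `eos_token_id` may be a list of IDs as well as a single int, and wrapping a list in another list would produce a 2-D tensor. A slightly more defensive conversion could look like the sketch below. `normalize_token_ids` and `as_token_tensor` are hypothetical helper names introduced here for illustration; they are not part of web_demo_mm.py or transformers, and the sketch has not been tested against the Qwen-VL code path.

```python
from typing import List, Optional


def normalize_token_ids(value) -> Optional[List[int]]:
    """Return token IDs as a flat list of ints, or None if the ID is unset."""
    if value is None:
        return None
    # Tensors and NumPy arrays expose tolist(); this keeps the helper idempotent.
    if hasattr(value, "tolist"):
        value = value.tolist()
    if isinstance(value, int):
        return [value]
    return [int(v) for v in value]


def as_token_tensor(value, device=None):
    """Convert token IDs to a 1-D tensor; None is passed through unchanged."""
    # Imported lazily so normalize_token_ids works without PyTorch installed.
    import torch

    ids = normalize_token_ids(value)
    return None if ids is None else torch.tensor(ids, device=device)


# Illustrative IDs only.
print(normalize_token_ids(7))
print(normalize_token_ids([7, 8]))
print(normalize_token_ids(None))
```

With these helpers, each added `if` block in `_load_model_tokenizer` could be reduced to a single assignment such as `model.generation_config.pad_token_id = as_token_tensor(model.generation_config.pad_token_id, device=model.device)`, and likewise for `eos_token_id`.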