intel-analytics / text-generation-webui

A Gradio web UI for running local LLMs on Intel GPUs (e.g., a local PC with an iGPU, or a discrete GPU such as Arc, Flex, or Max) using IPEX-LLM.

Llama2-7b doesn't support transformers >=4.38 #31

Closed · hkvision closed this 1 month ago

hkvision commented 5 months ago
Traceback (most recent call last):
  File "/home/arda/kai/webui/text-generation-webui/modules/callbacks.py", line 61, in gentask
    ret = self.mfunc(callback=_callback, *args, **self.kwargs)
  File "/home/arda/kai/webui/text-generation-webui/modules/text_generation.py", line 392, in generate_with_callback
    shared.model.generate(**kwargs)
  File "/opt/anaconda3/envs/text-webui-upstream/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/anaconda3/envs/text-webui-upstream/lib/python3.9/site-packages/transformers/generation/utils.py", line 1592, in generate
    return self.sample(
  File "/opt/anaconda3/envs/text-webui-upstream/lib/python3.9/site-packages/transformers/generation/utils.py", line 2696, in sample
    outputs = self(
  File "/opt/anaconda3/envs/text-webui-upstream/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/anaconda3/envs/text-webui-upstream/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/anaconda3/envs/text-webui-upstream/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 1176, in forward
    outputs = self.model(
  File "/opt/anaconda3/envs/text-webui-upstream/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/anaconda3/envs/text-webui-upstream/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
TypeError: llama_model_forward_4_36() got an unexpected keyword argument 'cache_position'
Output generated in 0.17 seconds (0.00 tokens/s, 0 tokens, context 72, seed 1344122438)

A week ago, upstream upgraded to transformers 4.39: https://github.com/oobabooga/text-generation-webui/commit/3ce0d9221b1a0549135cbf3eb81a7bc5b1d64408
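
For context: transformers 4.38 introduced a cache_position argument that the generation loop now passes into LlamaModel.forward, while IPEX-LLM's patched llama_model_forward_4_36 still has the 4.36 signature, hence the TypeError above. As a stopgap, one could wrap the patched forward so the unknown kwarg is dropped; this is an untested sketch of my own (not an IPEX-LLM API), and it only helps if the patched forward derives positions itself:

    # Untested workaround sketch: drop the 4.38-only cache_position kwarg
    # before it reaches IPEX-LLM's 4.36-era patched forward.
    # `model` is assumed to be the already-loaded LlamaForCausalLM instance.
    def drop_cache_position(forward_fn):
        def wrapped(*args, **kwargs):
            # cache_position was added in transformers 4.38 and is unknown
            # to llama_model_forward_4_36, so discard it if present.
            kwargs.pop("cache_position", None)
            return forward_fn(*args, **kwargs)
        return wrapped

    # model.model is the inner LlamaModel whose forward was monkey-patched.
    model.model.forward = drop_cache_position(model.model.forward)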

hkvision commented 5 months ago

Mistral works, so we are using Mistral to test against transformers 4.38. To run Llama, you currently need to downgrade transformers to 4.37 (see the pin sketch after this comment).

cc @jason-dai @sgwhat @shane-huang
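
Until the IPEX-LLM patch catches up, the simplest workaround is to pin transformers below 4.38, e.g. pip install transformers==4.37.2. A minimal runtime guard (my own sketch, not part of the repo; packaging is already installed as a transformers dependency):

    # Sketch: fail fast if an incompatible transformers version is installed.
    import transformers
    from packaging import version

    if version.parse(transformers.__version__) >= version.parse("4.38.0"):
        raise RuntimeError(
            "Llama via IPEX-LLM currently needs transformers<4.38; "
            "try: pip install transformers==4.37.2"
        )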


github-actions[bot] commented 1 month ago

This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.