xverse-ai / XVERSE-13B

XVERSE-13B: A multilingual large language model developed by XVERSE Technology Inc.
Apache License 2.0
648 stars 58 forks

XVERSE-13B-256K WebDemo fails to start; guidance would be appreciated. #29

Closed ysyx2008 closed 3 months ago

ysyx2008 commented 7 months ago

Startup output:

(xverse) yushen@YuShen-Work:~/ai/XVERSE-13B$ python chat_demo.py --port='7860' --model_path='/home/yushen/ai/XVERSE-13B-256K' --tokenizer_path='/home/yushen/ai/XVERSE-13B-256K'

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues

bin /home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so
/home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/bitsandbytes/cextension.py:33: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
/home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: /home/yushen/anaconda3/envs/xverse did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
  warn(msg)
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
CUDA exception! Error code: no CUDA-capable device is detected
CUDA exception! Error code: initialization error
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
/home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: WARNING: No GPU detected! Check your CUDA paths. Proceeding to load CPU-only library...
  warn(msg)
CUDA SETUP: Detected CUDA version 122
CUDA SETUP: Loading binary /home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so...
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████| 15/15 [00:28<00:00, 1.87s/it]
Traceback (most recent call last):
  File "/home/yushen/ai/XVERSE-13B/chat_demo.py", line 116, in <module>
    demo.queue(concurrency_count=4)
  File "/home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/gradio/blocks.py", line 1715, in queue
    raise DeprecationWarning(
DeprecationWarning: concurrency_count has been deprecated. Set the concurrency_limit directly on event listeners e.g. btn.click(fn, ..., concurrency_limit=10) or gr.Interface(concurrency_limit=10). If necessary, the total number of workers can be configured via max_threads in launch().

The dependencies were installed following the setup instructions. The machine is a dual-4090 host.

ysyx2008 commented 7 months ago

Is my torch version too new? I'll test that first.
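A quick way to test a version hypothesis like this is to list what is actually installed in the environment. The sketch below uses only the Python standard library. The package list is an assumption drawn from the names that appear in this thread, not an official requirements list.

```python
from importlib import metadata

# Packages named in this thread's logs and replies (assumed list, adjust as needed).
PACKAGES = ["torch", "transformers", "gradio", "bitsandbytes", "accelerate", "xformers"]


def installed_versions(packages=PACKAGES):
    """Return a dict mapping each package name to its version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name:>14}: {version or 'not installed'}")
```

Comparing this output with the versions pinned in the repository's requirements file should show quickly whether gradio, torch, or transformers has drifted.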

ysyx2008 commented 7 months ago

Solved:

Edit chat_demo.py and change the second-to-last line from

demo.queue(concurrency_count=4)

to

demo.queue()

After that it starts normally.
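If the script has to run under both older and newer Gradio releases, the call can be made tolerant of the deprecation instead of hard-coding one form. This is a minimal sketch: `enable_queue` is a hypothetical helper, not part of chat_demo.py. Note also that `default_concurrency_limit` (the Gradio 4.x keyword) is a per-event limit rather than a total worker count, so it is only roughly comparable to the old `concurrency_count`.

```python
def enable_queue(demo, workers=4):
    """Enable queuing on a Gradio app across major versions (hypothetical helper).

    Older Gradio accepts queue(concurrency_count=...). Gradio 4.x raises
    DeprecationWarning for that keyword, as seen in the traceback above.
    """
    try:
        return demo.queue(concurrency_count=workers)
    except (DeprecationWarning, TypeError):
        # Newer Gradio: the keyword is deprecated (or no longer accepted),
        # so fall back to the per-event default limit.
        return demo.queue(default_concurrency_limit=workers)
```

In chat_demo.py this would replace the failing line with `enable_queue(demo, 4)`. The plain `demo.queue()` fix above also works, but it falls back to Gradio's default concurrency settings.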

ysyx2008 commented 7 months ago

It starts now, but every conversation raises an error:

Exception in thread Thread-7 (generate):
Traceback (most recent call last):
  File "/home/yushen/anaconda3/envs/xverse/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/home/yushen/anaconda3/envs/xverse/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/transformers/generation/utils.py", line 1588, in generate
    return self.sample(
  File "/home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/transformers/generation/utils.py", line 2642, in sample
    outputs = self(
  File "/home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/yushen/.cache/huggingface/modules/transformers_modules/XVERSE-13B-256K/modeling_xverse.py", line 715, in forward
    outputs = self.model(
  File "/home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/yushen/.cache/huggingface/modules/transformers_modules/XVERSE-13B-256K/modeling_xverse.py", line 603, in forward
    layer_outputs = decoder_layer(
  File "/home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/yushen/.cache/huggingface/modules/transformers_modules/XVERSE-13B-256K/modeling_xverse.py", line 311, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/yushen/anaconda3/envs/xverse/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/yushen/.cache/huggingface/modules/transformers_modules/XVERSE-13B-256K/modeling_xverse.py", line 249, in forward
    assert not use_cache, "use_cache is not supported"
AssertionError: use_cache is not supported

seanxuu commented 7 months ago

Same here.

Uchihayht commented 6 months ago

me too

ysyx2008 commented 4 months ago

The open-source ecosystem really isn't in good shape.

miange91 commented 4 months ago

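This is mainly caused by version upgrades of PyTorch and transformers. The 256K modeling_xverse.py uses xformers' flash attention. Comment out line 249 (the `assert not use_cache` line in the traceback) and it should work.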
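A possible alternative that avoids editing the cached model file is to leave the assertion in place and ask transformers not to use the KV cache. `use_cache` is a standard argument to `generate`. Whether this works with XVERSE-13B-256K is an assumption that has not been verified in this thread, and `generation_kwargs` is a hypothetical helper rather than code from chat_demo.py. Disabling the cache also makes decoding noticeably slower, because each step recomputes the whole sequence.

```python
def generation_kwargs(input_ids, streamer=None, max_new_tokens=512, **overrides):
    """Build keyword arguments for model.generate() with the KV cache disabled.

    Hypothetical helper: the names and defaults here are illustrative only.
    """
    kwargs = {
        "input_ids": input_ids,
        "max_new_tokens": max_new_tokens,
        # Assumed to satisfy `assert not use_cache` in modeling_xverse.py.
        "use_cache": False,
    }
    if streamer is not None:
        kwargs["streamer"] = streamer
    # Caller-supplied settings (e.g. temperature, top_p) take precedence.
    kwargs.update(overrides)
    return kwargs


# Hypothetical usage in the demo's background generation thread:
# Thread(target=model.generate, kwargs=generation_kwargs(input_ids, streamer)).start()
```

If this does not help, commenting out line 249 as suggested above remains the fix proposed in this thread.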

miange91 commented 4 months ago

This is mainly caused by version upgrades of PyTorch and transformers. The 256K modeling_xverse.py uses xformers' flash attention. Comment out line 249 and it should work.

miange91 commented 4 months ago

This is mainly caused by version upgrades of PyTorch and transformers. The 256K modeling_xverse.py uses xformers' flash attention. Comment out line 249 and it should work.