Closed chenyangMl closed 1 day ago
Upgrading tiktoken from 0.4.0 to 0.7.0 resolved the problem reported above (E1: index out of bounds: the len is 1 but the index is 1).
A new error, E2, now appears:
Exception has occurred: ValueError (note: full exception trace is shown but execution is paused at: _run_module_as_main)
too many values to unpack (expected 2)
File "/home/xxx/.cache/huggingface/modules/transformers_modules/glm-4v-9b/modeling_chatglm.py", line 562, in forward
cache_k, cache_v = kv_cache
File "/home/xxx/.conda/envs/llm310/lib/python3.10/site-packages/accelerate/hooks.py", line 169, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/xxx/.conda/envs/llm310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/xxx/.cache/huggingface/modules/transformers_modules/glm-4v-9b/modeling_chatglm.py", line 693, in forward
attention_output, kv_cache = self.self_attention(
File "/home/xxx/.conda/envs/llm310/lib/python3.10/site-packages/accelerate/hooks.py", line 169, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/xxx/.conda/envs/llm310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/xxx/.cache/huggingface/modules/transformers_modules/glm-4v-9b/modeling_chatglm.py", line 790, in forward
layer_ret = layer(
File "/home/xxx/.conda/envs/llm310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/xxx/.cache/huggingface/modules/transformers_modules/glm-4v-9b/modeling_chatglm.py", line 1056, in forward
hidden_states, presents, all_hidden_states, all_self_attentions = self.encoder(
File "/home/xxx/.conda/envs/llm310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/xxx/.cache/huggingface/modules/transformers_modules/glm-4v-9b/modeling_chatglm.py", line 1188, in forward
transformer_outputs = self.transformer(
File "/home/xxx/.conda/envs/llm310/lib/python3.10/site-packages/accelerate/hooks.py", line 169, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/xxx/.conda/envs/llm310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/xxx/.conda/envs/llm310/lib/python3.10/site-packages/transformers/generation/utils.py", line 2651, in _sample
outputs = self(
File "/home/xxx/.conda/envs/llm310/lib/python3.10/site-packages/transformers/generation/utils.py", line 1914, in generate
result = self._sample(
File "/home/xxx/.conda/envs/llm310/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/mnt1/works/mm-llm/glm-4v-9b-Demo/test.py", line 31, in
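For readers unfamiliar with this class of error, here is a minimal sketch of how a two-name unpacking such as `cache_k, cache_v = kv_cache` fails when the object on the right does not contain exactly two items. The trace does not show what `kv_cache` actually held, so the three-element tuple and the helper name `unpack_kv` below are hypothetical and only illustrate the mechanism.

```python
def unpack_kv(kv_cache):
    """Mimic the two-name unpacking used in the failing line (hypothetical helper)."""
    cache_k, cache_v = kv_cache
    return cache_k, cache_v


# A two-element cache unpacks without trouble.
ok = unpack_kv(("k", "v"))

# Anything with more than two elements raises ValueError.
try:
    unpack_kv(("k", "v", "extra"))  # hypothetical shape, not taken from the trace
    message = None
except ValueError as exc:
    message = str(exc)

print(ok)
print(message)
```

An error like this usually means the caller and the callee disagree about the cache layout, which is consistent with a library version mismatch, although the trace alone does not prove it.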
Downgrade transformers to 4.40.2, and please install strictly according to the requirements file.
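Thanks for the answer. After downgrading to transformers==4.40.2 the example runs normally.
I started directly from ModelScope and did not see a requirements.txt there. I suggest adding one there as well, or at least a link in the README, to avoid unnecessary version-related trouble.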
https://github.com/THUDM/GLM-4/blob/main/basic_demo/requirements.txt
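A quick way to spot this kind of mismatch before running the demo is to compare the installed versions with the ones reported as working. The sketch below uses only the standard library. The two versions in `REPORTED_WORKING` come from this thread (transformers 4.40.2, tiktoken 0.7.0) and are not copied from requirements.txt, which remains the authoritative list; the function name `find_mismatches` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions reported as working in this thread; check requirements.txt for the real pins.
REPORTED_WORKING = {"transformers": "4.40.2", "tiktoken": "0.7.0"}


def find_mismatches(expected, get_version=version):
    """Return {package: (installed_or_None, expected)} for every package that differs."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (installed, wanted)
    return mismatches


if __name__ == "__main__":
    for name, (installed, wanted) in find_mismatches(REPORTED_WORKING).items():
        print(f"{name}: installed={installed}, reported working={wanted}")
```

An exact comparison is used deliberately: the thread shows that both an older tiktoken and a newer transformers break the demo, so neither "at least" nor "at most" is a safe assumption.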
System Info
transformers 4.42.3, modelscope 1.16.0, tiktoken 0.4.0, Python 3.10.11, Build cuda_11.3.r11.3/compiler.29745058_0
Who can help?
No response
Information
Reproduction
```python
import torch
from PIL import Image
from modelscope import AutoModelForCausalLM, AutoTokenizer

device = "cuda"

tokenizer = AutoTokenizer.from_pretrained("ZhipuAI/glm-4v-9b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "ZhipuAI/glm-4v-9b",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
    device_map="auto"
)
model = model.eval()

query = '描述这张图片'  # "Describe this image"
image = Image.open("kobe.jpeg").convert('RGB')
inputs = tokenizer.apply_chat_template([{"role": "user", "image": image, "content": query}],
                                       add_generation_prompt=True, tokenize=True, return_tensors="pt",
                                       return_dict=True)  # chat mode

inputs = inputs.to(device)

gen_kwargs = {"max_length": 2500, "do_sample": True, "top_k": 1}
with torch.no_grad():
    # Unpack both dicts as keyword arguments
    outputs = model.generate(**inputs, **gen_kwargs)
    # Drop the prompt tokens and keep only the generated continuation
    outputs = outputs[:, inputs['input_ids'].shape[1]:]
    print(tokenizer.decode(outputs[0]))
```
The code is taken from https://modelscope.cn/models/ZhipuAI/glm-4v-9b
Error message:
index out of bounds: the len is 1 but the index is 1
File "/home/xxx/.conda/envs/llm310/lib/python3.10/site-packages/tiktoken/core.py", line 120, in encode
return self._core_bpe.encode(text, allowed_special)
File "/home/xxx/.cache/huggingface/modules/transformers_modules/glm-4v-9b/tokenization_chatglm.py", line 140, in build_single_message
role_tokens = [self.convert_tokens_to_ids(f"<|{role}|>")] + self.tokenizer.encode(f"{metadata}\n",
File "/home/xxx/.cache/huggingface/modules/transformers_modules/glm-4v-9b/tokenization_chatglm.py", line 216, in handle_single_conversation
input = self.build_single_message(
File "/home/xxx/.cache/huggingface/modules/transformers_modules/glm-4v-9b/tokenization_chatglm.py", line 236, in apply_chat_template
result = handle_single_conversation(conversation)
File "/mnt1/works/mm-llm/glm-4v-9b-Demo/test.py", line 22, in
inputs = tokenizer.apply_chat_template([{"role": "user", "image": image, "content": query}],
File "/home/xxx/.conda/envs/llm310/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/xxx/.conda/envs/llm310/lib/python3.10/runpy.py", line 196, in _run_module_as_main (Current frame)
return _run_code(code, main_globals, None,
pyo3_runtime.PanicException: index out of bounds: the len is 1 but the index is 1
Expected behavior
I expect the example above to run correctly.