It looks like the transformers version does not match; this model requires 4.36. Which version are you using? Please change to the right version.
pip list | grep transformers shows we are working on transformers 4.37.0. I will try 4.36. Thanks a lot for your quick reply.
After pip install transformers==4.36.0 or 4.36.2, I got the same error, no change at all.
I just tried this model and got a different error:
2024-06-24 14:13:06,027 - INFO - Converting the current model to sym_int4 format......
torch.Size([3, 448, 448])
Traceback (most recent call last):
File "C:\Users\arda\xin\test.py", line 21, in <module>
res, context, _ = model.chat(
^^^^^^^^^^^
File "C:\Users\arda\.cache\huggingface\modules\transformers_modules\MiniCPM-V\modeling_minicpmv.py", line 279, in chat
res, vision_hidden_states = self.generate(
^^^^^^^^^^^^^^
File "C:\Users\arda\.cache\huggingface\modules\transformers_modules\MiniCPM-V\modeling_minicpmv.py", line 230, in generate
model_inputs['inputs_embeds'], vision_hidden_states = self.get_vllm_embedding(model_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\arda\.cache\huggingface\modules\transformers_modules\MiniCPM-V\modeling_minicpmv.py", line 87, in get_vllm_embedding
vision_hidden_states.append(self.get_vision_embedding(pixel_values))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\arda\.cache\huggingface\modules\transformers_modules\MiniCPM-V\modeling_minicpmv.py", line 75, in get_vision_embedding
vision_embedding = self.vpm.forward_features(pixel_value.unsqueeze(0).type(dtype))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\arda\miniforge3\envs\xin-llm\Lib\site-packages\timm\models\vision_transformer.py", line 663, in forward_features
x = self._pos_embed(x)
^^^^^^^^^^^^^^^^^^
File "C:\Users\arda\miniforge3\envs\xin-llm\Lib\site-packages\timm\models\vision_transformer.py", line 582, in _pos_embed
pos_embed = resample_abs_pos_embed(
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\arda\miniforge3\envs\xin-llm\Lib\site-packages\timm\layers\pos_embed.py", line 46, in resample_abs_pos_embed
posemb = F.interpolate(posemb, size=new_size, mode=interpolation, antialias=antialias)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\arda\miniforge3\envs\xin-llm\Lib\site-packages\torch\nn\functional.py", line 4027, in interpolate
return torch._C._nn._upsample_bicubic2d_aa(input, output_size, align_corners, scale_factors)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NotImplementedError: Could not run 'aten::_upsample_bicubic2d_aa.out' with arguments from the 'XPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::_upsample_bicubic2d_aa.out' is only available for these backends: [CPU, Meta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastXPU, AutocastCUDA, FuncTorchBatched, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].
It shows timm couldn't run on XPU.
Patching timm/layers/pos_embed.py as follows can fix your problem:

posemb = F.interpolate(posemb, size=new_size, mode=interpolation, antialias=antialias)
->
posemb = F.interpolate(posemb.to("cpu"), size=new_size, mode=interpolation, antialias=antialias).to(posemb.device)
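If you'd rather not edit the installed package, the same workaround can be applied from your own script as a monkey-patch before running inference (a sketch; timm's internal module layout may differ across versions, so the two names rebound below are assumptions based on the traceback):

import timm.layers.pos_embed
import timm.models.vision_transformer

_orig_resample = timm.layers.pos_embed.resample_abs_pos_embed

def _resample_on_cpu(posemb, *args, **kwargs):
    # aten::_upsample_bicubic2d_aa has no XPU kernel, so run the
    # resampling on CPU and move the result back to the original device
    device = posemb.device
    return _orig_resample(posemb.to("cpu"), *args, **kwargs).to(device)

timm.layers.pos_embed.resample_abs_pos_embed = _resample_on_cpu
# vision_transformer imports the function into its own namespace,
# so rebind that reference as well
timm.models.vision_transformer.resample_abs_pos_embed = _resample_on_cpu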
Yes, you can follow https://github.com/intel-analytics/ipex-llm/issues/10470. I just ran the generation successfully.
Would you please help check the version of bigdl-llm? I just tried bigdl-llm==2.4.0, and it does NOT work either. In my env, $ pip list | grep bigdl shows as follows:
bigdl-core-xe-21           2.5.0b20240610
bigdl-core-xe-addons-21    2.5.0b20240610
bigdl-core-xe-batch-21     2.5.0b20240610
bigdl-core-xe-esimd-21     2.5.0b20240423
bigdl-llm                  2.4.0
I checked https://github.com/intel-analytics/ipex-llm/issues/10470; it uses bigdl-llm==2.5.0b20240318. I will give it a try.
Can I do it with ipex-llm instead of bigdl-llm? Because I find I have NOT installed bigdl-llm.
Please don't use bigdl-llm; bigdl-llm has now become ipex-llm (see the migration guide here).
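For reference, the XPU build is typically installed with a command like the one below (an assumption based on the ipex-llm docs; check the migration guide for the exact current form):

pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://developer.intel.com/ipex-whl-stable-xpu

My Python script: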
import torch
from PIL import Image
from ipex_llm.transformers import AutoModel
from transformers import AutoTokenizer
path = "D:\\llm-models\\MiniCPM-V"
print(path)
# load_in_4bit=True converts the language model weights to sym_int4;
# modules_to_not_convert keeps the vision tower ("vpm") and the
# resampler unquantized
model = AutoModel.from_pretrained(path,
                                  load_in_4bit=True,
                                  optimize_model=False,
                                  trust_remote_code=True,
                                  modules_to_not_convert=["vpm", "resampler"],
                                  use_cache=True)
# move the model to the Intel GPU
model = model.float().to(device='xpu')
tokenizer = AutoTokenizer.from_pretrained(path,
trust_remote_code=True)
model.eval()
image = Image.open("C:\\Users\\arda\\Desktop\\tiger.jpeg").convert('RGB')
question = 'What is in the image?'
msgs = [{'role': 'user', 'content': question}]
res, context, _ = model.chat(
image=image,
msgs=msgs,
context=None,
tokenizer=tokenizer,
sampling=True,
temperature=0.7
)
print(res)
I want to convert the model to int4. Any update?
model = AutoModel.from_pretrained(path,
                                  load_in_4bit=True,
                                  optimize_model=False,
                                  trust_remote_code=True,
                                  modules_to_not_convert=["vpm", "resampler"],
                                  use_cache=True)
With the above code, it's already loaded in int4.
@lei-sun-intel Your error message is thrown during generation. The model should already be saved.
Traceback (most recent call last):
File "/home/test10/ipex-llm/python/llm/example/GPU/ModelScope-Models/Save-Load/./generate.py", line 66, in
output = model.generate(input_ids,
File "/home/test10/.cache/huggingface/modules/transformers_modules/MiniCPM-V/modeling_minicpmv.py", line 211, in generate
model_inputs = self._process_list(tokenizer, data_list, max_inp_length)
File "/home/test10/.cache/huggingface/modules/transformers_modules/MiniCPM-V/modeling_minicpmv.py", line 167, in _process_list
input_tensors.append(self._convert_to_tensors(tokenizer, data, max_inp_length))
File "/home/test10/.cache/huggingface/modules/transformers_modules/MiniCPM-V/modeling_minicpmv.py", line 138, in _convert_to_tensors
if tokenizer.add_bos_token:
AttributeError: 'NoneType' object has no attribute 'add_bos_token'
The tokenizer has no attribute add_bos_token; it's an error from a transformers version mismatch. But you said you have installed the right version, so I think something is wrong in your environment.
My suggestion: you can create a new conda environment and try my code above.
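For example, a sketch of the setup (the env name and Python version here are just illustrative):

conda create -n ipex-llm python=3.9
conda activate ipex-llm

Then install ipex-llm and transformers==4.36 as above, and run the script from my earlier comment.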
path = "./models/OpenBMB/MiniCPM-V" save_path = "./models/OpenBMB/MiniCPM-V-int4"
model = AutoModel.from_pretrained(path, load_in_4bit=True, optimize_model=False, trust_remote_code=True, modules_to_not_convert=["vpm", "resampler"], use_cache=True) model = model.float().to(device='xpu') tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True) model.eval()
model.save_low_bit(save_path) tokenizer.save_pretrained(save_path) print(f"Model and tokenizer are saved to {save_path}")
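To load the saved low-bit model back later, something like the following should work (a sketch, assuming ipex-llm's load_low_bit API):

# load the sym_int4 checkpoint directly, skipping the fp load-and-convert step
model = AutoModel.load_low_bit(save_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(save_path, trust_remote_code=True)
model = model.float().to(device='xpu')
model.eval()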
Finally, the above code fixed the problem. Thanks a lot!
2024-06-21 16:26:41,934 - INFO - intel_extension_for_pytorch auto imported
2024-06-21 16:26:42,721 - INFO - Note: NumExpr detected 22 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2024-06-21 16:26:42,721 - INFO - NumExpr defaulting to 8 threads.
2024-06-21 16:26:43,577 - modelscope - INFO - PyTorch version 2.1.0.post0+cxx11.abi Found.
2024-06-21 16:26:43,577 - modelscope - INFO - Loading ast index from /home/test10/.cache/modelscope/ast_indexer
2024-06-21 16:26:43,649 - modelscope - INFO - Loading done! Current index file version is 1.11.0, with md5 a8fc9e89d1b75747da2b763359929bfe and a total number of 953 components indexed
Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00, 8.98it/s]
2024-06-21 16:26:44,791 - INFO - Converting the current model to sym_int4 format......
WARNING: Ignoring invalid distribution -orch (/home/test10/.miniconda_dev_zone/envs/notebook-zone/lib/python3.9/site-packages)
Model and tokenizer are saved to ./models/OpenBMB/MiniCPM-V-int4
Traceback (most recent call last):
File "/home/test10/ipex-llm/python/llm/example/GPU/ModelScope-Models/Save-Load/./generate.py", line 66, in
output = model.generate(input_ids,
File "/home/test10/.cache/huggingface/modules/transformers_modules/MiniCPM-V/modeling_minicpmv.py", line 211, in generate
model_inputs = self._process_list(tokenizer, data_list, max_inp_length)
File "/home/test10/.cache/huggingface/modules/transformers_modules/MiniCPM-V/modeling_minicpmv.py", line 167, in _process_list
input_tensors.append(self._convert_to_tensors(tokenizer, data, max_inp_length))
File "/home/test10/.cache/huggingface/modules/transformers_modules/MiniCPM-V/modeling_minicpmv.py", line 138, in _convert_to_tensors
if tokenizer.add_bos_token:
AttributeError: 'NoneType' object has no attribute 'add_bos_token'
Python 3.9.19
Model: MiniCPM-V (https://modelscope.cn/models/OpenBMB/MiniCPM-V)