intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, Axolotl, etc.
Apache License 2.0

When converting the MiniCPM-V model from ModelScope to low-bit, got error: AttributeError: 'NoneType' object has no attribute 'add_bos_token' #11390

Closed: lei-sun-intel closed this issue 1 week ago

lei-sun-intel commented 3 weeks ago
  1. Download the MiniCPM-V model from ModelScope (a minimal download sketch is shown after this list).
  2. Convert the model to low-bit with the command from GPU/ModelScope-Models/Save-Load: python ./generate.py --repo-id-or-model-path ./models/OpenBMB/MiniCPM-V --save-path ./models/OpenBMB/MiniCPM-V-int4
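
For reference, a minimal sketch of the download in step 1, assuming the modelscope Python package is installed and ./models is the target directory (the exact download method used originally is not shown in this thread):

# Hypothetical download step for illustration; adjust cache_dir so the result
# matches the --repo-id-or-model-path passed to generate.py.
from modelscope import snapshot_download

model_dir = snapshot_download('OpenBMB/MiniCPM-V', cache_dir='./models')
print(model_dir)  # e.g. ./models/OpenBMB/MiniCPM-V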

2024-06-21 16:26:41,934 - INFO - intel_extension_for_pytorch auto imported
2024-06-21 16:26:42,721 - INFO - Note: NumExpr detected 22 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2024-06-21 16:26:42,721 - INFO - NumExpr defaulting to 8 threads.
2024-06-21 16:26:43,577 - modelscope - INFO - PyTorch version 2.1.0.post0+cxx11.abi Found.
2024-06-21 16:26:43,577 - modelscope - INFO - Loading ast index from /home/test10/.cache/modelscope/ast_indexer
2024-06-21 16:26:43,649 - modelscope - INFO - Loading done! Current index file version is 1.11.0, with md5 a8fc9e89d1b75747da2b763359929bfe and a total number of 953 components indexed
Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00, 8.98it/s]
2024-06-21 16:26:44,791 - INFO - Converting the current model to sym_int4 format......
WARNING: Ignoring invalid distribution -orch (/home/test10/.miniconda_dev_zone/envs/notebook-zone/lib/python3.9/site-packages)
Model and tokenizer are saved to ./models/OpenBMB/MiniCPM-V-int4
Traceback (most recent call last):
  File "/home/test10/ipex-llm/python/llm/example/GPU/ModelScope-Models/Save-Load/./generate.py", line 66, in <module>
    output = model.generate(input_ids,
  File "/home/test10/.cache/huggingface/modules/transformers_modules/MiniCPM-V/modeling_minicpmv.py", line 211, in generate
    model_inputs = self._process_list(tokenizer, data_list, max_inp_length)
  File "/home/test10/.cache/huggingface/modules/transformers_modules/MiniCPM-V/modeling_minicpmv.py", line 167, in _process_list
    input_tensors.append(self._convert_to_tensors(tokenizer, data, max_inp_length))
  File "/home/test10/.cache/huggingface/modules/transformers_modules/MiniCPM-V/modeling_minicpmv.py", line 138, in _convert_to_tensors
    if tokenizer.add_bos_token:
AttributeError: 'NoneType' object has no attribute 'add_bos_token'

Python 3.9.19
Model: MiniCPM-V https://modelscope.cn/models/OpenBMB/MiniCPM-V

qiuxin2012 commented 3 weeks ago

It looks like the transformers version does not match; this model requires 4.36. Which version are you using? Please change to the right version.
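
A quick way to confirm which transformers version the interpreter actually imports (a generic check, not specific to ipex-llm):

# Print the version and install location of transformers to rule out a stale
# or mixed environment; 4.36.x is expected for this MiniCPM-V example.
import transformers
print(transformers.__version__)
print(transformers.__file__)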

lei-sun-intel commented 3 weeks ago

pip list | grep transformers shows we are on transformers 4.37.0; I will try 4.36. Thanks a lot for your quick reply.

lei-sun-intel commented 3 weeks ago

After pip install transformers==4.36.0 (and 4.36.2), I got the same error, no change at all.

qiuxin2012 commented 3 weeks ago

I just tried this model and got a different error:

2024-06-24 14:13:06,027 - INFO - Converting the current model to sym_int4 format......
torch.Size([3, 448, 448])
Traceback (most recent call last):
  File "C:\Users\arda\xin\test.py", line 21, in <module>
    res, context, _ = model.chat(
                      ^^^^^^^^^^^
  File "C:\Users\arda\.cache\huggingface\modules\transformers_modules\MiniCPM-V\modeling_minicpmv.py", line 279, in chat
    res, vision_hidden_states = self.generate(
                                ^^^^^^^^^^^^^^
  File "C:\Users\arda\.cache\huggingface\modules\transformers_modules\MiniCPM-V\modeling_minicpmv.py", line 230, in generate
    model_inputs['inputs_embeds'], vision_hidden_states = self.get_vllm_embedding(model_inputs)
                                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\arda\.cache\huggingface\modules\transformers_modules\MiniCPM-V\modeling_minicpmv.py", line 87, in get_vllm_embedding
    vision_hidden_states.append(self.get_vision_embedding(pixel_values))
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\arda\.cache\huggingface\modules\transformers_modules\MiniCPM-V\modeling_minicpmv.py", line 75, in get_vision_embedding
    vision_embedding = self.vpm.forward_features(pixel_value.unsqueeze(0).type(dtype))
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\arda\miniforge3\envs\xin-llm\Lib\site-packages\timm\models\vision_transformer.py", line 663, in forward_features
    x = self._pos_embed(x)
        ^^^^^^^^^^^^^^^^^^
  File "C:\Users\arda\miniforge3\envs\xin-llm\Lib\site-packages\timm\models\vision_transformer.py", line 582, in _pos_embed
    pos_embed = resample_abs_pos_embed(
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\arda\miniforge3\envs\xin-llm\Lib\site-packages\timm\layers\pos_embed.py", line 46, in resample_abs_pos_embed
    posemb = F.interpolate(posemb, size=new_size, mode=interpolation, antialias=antialias)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\arda\miniforge3\envs\xin-llm\Lib\site-packages\torch\nn\functional.py", line 4027, in interpolate
    return torch._C._nn._upsample_bicubic2d_aa(input, output_size, align_corners, scale_factors)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NotImplementedError: Could not run 'aten::_upsample_bicubic2d_aa.out' with arguments from the 'XPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::_upsample_bicubic2d_aa.out' is only available for these backends: [CPU, Meta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastXPU, AutocastCUDA, FuncTorchBatched, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

It shows timm couldn't run on XPU.

lei-sun-intel commented 3 weeks ago

Patching the file as follows can fix your problem:

posemb = F.interpolate(posemb, size=new_size, mode=interpolation, antialias=antialias)

->

posemb = F.interpolate(posemb.to("cpu"), size=new_size, mode=interpolation, antialias=antialias).to(posemb.device)
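
For clarity, the same workaround written out with comments (this edits timm/layers/pos_embed.py inside resample_abs_pos_embed; posemb, new_size, interpolation and antialias are that function's local variables):

# aten::_upsample_bicubic2d_aa has no XPU kernel, so run the anti-aliased
# interpolation on CPU and move the result back to the original device.
orig_device = posemb.device
posemb = F.interpolate(
    posemb.to("cpu"), size=new_size, mode=interpolation, antialias=antialias
).to(orig_device)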

qiuxin2012 commented 3 weeks ago

Yes, you can follow https://github.com/intel-analytics/ipex-llm/issues/10470. I just ran the generation successfully.

lei-sun-intel commented 3 weeks ago

Would you please help check the version of bigdl-llm? I just tried bigdl-llm==2.4.0, and it does NOT work either. In my env, pip list | grep bigdl shows the following:

bigdl-core-xe-21           2.5.0b20240610
bigdl-core-xe-addons-21    2.5.0b20240610
bigdl-core-xe-batch-21     2.5.0b20240610
bigdl-core-xe-esimd-21     2.5.0b20240423
bigdl-llm                  2.4.0

lei-sun-intel commented 3 weeks ago

I checked https://github.com/intel-analytics/ipex-llm/issues/10470; it uses bigdl-llm==2.5.0b20240318. I will give it a try.

lei-sun-intel commented 3 weeks ago

Can I do it with ipex-llm instead of bigdl-llm? I find that I have NOT installed bigdl-llm.

qiuxin2012 commented 2 weeks ago

Please don't use bigdl-llm; bigdl-llm has now become ipex-llm (see the migration guide here). My Python script:

import torch
from PIL import Image
from ipex_llm.transformers import AutoModel
from transformers import AutoTokenizer
path = "D:\\llm-models\\MiniCPM-V"
print(path)
model = AutoModel.from_pretrained(path, 
                                  load_in_4bit=True,
                                  optimize_model=False,
                                  trust_remote_code=True,
                                  modules_to_not_convert=["vpm", "resampler"],
                                  use_cache=True)
model = model.float().to(device='xpu')
tokenizer = AutoTokenizer.from_pretrained(path,
                                          trust_remote_code=True)
model.eval()

image = Image.open("C:\\Users\\arda\\Desktop\\tiger.jpeg").convert('RGB')
question = 'What is in the image?'
msgs = [{'role': 'user', 'content': question}]

res, context, _ = model.chat(
    image=image,
    msgs=msgs,
    context=None,
    tokenizer=tokenizer,
    sampling=True,
    temperature=0.7
)
print(res)

lei-sun-intel commented 2 weeks ago

I want to convert the model to int4. Any update?

qiuxin2012 commented 2 weeks ago

model = AutoModel.from_pretrained(path, 
                                  load_in_4bit=True,
                                  optimize_model=False,
                                  trust_remote_code=True,
                                  modules_to_not_convert=["vpm", "resampler"],
                                  use_cache=True)

With the above code, the model is already loaded in int4.
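
If the goal is to also persist those int4 weights so the conversion does not have to be repeated, a minimal sketch (assuming model and tokenizer were created by the code above; save_path is an example location):

# Save the already-converted int4 weights plus the tokenizer; they can be
# reloaded later without redoing the 4-bit conversion.
save_path = "./models/OpenBMB/MiniCPM-V-int4"
model.save_low_bit(save_path)
tokenizer.save_pretrained(save_path)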

qiuxin2012 commented 1 week ago

@lei-sun-intel Your error message is thrown during generation; the model should already be saved.

Traceback (most recent call last):
  File "/home/test10/ipex-llm/python/llm/example/GPU/ModelScope-Models/Save-Load/./generate.py", line 66, in <module>
    output = model.generate(input_ids,
  File "/home/test10/.cache/huggingface/modules/transformers_modules/MiniCPM-V/modeling_minicpmv.py", line 211, in generate
    model_inputs = self._process_list(tokenizer, data_list, max_inp_length)
  File "/home/test10/.cache/huggingface/modules/transformers_modules/MiniCPM-V/modeling_minicpmv.py", line 167, in _process_list
    input_tensors.append(self._convert_to_tensors(tokenizer, data, max_inp_length))
  File "/home/test10/.cache/huggingface/modules/transformers_modules/MiniCPM-V/modeling_minicpmv.py", line 138, in _convert_to_tensors
    if tokenizer.add_bos_token:
AttributeError: 'NoneType' object has no attribute 'add_bos_token'

The tokenizer has no add_bos_token attribute, which is an error caused by a transformers version mismatch. But you said you have installed the right version, so I think something is wrong in your environment. My suggestion: create a new conda environment and try my code above.
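
A possible clean-environment sequence, assuming a Linux machine with an Intel GPU (the exact index URL and package versions should be taken from the ipex-llm installation guide):

# Create a fresh environment and install ipex-llm with XPU support, plus the
# transformers/timm versions this MiniCPM-V example expects.
conda create -n ipex-llm-test python=3.9 -y
conda activate ipex-llm-test
pip install --pre --upgrade ipex-llm[xpu]  # add the extra index URL from the installation guide
pip install transformers==4.36.2 timm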

lei-sun-intel commented 1 week ago

path = "./models/OpenBMB/MiniCPM-V"
save_path = "./models/OpenBMB/MiniCPM-V-int4"

model = AutoModel.from_pretrained(path,
                                  load_in_4bit=True,
                                  optimize_model=False,
                                  trust_remote_code=True,
                                  modules_to_not_convert=["vpm", "resampler"],
                                  use_cache=True)
model = model.float().to(device='xpu')
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model.eval()

model.save_low_bit(save_path)
tokenizer.save_pretrained(save_path)
print(f"Model and tokenizer are saved to {save_path}")

Finally, the above code fixed the problem. Thanks a lot!
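
As a follow-up, a hedged sketch of reloading that saved int4 checkpoint later, mirroring the Save-Load example (AutoModel.load_low_bit is assumed to behave like the other ipex-llm save/load examples):

# Reload the saved int4 model and tokenizer directly, skipping the original
# checkpoint and the 4-bit conversion step.
from ipex_llm.transformers import AutoModel
from transformers import AutoTokenizer

save_path = "./models/OpenBMB/MiniCPM-V-int4"
model = AutoModel.load_low_bit(save_path, trust_remote_code=True)
model = model.float().to('xpu')
model.eval()
tokenizer = AutoTokenizer.from_pretrained(save_path, trust_remote_code=True)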