intel / intel-extension-for-pytorch

A Python package extending the official PyTorch to easily obtain performance on Intel platforms

Unsupported input type, fallback to the origin model #534

Closed: akarX23 closed this issue 1 month ago

akarX23 commented 7 months ago

Describe the issue

I am trying to run "meta-llama/llama-2-7b-chat-hf" with the llm-on-ray framework, but I am getting the following output:

(ServeReplica:router:PredictorDeployment pid=3040030) /home/develop/.anaconda3/envs/llm-on-ray/lib/python3.9/site-packages/intel_extension_for_pytorch/transformers/models/reference/modules/attentions.py:962: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
(ServeReplica:router:PredictorDeployment pid=3040030)   + torch.tensor(combined_attention_mask)
(ServeReplica:router:PredictorDeployment pid=3040030) /home/develop/.anaconda3/envs/llm-on-ray/lib/python3.9/site-packages/intel_extension_for_pytorch/transformers/optimize.py:683: UserWarning: fail to apply optimize_transformers due to: Unsupported input type, fallback to the origin model

I think llama-2-7b-chat-hf should be supported for optimizations. These are the versions I am using:

Kindly assist in solving this issue. Thank you!
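(For context, here is a minimal sketch of how the ipex transformers optimization that emits this warning is typically applied to a Hugging Face model. The entry-point name and arguments are assumptions based on recent ipex releases, not code taken from llm-on-ray: 2.1.x exposes ipex.optimize_transformers, 2.2+ exposes ipex.llm.optimize.)

```python
# Hedged sketch, not the llm-on-ray code path: apply ipex's transformers
# optimization directly and watch for the same warning.
import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf", torch_dtype=torch.bfloat16
)
model.eval()

# ipex 2.1.x: ipex.optimize_transformers(model, dtype=torch.bfloat16)
# ipex 2.2+:  ipex.llm.optimize(model, dtype=torch.bfloat16)
# If the model or its inputs are not recognized, ipex warns
# "Unsupported input type, fallback to the origin model" and returns the
# original, unoptimized model instead of raising an error.
model = ipex.llm.optimize(model, dtype=torch.bfloat16)
```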

Vasud-ha commented 7 months ago

Hi @akarX23, I will reproduce the issue and get back to you.

akarX23 commented 7 months ago

Sure, thank you for your time and support @Vasud-ha

Vasud-ha commented 7 months ago

Hi @akarX23, could you update transformers to >=4.35.0? Refer to https://github.com/intel/llm-on-ray/tree/main

akarX23 commented 7 months ago

Hi @Vasud-ha , the reason I am using transformers 4.31 is that bigdl-llm[all] requires transformers 4.31. This is the output of pip install ".[cpu,bigdl-cpu]":

134.0 The conflict is caused by:
134.0     llm-on-ray[bigdl-cpu,cpu] 0.0.1 depends on transformers>=4.35.0; extra == "cpu"
134.0     bigdl-llm[all] 2.5.0b20240222 depends on transformers==4.31.0; extra == "all"

I would have to use either ipex or bigdl-llm, one at a time.
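(As an aside, the conflict shown in the pip output above can be checked mechanically. The snippet below is only an illustration using the two requirement strings quoted from the resolver output; it is not part of either project.)

```python
# Illustration only: show that no single transformers version can satisfy
# both requirement strings from the pip output above.
from packaging.requirements import Requirement
from packaging.version import Version

pins = [
    Requirement("transformers>=4.35.0"),  # llm-on-ray[cpu]
    Requirement("transformers==4.31.0"),  # bigdl-llm[all]
]

for candidate in ("4.31.0", "4.35.0", "4.37.2"):
    ok = all(Version(candidate) in pin.specifier for pin in pins)
    print(candidate, "satisfies both" if ok else "conflicts")
```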

akarX23 commented 7 months ago

Can you suggest a version of bigdl-llm compatible with transformers 4.35 or 4.37?

Vasud-ha commented 7 months ago

Could you try installing bigdl-llm with this command: pip install --pre --upgrade bigdl-llm[xpu_2.1] -f https://developer.intel.com/ipex-whl-stable-xpu, together with transformers 4.35 or newer?

xwu99 commented 7 months ago

@akarX23 you need to update torch/ipex to 2.2 and transformers to 4.35.2; check this: https://github.com/intel/llm-on-ray/pull/143
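(A quick way to confirm an environment actually matches these suggestions; the expected versions are just the ones mentioned in this comment.)

```python
# Print installed versions of the packages the linked PR expects to be upgraded.
from importlib.metadata import version

for pkg in ("torch", "intel-extension-for-pytorch", "transformers"):
    print(pkg, version(pkg))  # expecting torch/ipex 2.2.x and transformers 4.35.2
```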

akarX23 commented 7 months ago

@Vasud-ha @xwu99, I will try these suggestions soon. For now I have shifted to OVMS with ITREX, and the performance is pretty good. I will update you with what I find.

ZailiWang commented 1 month ago

Hi @akarX23, are you still working on llm-on-ray, and does the issue persist? Thanks.

akarX23 commented 1 month ago

Hi @ZailiWang , work on llm-on-ray has currently stopped, and I have not checked whether the issue persists. We have customers who will be trying out llm-on-ray soon; if anything comes up I will raise the issue again. Thank you for the help!