intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, Axolotl, etc.
Apache License 2.0

Nano: invalid dimensions when accelerating with ONNX #4658

Open rnwang04 opened 2 years ago

rnwang04 commented 2 years ago

My calibration dataloader has batch size 1. After I quantize my model with

model_int8 = trainer.quantize(model, accelerator='onnxruntime',
                             calib_dataloader=train_dl, method='integer')

I get an error when I run model_int8 on a dataloader with batch size 16. There is only one input x in the batch, and I found this error is caused by this line: https://github.com/intel-analytics/BigDL/blob/f682c06c54f7bf4afffd4ce10789bee5564bc9c9/python/nano/src/bigdl/nano/utils/inference/pytorch/model_utils.py#L25 .

TheaperDeng commented 2 years ago

We talked about this offline; we will enhance the get_forward_args function.

rnwang04 commented 2 years ago

Discussed with @TheaperDeng. Basic solution: change https://github.com/intel-analytics/BigDL/blob/25ad426630e913d615188ca4fee670aa9521fd79/python/nano/src/bigdl/nano/utils/inference/pytorch/model_utils.py#L25 to

forward_args = inspect.getfullargspec(model.forward).args
# getfullargspec unwraps bound methods to the underlying function,
# so the implicit 'self' parameter shows up first and must be dropped
if forward_args[0] == 'self':
    forward_args = forward_args[1:]