huggingface / optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools
https://huggingface.co/docs/optimum/main/
Apache License 2.0

chatglm model type is not supported yet in NormalizedConfig. #1282

Open stingoChen opened 11 months ago

stingoChen commented 11 months ago

Feature request

I tried to load the chatglm2 ONNX model, but got the following error: KeyError: 'chatglm model type is not supported yet in NormalizedConfig. Only albert, bart, bert, blenderbot, blenderbot_small, bloom, camembert, codegen, cvt, deberta, deberta-v2, deit, distilbert, donut-swin, electra, gpt2, gpt-bigcode, gpt_neo, gpt_neox, llama, gptj, imagegpt, longt5, marian, mbart, mt5, m2m_100, nystromformer, opt, pegasus, pix2struct, poolformer, regnet, resnet, roberta, speech_to_text, splinter, t5, trocr, whisper, vision-encoder-decoder, vit, xlm-roberta, yolos, mpt, gpt_bigcode are supported. If you want to support chatglm please propose a PR or open up an issue.'
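For context, this KeyError comes from Optimum's NormalizedConfigManager, which maps a model_type string to a NormalizedConfig class describing where to find attributes such as the number of layers and attention heads. A possible stop-gap, sketched below, is to register an entry for "chatglm" in that (private) registry before loading the model. The attribute names used here are assumptions based on ChatGLM2's config.json and should be checked; this only addresses the KeyError, not full ChatGLM support in Optimum.

# Rough sketch, not an official Optimum API: register a NormalizedConfig for chatglm.
# The right-hand attribute names are assumptions about ChatGLM2's config.json.
from optimum.utils import NormalizedConfigManager, NormalizedTextConfig

NormalizedConfigManager._conf["chatglm"] = NormalizedTextConfig.with_args(
    num_layers="num_layers",                    # assumption: ChatGLM2 stores the layer count as `num_layers`
    num_attention_heads="num_attention_heads",
    hidden_size="hidden_size",
    vocab_size="padded_vocab_size",             # assumption: ChatGLM2 uses `padded_vocab_size`
)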

Motivation

Load the chatglm2 ONNX model with Optimum's ONNX Runtime classes.

Your contribution

NO

stingoChen commented 11 months ago

from transformers import AutoTokenizer, AutoConfig
from optimum.onnxruntime import ORTModelForCausalLM
from optimum.onnxruntime import ORTModel
from optimum.onnxruntime import ORTModelForCustomTasks
import torch

model_id = '/data1/wxd/HuangYuxuan/Projects/ChatGLM2-6B/model'

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
inputs = tokenizer("hello", return_tensors="pt")

config = AutoConfig.from_pretrained('THUDM/chatglm2-6b', trust_remote_code=True, cache_dir="/data1/wxd/HuangYuxuan/Projects/cxk")

model = ORTModelForCausalLM.from_pretrained(
    "./glm_onnx",
    config=config,
    provider="CUDAExecutionProvider",
)
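For reference, if a chatglm entry were registered in NormalizedConfigManager (see the sketch above) and the export in "./glm_onnx" loaded successfully, generation would presumably look roughly like the following; the device placement and max_new_tokens value are illustrative assumptions, not something from this issue.

# Assumes the workaround above was applied and the model loaded with CUDAExecutionProvider.
inputs = tokenizer("hello", return_tensors="pt").to("cuda")
output_ids = model.generate(**inputs, max_new_tokens=64)  # max_new_tokens is an arbitrary example value
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))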