huggingface / optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools
https://huggingface.co/docs/optimum/main/
Apache License 2.0

chatglm model type is not supported yet in NormalizedConfig. #1282

Open stingoChen opened 11 months ago

stingoChen commented 11 months ago

Feature request

I tried to load the chatglm2 ONNX model, but got the following error: KeyError: 'chatglm model type is not supported yet in NormalizedConfig. Only albert, bart, bert, blenderbot, blenderbot_small, bloom, camembert, codegen, cvt, deberta, deberta-v2, deit, distilbert, donut-swin, electra, gpt2, gpt-bigcode, gpt_neo, gpt_neox, llama, gptj, imagegpt, longt5, marian, mbart, mt5, m2m_100, nystromformer, opt, pegasus, pix2struct, poolformer, regnet, resnet, roberta, speech_to_text, splinter, t5, trocr, whisper, vision-encoder-decoder, vit, xlm-roberta, yolos, mpt, gpt_bigcode are supported. If you want to support chatglm please propose a PR or open up an issue.'
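For context, this KeyError comes from Optimum's NormalizedConfigManager, which maps a model_type string to a NormalizedConfig class describing where to find attributes such as the number of layers and attention heads. A possible stop-gap, sketched below, is to register an entry for "chatglm" in that (private) registry before loading the model. The attribute names used here are assumptions based on ChatGLM2's config.json and should be checked; this only addresses the KeyError, not full ChatGLM support in Optimum.

# Rough sketch, not an official Optimum API: register a NormalizedConfig for chatglm.
# The right-hand attribute names are assumptions about ChatGLM2's config.json.
from optimum.utils import NormalizedConfigManager, NormalizedTextConfig

NormalizedConfigManager._conf["chatglm"] = NormalizedTextConfig.with_args(
    num_layers="num_layers",                    # assumption: ChatGLM2 stores the layer count as `num_layers`
    num_attention_heads="num_attention_heads",
    hidden_size="hidden_size",
    vocab_size="padded_vocab_size",             # assumption: ChatGLM2 uses `padded_vocab_size`
)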

Motivation

Load the chatglm2 ONNX model with Optimum's ONNX Runtime classes.

Your contribution

NO

stingoChen commented 11 months ago

from transformers import AutoTokenizer, AutoConfig
from optimum.onnxruntime import ORTModelForCausalLM
from optimum.onnxruntime import ORTModel
from optimum.onnxruntime import ORTModelForCustomTasks
import torch

model_id = '/data1/wxd/HuangYuxuan/Projects/ChatGLM2-6B/model'

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
inputs = tokenizer("hello", return_tensors="pt")

config = AutoConfig.from_pretrained('THUDM/chatglm2-6b', trust_remote_code=True, cache_dir="/data1/wxd/HuangYuxuan/Projects/cxk")

model = ORTModelForCausalLM.from_pretrained(
    "./glm_onnx",
    config=config,
    provider="CUDAExecutionProvider",
)
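For reference, if a chatglm entry were registered in NormalizedConfigManager (see the sketch above) and the export in "./glm_onnx" loaded successfully, generation would presumably look roughly like the following; the device placement and max_new_tokens value are illustrative assumptions, not something from this issue.

# Assumes the workaround above was applied and the model loaded with CUDAExecutionProvider.
inputs = tokenizer("hello", return_tensors="pt").to("cuda")
output_ids = model.generate(**inputs, max_new_tokens=64)  # max_new_tokens is an arbitrary example value
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))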