Kailuo-Lai opened 1 year ago
Can you try:

```python
model = AutoModel.from_pretrained("./checkpoints/chatglm2-6b-32k/",
                                  load_in_low_bit="sym_int4",
                                  trust_remote_code=True,
                                  optimize_model=False)
```
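As an aside, `load_in_low_bit="sym_int4"` requests symmetric 4-bit weight quantization. A rough NumPy sketch of the idea (not bigdl-llm's actual kernel; this assumes a per-tensor scale and the common symmetric range [-7, 7]):

```python
import numpy as np

def sym_int4_quantize(w):
    """Symmetric INT4 quantization sketch: map float weights to integers
    in [-7, 7] with a single per-tensor scale (zero-point fixed at 0)."""
    scale = np.abs(w).max() / 7.0          # largest magnitude maps to +/-7
    q = np.clip(np.round(w / scale), -7, 7).astype(np.int8)
    return q, scale

def sym_int4_dequantize(q, scale):
    """Recover approximate float weights from the quantized integers."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 0.9], dtype=np.float32)
q, scale = sym_int4_quantize(w)
w_hat = sym_int4_dequantize(q, scale)
print(q)       # small integers in [-7, 7]
print(w_hat)   # approximate reconstruction of w
```

The quantization error per weight is bounded by half the scale, which is why 4-bit loading trades a little accuracy for a large memory saving.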
@jason-dai Thanks, it works. And will this solution affect the efficiency of llm?
Yes - when `optimize_model` is True, we will apply more aggressive model optimizations, but it is less stable; you can set it to False if you run into any issues. We'll take a look at how to enable it for chatglm2-6b-32k.
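The advice above (try the aggressive optimizations first, fall back to `optimize_model=False` if loading fails) can be sketched as a generic pattern. `load_model` here is a stand-in stub, not bigdl-llm's real API; it simulates the failure reported in this issue:

```python
def load_model(path, optimize_model):
    # Stand-in for AutoModel.from_pretrained(...). It raises for
    # chatglm2-6b-32k when the aggressive optimizations are enabled,
    # mimicking the behavior reported in this issue.
    if optimize_model and "chatglm2-6b-32k" in path:
        raise RuntimeError("optimization path not supported for this model")
    return {"path": path, "optimized": optimize_model}

def load_with_fallback(path):
    """Try the more aggressive optimizations first; on failure,
    retry with the stable optimize_model=False path."""
    try:
        return load_model(path, optimize_model=True)
    except RuntimeError:
        return load_model(path, optimize_model=False)

model = load_with_fallback("./checkpoints/chatglm2-6b-32k/")
print(model["optimized"])  # False: fell back to the stable path
```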
Ok, I see. Thank you!
Hi @Kailuo-Lai,
We have now enabled further model optimizations for the chatglm2-6b-32k model. Please wait for 2.4.0b20231016 (which will be released tomorrow) or a later version of bigdl-llm to run the following code:

```python
model = AutoModel.from_pretrained("THUDM/chatglm2-6b-32k",
                                  load_in_low_bit="sym_int4",
                                  trust_remote_code=True)
```
Thank you, I will try it later.
Code:
Output:
Env:
torch 2.0.1
bigdl-llm 2.4.0b20231007
transformers 4.31.0
langchain 0.0.248