
[BUG]: Colossal AI failed to load ChatGLM2 #5861

Open · hiprince opened this issue 4 days ago

hiprince commented 4 days ago

Is there an existing issue for this bug?

🐛 Describe the bug

I failed to run the ChatGLM2 model with ColossalAI 0.3.6.
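Condensed from my benchmark notebook, the failing path reduces to roughly the following sketch (the checkpoint path and the config values are placeholders, and the InferenceConfig fields elided as "(...)" in the traceback below are guesses):

```python
import torch
from transformers import AutoModel, AutoTokenizer
from colossalai.inference.config import InferenceConfig
from colossalai.inference.core.engine import InferenceEngine

model_id = "THUDM/chatglm2-6b"  # placeholder; any ChatGLM2 checkpoint hits the same lookup

# ChatGLM2 ships custom modeling code, hence trust_remote_code=True
model = AutoModel.from_pretrained(model_id, trust_remote_code=True).half().cuda()
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

inference_config = InferenceConfig(
    dtype="fp16",
    max_batch_size=8,      # illustrative values; the fields elided in the
    max_input_len=512,     # traceback's InferenceConfig call are guesses
    max_output_len=256,
    use_cuda_kernel=True,
)

# Fails before any generation: KeyError: 'nopadding_chatglm' in init_model()
engine = InferenceEngine(model, tokenizer, inference_config, verbose=False)
```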

The full backtrace:

```
KeyError                                  Traceback (most recent call last)
Cell In[4], line 112
    110 else:
    111     print('Skip launch colossalai')
--> 112 benchmark_inference(
    113     model_id,
    114     "fp16",
    115     max_input_len=max_input_len,
    116     max_output_len=max_seq_len,
    117     tp_size=tp_size,
    118     batch_size=batch_size)
    121 recorder.print()

Cell In[4], line 75, in benchmark_inference(model_id, dtype, max_input_len, max_output_len, tp_size, batch_size)
     63 model = model.to(torch.bfloat16)
     65 inference_config = InferenceConfig(
     66     dtype=dtype,
     67     max_batch_size=batch_size,
   (...)
     73     use_cuda_kernel=True,
     74 )
---> 75 engine = InferenceEngine(model, tokenizer, inference_config, verbose=False)
     77 generation_config = GenerationConfig(
     78     pad_token_id=tokenizer.pad_token_id,
     79     max_length=max_input_len + max_output_len,
     80     # max_new_tokens=args.max_output_len,
     81 )
     82 tokens = gen_tokens(tokenizer, dataset, dataset_format)

File ~/.local/lib/python3.10/site-packages/colossalai/inference/core/engine.py:75, in InferenceEngine.__init__(self, model_or_path, tokenizer, inference_config, verbose, model_policy)
     72 self.verbose = verbose
     73 self.logger = get_dist_logger(__name__)
---> 75 self.init_model(model_or_path, model_policy)
     77 self.generation_config = inference_config.to_generation_config(self.model_config)
     79 self.tokenizer = tokenizer

File ~/.local/lib/python3.10/site-packages/colossalai/inference/core/engine.py:148, in InferenceEngine.init_model(self, model_or_path, model_policy)
    146 else:
    147     model_type = "nopadding_" + self.model_config.model_type
--> 148 model_policy = model_policy_map[model_type]()
    150 pg_mesh = ProcessGroupMesh(self.inference_config.pp_size, self.inference_config.tp_size)
    151 tp_group = pg_mesh.get_group_along_axis(TP_AXIS)

KeyError: 'nopadding_chatglm'
```
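The last frame shows where it goes wrong: init_model derives the policy key as "nopadding_" + self.model_config.model_type (ChatGLM2 reports model_type "chatglm") and indexes model_policy_map, which has no such entry. Listing the registered policies makes this visible (assuming the map is importable from colossalai.inference.modeling.policy, as in the 0.3.6 source tree; the module path may differ in other releases):

```python
from colossalai.inference.modeling.policy import model_policy_map

# On 0.3.6 this prints entries such as 'nopadding_llama';
# 'nopadding_chatglm' is absent, which is exactly the key that fails above.
print(sorted(model_policy_map.keys()))
```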

Environment

- ColossalAI 0.3.6
- PyTorch 2.3.1
- CUDA 12.1
- NVIDIA driver 545

yuehuayingxueluo commented 4 days ago

We have not yet adapted ChatGLM, but we plan to support these widely used models in the future.
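In the meantime, a cheap pre-check avoids hitting the opaque KeyError deep inside engine construction. A minimal sketch that mirrors the lookup in InferenceEngine.init_model() (same import assumption as above):

```python
from colossalai.inference.modeling.policy import model_policy_map

def has_inference_policy(model) -> bool:
    # Rebuild the key that InferenceEngine.init_model() looks up.
    return f"nopadding_{model.config.model_type}" in model_policy_map

# e.g. has_inference_policy(chatglm2_model) -> False on ColossalAI 0.3.6
```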