[Bug] Inference error

kaoyansoft123 commented 8 months ago

Checklist

[ ] 1. I have searched related issues but cannot get the expected help.
[ ] 2. The bug has not been fixed in the latest version.

Describe the bug

model_source: hf_model
WARNING: Can not find tokenizer.json. It may take long time to initialize the tokenizer.
WARNING: Can not find tokenizer.json. It may take long time to initialize the tokenizer.
model_config:
{
  "model_name": "internlm-chat-7b",
  "tensor_para_size": 1,
  "head_num": 32,
  "kv_head_num": 32,
  "vocab_size": 103168,
  "num_layer": 32,
  "inter_size": 11008,
  "norm_eps": 1e-06,
  "attn_bias": 1,
  "start_id": 1,
  "end_id": 2,
  "session_len": 2056,
  "weight_type": "fp16",
  "rotary_embedding": 128,
  "rope_theta": 10000.0,
  "size_per_head": 128,
  "group_size": 0,
  "max_batch_size": 64,
  "max_context_token_num": 1,
  "step_length": 1,
  "cache_max_entry_count": 0.5,
  "cache_block_seq_len": 128,
  "cache_chunk_size": 1,
  "use_context_fmha": 1,
  "quant_policy": 0,
  "max_position_embeddings": 2048,
  "rope_scaling_factor": 0.0,
  "use_logn_attn": 0
}
get 323 model params
Exception in thread Thread-4 (_create_model_instance):
Traceback (most recent call last):
  File "/mnt/bigdata/chatglm2/miniconda3/envs/xtuner-env/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/mnt/bigdata/chatglm2/miniconda3/envs/xtuner-env/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/mnt/bigdata/chatglm2/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/lmdeploy/turbomind/turbomind.py", line 434, in _create_model_instance
    model_inst = self.tm_model.model_comm.create_model_instance(
RuntimeError: [TM][ERROR] CUDA runtime error: operation not supported /lmdeploy/src/turbomind/utils/allocator.h:169

session 1

Reproduction

lmdeploy chat turbomind internlm-chat-7b --model-name internlm-chat-7b

Environment

lmdeploy 0.1.0
CUDA 11.7
torch 2.1.1
Python 3.10

Error traceback

No response

lvhan028 commented 8 months ago

Kindly let us know your target NVIDIA device.

kaoyansoft123 commented 8 months ago

@lvhan028 Driver Version: 510.73.08 CUDA Version: 11.6

lvhan028 commented 8 months ago

I mean the hardware: A100, A10, V100, or something else?

kaoyansoft123 commented 8 months ago

@lvhan028 V100
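
The two details requested in this exchange (GPU model and driver version) can be collected in one step. The following is a minimal sketch, not part of lmdeploy: it assumes `nvidia-smi` is on the PATH, and the helper names `parse_gpu_csv` and `query_gpus` are hypothetical. It calls `nvidia-smi --query-gpu=name,driver_version --format=csv,noheader` and parses the comma-separated result.

```python
import shutil
import subprocess


def parse_gpu_csv(text):
    """Parse the output of
    `nvidia-smi --query-gpu=name,driver_version --format=csv,noheader`
    into a list of dicts, one per GPU."""
    gpus = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        # Split on the last comma so the driver version is always the final field.
        name, driver = [field.strip() for field in line.rsplit(",", 1)]
        gpus.append({"name": name, "driver_version": driver})
    return gpus


def query_gpus():
    """Return GPU info from nvidia-smi, or an empty list if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    for index, gpu in enumerate(query_gpus()):
        print(f"GPU {index}: {gpu['name']} (driver {gpu['driver_version']})")
```

Pasting the printed lines into the Environment section answers the hardware question up front.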

InternLM / lmdeploy

[Bug] Inference error #868