Tlntin / ChatGLM2-6B-TensorRT

MIT License

RuntimeError: Number of IO tensors is not correct, must be 116, but you have 115 tensors #14

Open hanhan0521 opened 12 months ago

hanhan0521 commented 12 months ago

I followed your instructions and went through the steps for converting to TensorRT on the CPU version; everything went smoothly. But when I run demo.py, I get the following error. What could be the cause?

```
(tensorRT) user@lsp-ws:~/data/ChatGLM2-6B-TensorRT$ python demo.py
<module 'ckernel' from '/home/user/.cache/torch_extensions/py310_cu118/ckernel/ckernel.so'>
<class 'ckernel.Kernel'>
<instancemethod forward at 0x7f14fecfb8b0>
INFO: Loaded engine size: 11916 MiB
INFO: [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +11912, now: CPU 0, GPU 11912 (MiB)
Traceback (most recent call last):
  File "/home/user/data/ChatGLM2-6B-TensorRT/demo.py", line 222, in <module>
    model = Model("/ark-contexts/data/ChatGLM2-6B-TensorRT/models_fp32/chatglm6b2-bs1_with_cache.plan", 1)
  File "/home/user/data/ChatGLM2-6B-TensorRT/demo.py", line 26, in __init__
    self.kernel = Kernel(engine_path, batch_size)
RuntimeError: Number of IO tensors is not correct, must be 116, but you have 115 tensors
```

Tlntin commented 12 months ago

demo.py hasn't been adapted yet. TensorRT-LLM is currently in closed beta and is expected to be released next month, so perhaps wait a bit longer?
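For anyone hitting this before the TensorRT-LLM port lands: the 116-vs-115 mismatch can be narrowed down by listing the IO tensor names the deserialized engine actually expects (TensorRT ≥ 8.5 Python API) and diffing them against what the caller binds. A minimal sketch; the helper names and the bound-name list are illustrative, not taken from demo.py:

```python
def diff_tensor_names(engine_names, bound_names):
    """Return (missing, extra): engine IO tensors the caller does not bind,
    and bound names the engine does not know about."""
    engine_set, bound_set = set(engine_names), set(bound_names)
    return sorted(engine_set - bound_set), sorted(bound_set - engine_set)


def engine_io_names(engine_path):
    """Deserialize a TensorRT plan and return its IO tensor names."""
    import tensorrt as trt  # local import: only needed on the GPU machine
    runtime = trt.Runtime(trt.Logger(trt.Logger.WARNING))
    with open(engine_path, "rb") as f:
        engine = runtime.deserialize_cuda_engine(f.read())
    return [engine.get_tensor_name(i) for i in range(engine.num_io_tensors)]


if __name__ == "__main__":
    names = engine_io_names("models_fp32/chatglm6b2-bs1_with_cache.plan")
    print(len(names))  # the engine-side count (116 in this issue)
    for n in names:
        print(n)
```

Comparing that list against the 115 tensors the compiled `ckernel` sets up should reveal which single tensor (often one past-key/value cache entry) is unaccounted for.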

hanhan0521 commented 12 months ago

OK, thank you!

Tlntin commented 12 months ago

You're welcome.