Tlntin / ChatGLM2-6B-TensorRT

MIT License

RuntimeError: Number of IO tensors is not correct, must be 116, but you have 115 tensors #14

Open hanhan0521 opened 12 months ago

hanhan0521 commented 12 months ago

I followed your instructions and went through the steps for converting to TensorRT on the CPU version; everything went smoothly. But when I run demo.py, I get the following error. What could be the cause?

```
(tensorRT) user@lsp-ws:~/data/ChatGLM2-6B-TensorRT$ python demo.py
<module 'ckernel' from '/home/user/.cache/torch_extensions/py310_cu118/ckernel/ckernel.so'>
<class 'ckernel.Kernel'>
<instancemethod forward at 0x7f14fecfb8b0>
INFO: Loaded engine size: 11916 MiB
INFO: [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +11912, now: CPU 0, GPU 11912 (MiB)
Traceback (most recent call last):
  File "/home/user/data/ChatGLM2-6B-TensorRT/demo.py", line 222, in <module>
    model = Model("/ark-contexts/data/ChatGLM2-6B-TensorRT/models_fp32/chatglm6b2-bs1_with_cache.plan", 1)
  File "/home/user/data/ChatGLM2-6B-TensorRT/demo.py", line 26, in __init__
    self.kernel = Kernel(engine_path, batch_size)
RuntimeError: Number of IO tensors is not correct, must be 116, but you have 115 tensors
```

Tlntin commented 12 months ago

demo.py hasn't been adapted yet. TensorRT-LLM is currently in closed beta and is expected to be released next month, so perhaps wait a bit longer?
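For anyone hitting this before the TensorRT-LLM port lands: the 116-vs-115 mismatch can be narrowed down by listing the IO tensor names the deserialized engine actually expects (TensorRT ≥ 8.5 Python API) and diffing them against what the caller binds. A minimal sketch; the helper names and the bound-name list are illustrative, not taken from demo.py:

```python
def diff_tensor_names(engine_names, bound_names):
    """Return (missing, extra): engine IO tensors the caller does not bind,
    and bound names the engine does not know about."""
    engine_set, bound_set = set(engine_names), set(bound_names)
    return sorted(engine_set - bound_set), sorted(bound_set - engine_set)


def engine_io_names(engine_path):
    """Deserialize a TensorRT plan and return its IO tensor names."""
    import tensorrt as trt  # local import: only needed on the GPU machine
    runtime = trt.Runtime(trt.Logger(trt.Logger.WARNING))
    with open(engine_path, "rb") as f:
        engine = runtime.deserialize_cuda_engine(f.read())
    return [engine.get_tensor_name(i) for i in range(engine.num_io_tensors)]


if __name__ == "__main__":
    names = engine_io_names("models_fp32/chatglm6b2-bs1_with_cache.plan")
    print(len(names))  # the engine-side count (116 in this issue)
    for n in names:
        print(n)
```

Comparing that list against the 115 tensors the compiled `ckernel` sets up should reveal which single tensor (often one past-key/value cache entry) is unaccounted for.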

hanhan0521 commented 12 months ago

OK, thank you!

Tlntin commented 12 months ago

You're welcome.