Open shiqingzhangCSU opened 6 months ago
Yes, this repo only provides a solution to support Qwen1.5 (by replacing Qwen1). And TensorRT-LLM already natively supports Qwen1. So if you only want to optimize Qwen1, this repo can't help.
Thanks for your reply. I want to use Qwen1 as well as Qwen1.5 & 2, so I think I should add a new model type QWen2 in TensorRT-LLM (not just replace Qwen1). Furthermore, can they be adapted into one file, like chatglm1, 2, 3, and glm?
For the case where you need both Qwen1 and Qwen1.5/2, I don't have a solution yet, but you can do the optimizations sequentially.
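As a rough illustration of the "one file for several generations" idea raised above (the way chatglm1/2/3 share one model file), a builder could dispatch on the checkpoint's `model_type` field. This is a minimal hypothetical sketch with made-up builder names, not the actual TensorRT-LLM API:

```python
# Hypothetical sketch: route several Qwen generations through one entry point
# by dispatching on the config's model_type field. Builder names and the
# config layout are illustrative assumptions, not real TensorRT-LLM APIs.

def build_qwen1(config):
    return f"qwen1 model, hidden_size={config['hidden_size']}"

def build_qwen2(config):
    # Qwen1.5 and Qwen2 share an architecture, so one builder can cover both.
    return f"qwen2 model, hidden_size={config['hidden_size']}"

# Map the checkpoint's model_type string to the matching builder.
BUILDERS = {
    "qwen": build_qwen1,
    "qwen2": build_qwen2,  # Qwen1.5 and Qwen2 both report "qwen2"
}

def build_model(config):
    model_type = config["model_type"]
    if model_type not in BUILDERS:
        raise ValueError(f"unsupported model_type: {model_type}")
    return BUILDERS[model_type](config)
```

With this pattern, supporting a new generation only means registering one more builder in the dict rather than forking the whole model file.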
As title