Open qianjyM opened 6 months ago
Using a different dimension in each layer is not supported. If you want to run your model, you could implement a new model based on an existing one and set a different shape for each layer. This might also affect other parts, such as the checkpoint converter.
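As a rough starting point for the "different shape for each layer" part, here is a minimal sketch that collects the per-layer q/k/v output dimensions from a checkpoint's parameter shapes. This is not a TensorRT-LLM API: the function name `collect_qkv_shapes`, the parameter-name pattern (`layers.N.self_attn.{q,k,v}_proj.weight`), and the example shapes are all assumptions for illustration, so adjust them to whatever naming the pruned checkpoint actually uses.

```python
import re
from collections import defaultdict

# Hypothetical parameter-name pattern; the pruned checkpoint may name
# its attention projections differently.
_QKV_PATTERN = re.compile(r"layers\.(\d+)\.self_attn\.(q|k|v)_proj\.weight$")


def collect_qkv_shapes(shapes):
    """Return {layer_index: {'q': out_dim, 'k': out_dim, 'v': out_dim}}.

    `shapes` maps parameter names to (out_features, in_features) tuples,
    for example {name: tuple(t.shape) for name, t in state_dict.items()}.
    """
    per_layer = defaultdict(dict)
    for name, shape in shapes.items():
        match = _QKV_PATTERN.search(name)
        if match:
            layer = int(match.group(1))
            proj = match.group(2)
            per_layer[layer][proj] = shape[0]  # output dimension of the projection
    return dict(sorted(per_layer.items()))


# Made-up shapes for a two-layer model whose layers were pruned differently.
example_shapes = {
    "model.layers.0.self_attn.q_proj.weight": (4096, 4096),
    "model.layers.0.self_attn.k_proj.weight": (4096, 4096),
    "model.layers.0.self_attn.v_proj.weight": (4096, 4096),
    "model.layers.1.self_attn.q_proj.weight": (3072, 4096),
    "model.layers.1.self_attn.k_proj.weight": (3072, 4096),
    "model.layers.1.self_attn.v_proj.weight": (3072, 4096),
    "model.layers.1.mlp.up_proj.weight": (11008, 4096),
}

qkv_dims = collect_qkv_shapes(example_shapes)
for layer, dims in qkv_dims.items():
    print(layer, dims)
```

A table like this could then drive both the per-layer shapes in a custom model definition and the matching changes in the checkpoint converter.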
Hi @qianjyM, do you still have any further issues or questions? If not, we'll close this soon.
Hi there,
I'd like to ask how a pruned model can be deployed using TensorRT-LLM. Since the qkv dimensions differ from layer to layer, the model is stored using torch.save rather than save_pretrained, so I'm a little confused about how to use TensorRT-LLM with it. Could you please give me some tips or advice?
Thanks!