Closed nguyenhoangthuan99 closed 1 month ago
@nguyenhoangthuan99 can you elaborate on this issue? Are you talking about only including jinja2 in cortex.llamacpp, instead of the overall cortexcpp packager or soemthing else?
Problem
Solution
All model with gguf file format only run with cortex.llamacpp engines, for that reason, we will move the part parse chat template for cortex.llamacpp engines. And this part will be executed during runtime (when user start a model using cortex.llamacpp engine, it will parse chat template).
This solution require more effort and can save 60 Mb of binary file.
closing issue, thanks @nguyenhoangthuan99