When Chat Templates were released in Transformers v4.34, I began investigating a concrete HuggingEngine in the hf-concrete-base branch, but found chat templates pretty difficult to work with in the Kani model:
"malformed" chat histories (e.g. two consecutive user messages) raised a raw Jinja2 error
it was difficult to get the token length of a single chat message, because of the above (e.g. a chat history consisting of a single assistant message)
Instead, I think it's possible to make a more generic ChatMessage[] -> str prompt builder, which would make life easier for anyone who wants to implement a HuggingEngine and allow us to refactor the existing example HuggingEngine implementations. Some considerations:
ability to translate message types (e.g. FUNCTION -> USER)
ability to group consecutive messages by type for groupwise wrapping (e.g. USER, USER -> <s> [INST] USER [/INST]; SYSTEM, USER -> <s> [INST] <<SYS>> SYS <</SYS>> USER [/INST])
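To make the two considerations concrete, here is a rough sketch of what such a builder could look like for the LLaMA 2 style format used in the examples. Everything in it is hypothetical: `Role`, `Message`, `translate`, and `build_llama2_prompt` are stand-ins rather than Kani's actual classes, and joining grouped messages with a newline and closing assistant turns with `</s>` are assumptions, not something decided here.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import groupby


class Role(Enum):
    # Hypothetical stand-in for Kani's chat roles.
    SYSTEM = "system"
    USER = "user"
    ASSISTANT = "assistant"
    FUNCTION = "function"


@dataclass
class Message:
    # Hypothetical stand-in for ChatMessage.
    role: Role
    content: str


def translate(messages, mapping):
    """Rewrite roles according to `mapping`, e.g. FUNCTION -> USER."""
    return [Message(mapping.get(m.role, m.role), m.content) for m in messages]


def build_llama2_prompt(messages):
    # Step 1: translate roles the prompt format has no slot for.
    messages = translate(messages, {Role.FUNCTION: Role.USER})
    parts = []
    # Step 2: group consecutive non-assistant messages so each run is wrapped once.
    for is_assistant, group in groupby(messages, key=lambda m: m.role is Role.ASSISTANT):
        group = list(group)
        if is_assistant:
            # Assumption: each assistant turn is closed with an EOS token.
            parts.extend(f" {m.content} </s>" for m in group)
            continue
        # Assumption: messages of the same role within a group are newline-joined.
        system = "\n".join(m.content for m in group if m.role is Role.SYSTEM)
        user = "\n".join(m.content for m in group if m.role is Role.USER)
        sys_block = f"<<SYS>> {system} <</SYS>> " if system else ""
        parts.append(f"<s> [INST] {sys_block}{user} [/INST]")
    return "".join(parts)


print(build_llama2_prompt([Message(Role.SYSTEM, "sys"), Message(Role.USER, "hi")]))
```

Because the builder is plain Python rather than a Jinja2 template, two consecutive user messages or a history containing only one assistant message produce a string instead of an error, so the token length of a single message can be measured by building a one-message prompt.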