What's the problem? (if there are multiple - list as bullet points)
"Prompt templates: One thing you missed: I really don't think the way you are implementing support for HF models is good. From the quick look I had here you just take the messages from the user & assistant and concatenate them in some really simple way (except for llama2-chat where you did implement support for the template for some reason).
That's not how these models are supposed to be used. Different models need a different way to translate the conversation into a prompt string. For an example on how to actually do this correctly, have a look at this file from the popular FastChat repository. I personally don't use their prompt templates and implement them myself, but that has to do with my requirement for evaluation which might not apply for you.
I think for FastChat, you might be able to also just import their library and use their prompt templates. Or copy them over if you want. Because I really wouldn't recommend implementing them all yourself. But just concatenating the message contents like that isn't correct and won't give actual good outputs."
What's the problem? (if there are multiple - list as bullet points) "Prompt templates: One thing you missed: I really don't think the way you are implementing support for HF models is good. From the quick look I had here you just take the messages from the user & assistant and concatenate them in some really simple way (except for llama2-chat where you did implement support for the template for some reason).
That's not how these models are supposed to be used. Different models need a different way to translate the conversation into a prompt string. For an example on how to actually do this correctly, have a look at this file from the popular FastChat repository. I personally don't use their prompt templates and implement them myself, but that has to do with my requirement for evaluation which might not apply for you.
I think for FastChat, you might be able to also just import their library and use their prompt templates. Or copy them over if you want. Because I really wouldn't recommend implementing them all yourself. But just concatenating the message contents like that isn't correct and won't give actual good outputs."