microsoft / DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

I can't tell from the documentation whether we're meant to apply a chat template ourselves or whether it's applied automatically #448

Open sidagarwal2805 opened 8 months ago

sidagarwal2805 commented 8 months ago

For example, at the moment I have a rough chat template:

"[INST] Classify the following text between the delimiters as "normal" or "abnormal" and output your response in JSON format. TEXT: {{{sample_text}}} [/INST] RESPONSE: "

Is this the correct usage for Llama/Mistral models, or should I not be using chat templates at all?
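
To make the question concrete, a template like the one above could be filled in with plain Python string formatting. This is only a sketch under assumptions: `build_prompt` is a hypothetical helper (not part of MII), and it assumes the triple braces are f-string escaping, so the sample text ends up wrapped in literal `{` `}` delimiters. It does not settle whether MII applies a chat template on its own.

```python
def build_prompt(sample_text: str) -> str:
    """Hypothetical helper: wrap the classification request in [INST] ... [/INST] tags."""
    return (
        '[INST] Classify the following text between the delimiters as "normal" or '
        '"abnormal" and output your response in JSON format. '
        # In an f-string, "{{" and "}}" are literal braces, so the text is
        # emitted as {sample_text} with one brace on each side (assumed intent).
        f"TEXT: {{{sample_text}}} [/INST] RESPONSE: "
    )


if __name__ == "__main__":
    print(build_prompt("hello"))
```

The resulting string would then be passed to the model as the prompt. If the serving layer already wrapped prompts in `[INST]` tags, the tags would appear twice, which is why the distinction matters.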