In Hugging Face transformers, a tokenizer's chat_template can be a dict instead of a str. This is the case, for example, for the popular Command-R model. For such models, a Runpod serverless endpoint created from the vLLM template crashes and reboots repeatedly, because the OpenAIServingChat class expects a str for its chat_template constructor argument.
This patch fixes the crash by replacing a dict chat_template with the string stored under its "default" key. This key should exist in every valid dict chat template, since transformers would otherwise fail inside apply_chat_template.
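
The idea can be sketched as follows. This is a minimal illustration, not the patch itself: the helper name `resolve_chat_template` and the sample template strings are hypothetical, and the sketch assumes that a dict-style template maps template names to Jinja strings.

```python
from typing import Mapping, Optional, Union

# A chat template may be a plain string, a name -> template mapping, or absent.
ChatTemplate = Union[str, Mapping[str, str], None]


def resolve_chat_template(chat_template: ChatTemplate) -> Optional[str]:
    """Hypothetical helper: reduce a chat template to a plain string."""
    if isinstance(chat_template, Mapping):
        # Dict-style template: pick the "default" entry.
        # Raises KeyError if the key is missing.
        return chat_template["default"]
    # Already a str (or None): pass it through unchanged.
    return chat_template


# Illustrative values only, not taken from any real model.
named_templates = {"default": "{{ messages }}", "tool_use": "{{ tools }}"}
resolved = resolve_chat_template(named_templates)
```

The resolved value is what would then be passed on as the chat_template constructor argument. A missing "default" key surfaces as a KeyError at startup, which makes the misconfiguration visible instead of silently choosing another template.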