Please describe your use case and why the current models may not support your need.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
In China and Taiwan, where the majority of use cases require strong support for Chinese (Simplified and Traditional), the GLM series is a leading open-source model family.
Describe a preferred serving framework
A clear and concise description of what you want to happen.
Per the glm-4-9b-chat model info page on Hugging Face, it is compatible with vLLM: "使用 vLLM 后端进行推理" ("run inference using the vLLM backend").
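Since the model card advertises vLLM compatibility, serving could presumably be done via vLLM's OpenAI-compatible API server pointed at the Hugging Face checkpoint. A minimal sketch, assuming a vLLM build recent enough to include GLM-4 support (exact flags may vary by version):

```shell
# Launch vLLM's OpenAI-compatible API server with the GLM-4-9B chat checkpoint.
# --trust-remote-code is required because THUDM ships custom modeling code.
python -m vllm.entrypoints.openai.api_server \
    --model THUDM/glm-4-9b-chat \
    --trust-remote-code \
    --max-model-len 8192

# Then query it like any OpenAI-style chat endpoint:
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "THUDM/glm-4-9b-chat", "messages": [{"role": "user", "content": "你好"}]}'
```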
Link to huggingface
Add any other context or screenshots about the feature request here.
https://huggingface.co/THUDM/glm-4-9b-chat