stikkireddy / mlflow-extensions

Deploy models quickly to databricks via mlflow based serving infra.
https://stikkireddy.github.io/mlflow-extensions/
Apache License 2.0
19 stars 11 forks source link

[MODEL] glm4-9b-chat #73

Open jessiewen-databricks opened 3 weeks ago

jessiewen-databricks commented 3 weeks ago

Please describe your use case and why the current models may not support your need. A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

In China/Taiwan, where majority of the use-cases requires better support for Chinese/Traditional Chinese, GLM series is a leading OSS model family.

Describe a preferred serving framework A clear and concise description of what you want to happen.

Per glm4-9b-chat Model info page on huggingface, it is compatible with vLLM "使用 vLLM后端进行推理"

Link to huggingface Add any other context or screenshots about the feature request here. https://huggingface.co/THUDM/glm-4-9b-chat