QwenLM / Qwen2

Qwen2 is the large language model series developed by the Qwen team, Alibaba Cloud.

💡 [REQUEST] - <Is there a vLLM deployment of Qwen1.5-110b-chat?> #392

Closed: ucas010 closed this issue 3 months ago

ucas010 commented 4 months ago

Start Date

No response

Implementation PR

Deploy the latest model.

Reference Issues

No response

Summary

Deploy the latest large model.

Basic Example

Drawbacks

Unresolved questions

No response

ucas010 commented 4 months ago

Is there an int4 version? How can it be deployed?

jklj077 commented 4 months ago

The quantized model (using AWQ) is provided here: https://huggingface.co/Qwen/Qwen1.5-110B-Chat-AWQ
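For anyone landing here with the same question, a minimal sketch of how the AWQ checkpoint linked above might be served through vLLM's OpenAI-compatible server. The helper name `build_vllm_server_command`, the tensor-parallel size of 4, and port 8000 are illustrative assumptions, not values from this thread; choose the GPU count to match your hardware. The sketch only assembles the launch command and does not start the server itself.

```python
import shlex


def build_vllm_server_command(model, tensor_parallel_size=4, port=8000):
    """Assemble a launch command for vLLM's OpenAI-compatible API server.

    Hypothetical helper for illustration. The defaults (4 GPUs, port 8000)
    are assumptions; adjust them to your own setup.
    """
    if tensor_parallel_size < 1:
        raise ValueError("tensor_parallel_size must be at least 1")
    return [
        "python", "-m", "vllm.entrypoints.openai.api_server",
        "--model", model,
        # Tell vLLM the checkpoint weights are AWQ-quantized.
        "--quantization", "awq",
        # Shard the model across this many GPUs.
        "--tensor-parallel-size", str(tensor_parallel_size),
        "--port", str(port),
    ]


if __name__ == "__main__":
    cmd = build_vllm_server_command("Qwen/Qwen1.5-110B-Chat-AWQ")
    # Print a shell-ready command line; run it in an environment with vLLM installed.
    print(shlex.join(cmd))
```

Running the printed command in an environment where vLLM is installed should expose an OpenAI-style endpoint on the chosen port. Whether the model fits depends on available GPU memory, so the tensor-parallel size may need tuning.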