SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
As part of the Qwen2 series, Qwen2-57B-A14B has been released. It delivers performance competitive with 32B dense models and would be a really valuable addition to the community! However, it requires the Qwen2MoeForCausalLM architecture, which is not currently implemented in SGLang. Is there any plan to support it? Thanks!
https://qwenlm.github.io/blog/qwen2/