QwenLM / Qwen

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Apache License 2.0

[BUG] the max_position_embeddings parameter in the config.json for Qwen2-57B-A14B has been mistakenly set to 131072. #1287

Closed: CFTangT closed this issue 3 months ago

CFTangT commented 3 months ago

The context length for Qwen2-57B-A14B is 32k, but config.json sets both max_position_embeddings and sliding_window to 131072 by default, which seems to be incorrect. In comparison, the config.json for Qwen2-57B-A14B-Instruct sets both to 32768, which appears to be more appropriate.

links: https://huggingface.co/Qwen/Qwen2-57B-A14B/blob/main/config.json
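
For anyone who wants to reproduce the comparison, here is a minimal sketch of the check. It assumes the two keys are spelled `max_position_embeddings` and `sliding_window` in config.json; the helper name `find_context_mismatches` and the two JSON fragments are hypothetical and carry only the values quoted in this report, not the full files.

```python
import json

# Keys discussed in this report. The exact spelling `sliding_window`
# is an assumption about the file, not something verified here.
CONTEXT_KEYS = ("max_position_embeddings", "sliding_window")


def find_context_mismatches(config, expected_length):
    """Return {key: value} for context keys whose value differs from expected_length."""
    return {
        key: config[key]
        for key in CONTEXT_KEYS
        if key in config and config[key] != expected_length
    }


# Illustrative fragments built from the values quoted in this report.
base_config = json.loads(
    '{"max_position_embeddings": 131072, "sliding_window": 131072}'
)
instruct_config = json.loads(
    '{"max_position_embeddings": 32768, "sliding_window": 32768}'
)

expected = 32 * 1024  # 32k context length, i.e. 32768

print(find_context_mismatches(base_config, expected))
print(find_context_mismatches(instruct_config, expected))
```

To run the same check against the real file, download config.json from the link above and pass the result of `json.load` on it in place of `base_config`.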