When will the model code support the Qwen series models?

jzhang38 / EasyContext

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.

Apache License 2.0

650 stars 47 forks

Open 233function opened 3 months ago

233function commented 3 months ago

Hello, author. When will the code framework support the extension of the context window for the Qwen series of models?

StrangeTcy commented 3 months ago

I was wondering the same thing. I guess I'll have to study the source code to find the Llama-specific tricks.

Kwen-Chen commented 3 months ago

In fact, Qwen's architecture is similar to Llama's, so you can use the existing Llama support in the code as a template for adding support for the Qwen family of models.
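As a rough illustration of that idea, here is a minimal sketch of reusing a Llama-specific patch for a second, architecturally similar model type. Everything in it is hypothetical: `patch_llama_attention`, `ATTENTION_PATCHES`, `register_like`, and `apply_patch` are made-up names for illustration and are not EasyContext's actual API, and the "patch" only returns a string instead of touching a real model.

```python
from typing import Callable, Dict


# Hypothetical stand-in for a Llama-specific attention patch.
# A real patch would replace the model's attention forward pass.
def patch_llama_attention(model_type: str) -> str:
    return f"patched {model_type} attention with the llama-style forward"


# Hypothetical registry: model type -> patch function.
ATTENTION_PATCHES: Dict[str, Callable[[str], str]] = {
    "llama": patch_llama_attention,
}


def register_like(existing: str, new: str) -> None:
    """Register `new` so that it reuses the patch already registered for `existing`."""
    ATTENTION_PATCHES[new] = ATTENTION_PATCHES[existing]


def apply_patch(model_type: str) -> str:
    """Look up and apply the patch for `model_type`, failing loudly if none exists."""
    if model_type not in ATTENTION_PATCHES:
        raise ValueError(f"no patch registered for {model_type!r}")
    return ATTENTION_PATCHES[model_type](model_type)


# Assumption for this sketch: the Qwen variant is close enough to Llama
# that the same patch can be reused unchanged.
register_like("llama", "qwen2")
print(apply_patch("qwen2"))
```

Any real port would still need to check the places where Qwen's modeling code differs from Llama's before reusing the patch as-is.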