epfLLM / Megatron-LLM

distributed trainer for LLMs

Support QWen? #96

Open Vincent131499 opened 5 months ago

Vincent131499 commented 5 months ago

Great job! Qwen is an open-source model widely used by the community. Does Megatron-LLM support training this model?

martinjaggi commented 5 months ago

Good idea. It would be nice if someone could open a PR to support Qwen 1.5 or, later, Qwen 2.

It should be quite easy, as the architecture seems very similar to the ones the current code already supports.