At the very least, when using the new Qwen2.5 models (which still use the Qwen2 architecture), the `architecture` line in the MoE YAML file needs to be, verbatim, `architecture: Qwen MoE`. Otherwise the Qwen models won't be detected correctly.
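For reference, a minimal sketch of what that line looks like in the MoE YAML file. Only the `architecture` line is taken from the note above; the remaining keys of the config are left out here rather than guessed.

```yaml
# Fragment of the MoE YAML config (sketch; all other keys omitted).
# The value must match verbatim, including capitalization and the space.
architecture: Qwen MoE
```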
Thanks for the doc fix! It looks like you're also bringing in the changes from multi-module-architecture though - that isn't quite ready to merge into main. Could you please base this off of main?