Closed Blaizzy closed 3 weeks ago
This doesn’t look like the correct fix to me. We need to merge the scales if they are present. Could you point to the model that didn’t work so we can debug?
I could be wrong, but from what I can see, both Qwen MoE models don’t use scales.
Precisely, the Qwen2-57B-A14B model failed to convert.
Those are scales for the quantization, so indeed they won't be in the fp16 models.
I do not have a problem converting Qwen/Qwen2-57B-A14B-Instruct on main. Can you share the error message you are getting?
But they will be present in quantized models. I see it now.
I will add a condition to skip scales for f16 or f32 models.
You don't need to add that, it will not try to use scales if they are not present already. See Line 228.
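To illustrate the behavior being described — merging quantization scales only when they actually exist in the checkpoint — here is a minimal, hypothetical sketch. The function and key names are illustrative, not MLX LM's actual API, and NumPy stands in for `mlx.core` to keep the snippet self-contained:

```python
import numpy as np  # stand-in for mlx.core in this sketch

def merge_experts(experts):
    """Stack per-expert weight dicts into combined arrays.

    `experts` is a list of dicts like {"weight": ..., "scales": ...}.
    Quantization scales exist only in quantized checkpoints, so we
    check the first expert before trying to merge them -- fp16/fp32
    models simply never enter the scales branch.
    """
    merged = {"weight": np.stack([e["weight"] for e in experts])}
    if "scales" in experts[0]:  # present only for quantized models
        merged["scales"] = np.stack([e["scales"] for e in experts])
    return merged
```

With this guard, no explicit "skip scales for f16/f32" condition is needed: an fp16 checkpoint has no `scales` entries, so the branch is never taken.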
I think maybe the problem is you were trying to convert using an old MLX LM. The current version (on main) already works for Qwen2 MOEs.
> I do not have a problem converting Qwen/Qwen2-57B-A14B-Instruct on main. Can you share the error message you are getting?
Alright, give me 15 minutes to redownload the model.
> You don't need to add that, it will not try to use scales if they are not present already. See Line 228.
I noticed, and checked the git blame to trace the changes. From what I remember, that's the line causing the error. But let me double-check.
That's weird, today the same code is working fine. Yesterday both the PyPI and GitHub installations gave an error, pointing exactly at line 228 and saying something about the first expert not having scales, but I didn't save the message anywhere. When I made the change proposed here, it worked.
This PR fixes Qwen2 MoE conversion and inference. Twitter: @Prince_Canuma