Open danielclough opened 3 months ago
Actually, using Mixtral on Metal was falling back to f32, as we didn't have bf16 support for matmul on gemm, so it would require > 200GB of memory to work. @ivarflakstad just added support for bf16 matmul in #2364, so you can give it another try with the changes in #2378 (I cannot really try it as my mac is not beefy enough). If that works well, we'll generalize it to other bf16-based models.
My system is using DType::F32 from device.bf16_default_to_f32(). Can I provide any useful info for debugging?
Had to revert because, while it was correct on M3, it was not on M1/M2. I'm looking into it :)
I tried Mixtral on my 128GB M1 Mac. The process state flips between running and stuck. The CLI output from the example is not progressing past this point:
I started the process with the command from an up-to-date repo: