caseybasichis · Closed 2 weeks ago
Attempting to use MLX with Llama 3 via `mlx_lm` and the `generate` utility. It doesn't seem to respond to Llama's terminator token.

The original instruct models seem to have the wrong end-of-sequence (EOS) token specified. You can run `mlx_lm` with the flag `--eos-token "<|eot_id|>"`. Alternatively, use the quantized models from the Hugging Face MLX Community; those should have the right EOS token configured. If they don't, let us know and we will fix them.
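As a rough sketch of why a mismatched EOS token makes generation run past the terminator: the sampling loop stops only when the configured EOS id appears, so if Llama 3 instruct emits `<|eot_id|>` but the config points at `<|end_of_text|>`, the stop check never fires. The loop below is an illustrative stand-in, not the actual `mlx_lm` generation code; the token ids are the ones used in the Llama 3 vocabulary.

```python
EOT_ID = 128009          # <|eot_id|> — what Llama 3 instruct actually emits per turn
END_OF_TEXT_ID = 128001  # <|end_of_text|> — the EOS id some configs specify instead

def generate_until(token_stream, eos_id, max_tokens=10):
    """Collect tokens until eos_id appears or max_tokens is reached."""
    out = []
    for tok in token_stream:
        if tok == eos_id:
            break
        out.append(tok)
        if len(out) >= max_tokens:
            break
    return out

# A mock stream where the model signals the end of its turn with <|eot_id|>:
stream = [5, 6, 7, EOT_ID, 9, 9, 9, 9, 9, 9, 9]

# Checking for the wrong id blows straight past the real terminator:
print(generate_until(stream, END_OF_TEXT_ID))  # runs to max_tokens
# Checking for <|eot_id|> stops where the model intended:
print(generate_until(stream, EOT_ID))          # [5, 6, 7]
```

Passing `--eos-token "<|eot_id|>"` (or using a model whose config already sets it) is what makes the stop check match the token the model actually produces.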