Wire models in MLX LM

Requires #1510

On an M2 Ultra (nothing else needed), the numbers for Llama 3.1 70B in 16-bit precision look like:

| | not wired | wired |
|---|---|---|
| prompt (16 toks) | 2.35 toks/sec | 27.8 toks/sec |
| generation (100 toks) | 0.23 toks/sec | 4.7 toks/sec |
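
For a rough sense of scale, the speedups implied by the table can be worked out with a few lines of Python. This is only a sketch of the arithmetic: the throughput values are copied from the table above, and the names `results` and `speedup` are hypothetical, not part of MLX LM.

```python
# Throughput figures copied from the table above (tokens per second).
# The dictionary layout and function name are illustrative assumptions.
results = {
    "prompt (16 toks)": {"not_wired": 2.35, "wired": 27.8},
    "generation (100 toks)": {"not_wired": 0.23, "wired": 4.7},
}


def speedup(not_wired: float, wired: float) -> float:
    """Return the ratio of wired to non-wired throughput."""
    return wired / not_wired


speedups = {stage: speedup(**rates) for stage, rates in results.items()}

for stage, ratio in speedups.items():
    print(f"{stage}: {ratio:.1f}x faster when wired")
```

On these numbers, wiring gives roughly a 12x speedup for prompt processing and roughly a 20x speedup for generation.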

Command run:

python -m mlx_lm.generate --model meta-llama/Meta-Llama-3-70B-Instruct --prompt "Write a story about Einstein" --max-tokens 100

ml-explore / mlx-examples