exo-explore / exo

Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
GNU General Public License v3.0

[BOUNTY - $200] Support MLX community models in tinygrad inference engine #200

Open AlexCheema opened 2 months ago

AlexCheema commented 2 months ago
radenmuaz commented 22 hours ago

Does this bounty also require porting the MLX modelling code to tinygrad? According to the mlx-examples library, different models on mlx-community require different modelling code. exo currently only has llama, and the tinygrad llama modelling code is incompatible with (different from) the weights of qwen, etc.

https://github.com/ml-explore/mlx-examples/blob/bd6d910ca3744d75bf704e6e7039f97f71014bd5/llms/mlx_lm/utils.py#L81

Though if the models are ported from MLX to tinygrad, we wouldn't need a converter anymore.
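To make the incompatibility concrete: much of it is a tensor-naming mismatch between Hugging Face/MLX-style checkpoints (`model.layers.N.self_attn.q_proj.weight`, ...) and the original-Llama-style names that tinygrad's llama example expects (`layers.N.attention.wq.weight`, ...). Below is a minimal sketch of a key remapper; the specific mapping table is an illustrative assumption, not exo's or tinygrad's actual converter, and real models (qwen, etc.) would also need architectural changes, not just renaming.

```python
import re

# Assumed mapping from HF/MLX-style tensor names to Llama-style names
# (illustrative; verify against the target model's actual state dict).
MLX_TO_TINYGRAD = [
    (r"^model\.embed_tokens\.", "tok_embeddings."),
    (r"^model\.layers\.(\d+)\.self_attn\.q_proj\.", r"layers.\1.attention.wq."),
    (r"^model\.layers\.(\d+)\.self_attn\.k_proj\.", r"layers.\1.attention.wk."),
    (r"^model\.layers\.(\d+)\.self_attn\.v_proj\.", r"layers.\1.attention.wv."),
    (r"^model\.layers\.(\d+)\.self_attn\.o_proj\.", r"layers.\1.attention.wo."),
    (r"^model\.layers\.(\d+)\.mlp\.gate_proj\.", r"layers.\1.feed_forward.w1."),
    (r"^model\.layers\.(\d+)\.mlp\.down_proj\.", r"layers.\1.feed_forward.w2."),
    (r"^model\.layers\.(\d+)\.mlp\.up_proj\.", r"layers.\1.feed_forward.w3."),
    (r"^model\.norm\.", "norm."),
    (r"^lm_head\.", "output."),
]

def remap_key(name: str) -> str:
    """Rewrite one checkpoint tensor name; unknown names pass through unchanged."""
    for pattern, repl in MLX_TO_TINYGRAD:
        new, n = re.subn(pattern, repl, name)
        if n:
            return new
    return name

def remap_state_dict(weights: dict) -> dict:
    """Apply the renaming to a whole {name: tensor} mapping."""
    return {remap_key(k): v for k, v in weights.items()}
```

For example, `remap_key("model.layers.0.self_attn.q_proj.weight")` yields `"layers.0.attention.wq.weight"`. A renamer like this only covers models whose architecture already matches the llama code; that is why radenmuaz's point stands, since models with different layer structure would need their own tinygrad modelling code rather than just a converter.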