Enhancement: support local and custom models

At the moment, exo only supports loading models from the Hugging Face cache folder. If you have your own quants of a given model, you need to create a folder under the HF cache that follows the HF downloader's conventions (for example, "~/.cache/huggingface/hub/models--mlx-community--Meta-Llama-3.1-8B-Instruct-4bit/refs/main") with a dummy commit SHA.
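For reference, the workaround can be scripted roughly as follows. This is only a sketch of the Hugging Face hub cache layout (`refs/main` holding a commit hash, `snapshots/<hash>/` holding the files). The function name `register_local_model`, the all-zero dummy SHA, and the use of symlinks are my own assumptions, and I have not confirmed that exo needs nothing beyond this layout.

```python
import os
from pathlib import Path

# Default cache location, as used in the example path in this issue.
DEFAULT_CACHE = Path.home() / ".cache" / "huggingface" / "hub"


def register_local_model(local_dir, repo_id, cache_root=DEFAULT_CACHE,
                         dummy_sha="0" * 40):
    """Expose local_dir under cache_root using the HF hub cache layout.

    Hypothetical helper: the name and the dummy SHA are illustrative only.
    """
    local_dir = Path(local_dir).resolve()
    # "org/name" becomes "models--org--name" inside the cache.
    repo_dir = Path(cache_root) / ("models--" + repo_id.replace("/", "--"))

    # refs/main stores the commit hash that the "main" revision points to.
    refs_dir = repo_dir / "refs"
    refs_dir.mkdir(parents=True, exist_ok=True)
    (refs_dir / "main").write_text(dummy_sha)

    # snapshots/<sha>/ holds the model files; link them instead of copying.
    snapshot_dir = repo_dir / "snapshots" / dummy_sha
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    for src in local_dir.iterdir():
        dst = snapshot_dir / src.name
        if not dst.exists() and not dst.is_symlink():
            os.symlink(src, dst)
    return snapshot_dir
```

Calling `register_local_model("/path/to/my-quant", "my-org/my-quant")` would then create the `models--my-org--my-quant` folder next to the downloaded models. Symlinks may need extra privileges on Windows, in which case copying the files is the fallback.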

It would be great if exo could load models from any given folder, so that local and custom models are supported. Thanks!
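To make the request concrete, here is a minimal sketch of the lookup order I have in mind: use the argument as a plain folder if it is one, otherwise fall back to the HF cache. `resolve_model_dir` is a hypothetical name and not an existing exo function, and the cache handling assumes the standard `refs/main` plus `snapshots/<hash>` layout.

```python
from pathlib import Path

# Default cache location, as used in the example path in this issue.
DEFAULT_CACHE = Path.home() / ".cache" / "huggingface" / "hub"


def resolve_model_dir(model, cache_root=DEFAULT_CACHE):
    """Return the folder holding the model files, or None if not found.

    Hypothetical helper: `model` is either a local folder or an
    "org/name" repo id.
    """
    # 1. A plain local folder wins, so custom quants need no cache tricks.
    candidate = Path(model).expanduser()
    if candidate.is_dir():
        return candidate.resolve()

    # 2. Otherwise look for a snapshot in the HF cache.
    repo_dir = Path(cache_root) / ("models--" + str(model).replace("/", "--"))
    ref_file = repo_dir / "refs" / "main"
    if ref_file.is_file():
        sha = ref_file.read_text().strip()
        snapshot_dir = repo_dir / "snapshots" / sha
        if snapshot_dir.is_dir():
            return snapshot_dir

    # 3. Not available locally; the caller would have to download it.
    return None
```

With something like this, passing `/path/to/my-quant` and passing `mlx-community/Meta-Llama-3.1-8B-Instruct-4bit` would both work without touching the cache by hand.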

Enhancement: support local and custom models #165