exo-explore / exo

Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚

local model loading without request to huggingface #146

Open artistlu opened 1 month ago

artistlu commented 1 month ago

I have a large number of Mali GPU devices. I have already copied the 8B model to each of my nodes, but the latest version of exo uses the download/hf functionality to download models from the Hugging Face repository.

Given that my nodes cannot access the Hugging Face servers, could the project provide a way to configure a local model path and load models directly from local storage? This would be extremely helpful for users like me who have multiple Mali GPU devices and cannot rely on the remote model downloading functionality.
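Roughly, what I would like to be able to do is something like the following. This is just an example of a local-only load using the standard Hugging Face tooling (the repo id is only illustrative), not exo's current configuration:

```python
# Example only: standard huggingface_hub usage, not exo's own loader.
# Assumes the model snapshot has already been copied onto every node,
# into the local Hugging Face cache (or another directory).
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="meta-llama/Meta-Llama-3-8B",  # example repo id
    local_files_only=True,                 # never touch the network
)
print("loading weights from", local_dir)
```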

Additionally, I have a few questions regarding the download/hf implementation:

- When using download/hf, does each node download the full model, or is the model partitioned so that each node only loads its own part of the model?
- If the model is partitioned, how can I configure the local paths for each partition so that the model loads correctly on my Mali GPU-based nodes? (A rough sketch of what I mean follows below.)

A solution that allows direct local model loading would be a game-changer for users like me who are working with Mali GPU devices and cannot reach remote model repositories. I would greatly appreciate any assistance the project can provide in addressing this issue.
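To make the second question concrete, here is a purely hypothetical sketch of what I mean by resolving a partition to local files. The directory, file names, and index format follow the usual safetensors layout; none of this reflects exo's actual sharding code:

```python
# Hypothetical illustration of per-node partition loading; exo's real
# shard layout and API may differ.
import json
from pathlib import Path

MODEL_DIR = Path("/models/llama-3-8b")  # assumed local copy on every node

def files_for_layers(start_layer: int, end_layer: int) -> set[Path]:
    """Use the safetensors index to find which weight files cover a layer range."""
    index = json.loads((MODEL_DIR / "model.safetensors.index.json").read_text())
    needed = set()
    for tensor_name, filename in index["weight_map"].items():
        # Tensor names look like "model.layers.12.self_attn.q_proj.weight".
        if tensor_name.startswith("model.layers."):
            layer = int(tensor_name.split(".")[2])
            if start_layer <= layer <= end_layer:
                needed.add(MODEL_DIR / filename)
        else:
            # Embeddings, norms and lm_head go to whichever node needs them.
            needed.add(MODEL_DIR / filename)
    return needed

print(files_for_layers(0, 15))
```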

Thank you in advance for your consideration and support.

AlexCheema commented 1 month ago

Renamed the issue. Let me know if this is accurate: we need an option to entirely disable the request to huggingface and only load from local disk.
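For the sake of discussion, the switch could look something like this. EXO_OFFLINE is just a placeholder name, not an existing option:

```python
# Sketch of an offline guard; EXO_OFFLINE is an assumed variable name,
# not something exo currently reads.
import os
from pathlib import Path

OFFLINE = os.environ.get("EXO_OFFLINE", "0") == "1"

def resolve_model(repo_id: str, local_dir: Path) -> Path:
    """Return a local model directory, refusing to touch the network when offline."""
    if local_dir.exists():
        return local_dir
    if OFFLINE:
        raise FileNotFoundError(
            f"{repo_id} not found at {local_dir} and EXO_OFFLINE=1 forbids downloading"
        )
    # Otherwise fall back to the normal Hugging Face download path (omitted here).
    raise NotImplementedError("download path omitted in this sketch")
```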

artistlu commented 1 month ago

> Renamed the issue. Let me know if this is accurate: we need an option to entirely disable the request to huggingface and only load from local disk.

Yes. If this could be done by configuring a local model path and loading from it, that would be perfect.