exo-explore / exo

Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
GNU General Public License v3.0

Where can I set the path for the large model I have already downloaded? #28

Open seasoncool opened 1 month ago

seasoncool commented 1 month ago

My environment seems to have installed successfully and I can open the chat website, but I'm unsure where to set the path for the large model I have already downloaded.

[screenshot]

SOSONAGI commented 1 month ago

You should run llama3_distributed.py (in the examples path) to run your prompt across those peers!
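Roughly, the model that script loads is selected by the path_or_hf_repo constant inside it, so the edit looks something like this (the model id below is just a placeholder, swap in whichever MLX model you want):

```python
# In examples/llama3_distributed.py: the model to load is selected by this
# constant. The id below is only a placeholder -- replace it with your own
# MLX model (a Hugging Face repo id or, as discussed below, a local directory).
path_or_hf_repo = "mlx-community/Meta-Llama-3-8B-Instruct-4bit"
```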

seasoncool commented 1 month ago

I appreciate your response. I configured the path to my local large model within the 'examples' directory, but when I started it, the log still pointed to the 'llama3' model.

[screenshot]

SOSONAGI commented 1 month ago

exo currently supports the MLX and tinygrad formats for inference across multiple peer GPUs. Your chosen model does not seem to be in MLX or tinygrad format. You should try a model in MLX or tinygrad format for inference!
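A quick way to sanity-check that a download really is an MLX-format model, independent of exo, is to try loading it with the mlx-lm package directly (assuming you have mlx-lm installed; the model id below is just an example):

```python
# Standalone check with mlx-lm (not exo): if this loads and generates text,
# the repo/directory is a usable MLX-format model.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Meta-Llama-3-8B-Instruct-4bit")  # or a local path
print(generate(model, tokenizer, prompt="Hello", max_tokens=20))
```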

seasoncool commented 1 month ago

I downloaded this LLM from Hugging Face, and it should be in MLX format.

[screenshot]

I've also set the model path based on the error messages, but it still isn't working.

[screenshot]

Output: [screenshot]

SOSONAGI commented 1 month ago

Here is my llama3_distributed.py code

```python
# Excerpt from my llama3_distributed.py (imports omitted).
logging.basicConfig(level=logging.DEBUG)

# Map each model id to its Shard (the layer range this peer is responsible for).
models = {
    "sosoai/hansoldeco-llama3-8b-instruct-v0.1-mlx": Shard(
        model_id="sosoai/hansoldeco-llama3-8b-instruct-v0.1-mlx",
        start_layer=0, end_layer=0, n_layers=32,
    ),
    "mlx-community/Meta-Llama-3-70B-Instruct-4bit": Shard(
        model_id="mlx-community/Meta-Llama-3-70B-Instruct-4bit",
        start_layer=0, end_layer=0, n_layers=80,
    ),
}

# Resolve the model path (a Hugging Face repo id here) and load its tokenizer.
path_or_hf_repo = "sosoai/hansoldeco-llama3-8b-instruct-v0.1-mlx"
model_path = get_model_path(path_or_hf_repo)
tokenizer_config = {}
tokenizer = load_tokenizer(model_path, tokenizer_config)
```
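If you want to use a model you've already downloaded rather than a Hugging Face repo id, my understanding is that get_model_path also accepts a local directory, so something like this should work (the path below is just an example):

```python
# Sketch: point the example at a locally downloaded MLX model instead of a
# Hugging Face repo id. The directory should contain the usual MLX files
# (config.json, *.safetensors weights, tokenizer files). Path is hypothetical.
path_or_hf_repo = "/path/to/Meta-Llama-3-8B-Instruct-4bit-mlx"
model_path = get_model_path(path_or_hf_repo)
tokenizer = load_tokenizer(model_path, {})
```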

I've also used my own MLX model this way, and there was no problem running inference with other peers.

Can you check the llama3_distributed.py file path? I moved this file into the main exo project directory and ran python3 llama3_distributed.py.

Hope this will help!