b4rtaz / distributed-llama

Tensor parallelism is all you need. Run LLMs on an AI cluster at home using any device. Distribute the workload, divide RAM usage, and increase inference speed.
MIT License

Segmentation fault #127

Open YueZhan721 opened 1 month ago

YueZhan721 commented 1 month ago

Hello, thank you very much for your work. I am hitting a segmentation fault, as shown in the attached screenshot. My steps were: 1. fine-tune the base Llama 2 model with Unsloth and save the weights in .safetensors format; 2. use the convert-hf.py script to convert them to the .m and .t formats; 3. run the result on a single Raspberry Pi. Hope to hear from you, thanks again.

b4rtaz commented 1 month ago

Hello @YueZhan721!

Could you run this model on a more powerful computer? How much RAM does your Raspberry Pi have?

YueZhan721 commented 1 month ago

Hello @b4rtaz. I can run it on Google Colab. The Raspberry Pi I used is a Pi 5 with 8 GB of RAM, and the .m weight file is about 2.3 GB. I'm wondering if I did something wrong in the quantization or conversion process.

YueZhan721 commented 1 month ago

By the way, I found that many Llama 3 models have no tokenizer.model file. How can I convert it from .safetensors to .m? Thank you very much for your reply.

b4rtaz commented 1 month ago

Llama 2 7B should be approximately 3.95 GB after quantization to Q40. If your file is only 2.3 GB, something likely went wrong during conversion.

Could you run any model on your Raspberry Pi (for example `python launch.py tinyllama_1_1b_3t_q40`)?
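For a rough sanity check of the expected file size: assuming Q40 uses a block layout like llama.cpp's Q4_0 (32 four-bit weights plus one fp16 scale per block, i.e. 18 bytes per 32 weights — an assumption, since distributed-llama's exact header/metadata overhead isn't accounted for here), a ~7B-parameter model should come out near 4 GB, not 2.3 GB:

```python
# Sketch: approximate on-disk size of a Q40-quantized model.
# Assumption: Q40 packs weights in blocks of 32 x 4-bit values plus one
# fp16 scale per block (18 bytes per 32 weights), similar to Q4_0 in llama.cpp.

def q40_size_bytes(n_params: int) -> int:
    """Approximate size of n_params weights quantized to Q40 (weights only)."""
    bytes_per_block = 32 * 4 // 8 + 2  # 16 packed data bytes + 2-byte scale
    return n_params // 32 * bytes_per_block

llama2_7b_params = 7_000_000_000  # nominal parameter count, approximate
size_gb = q40_size_bytes(llama2_7b_params) / 1e9
print(f"expected size: ~{size_gb:.2f} GB")  # ~3.94 GB, close to the 3.95 GB cited
```

A 2.3 GB file is far below that estimate, which points to a truncated or partial conversion rather than a RAM problem.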

b4rtaz commented 1 month ago

> how can I convert it from .safetensors to .m? Thank you very much for your reply.

Check `convert-tokenizer-hf.py`. It can convert some HF models.

YueZhan721 commented 1 month ago

> Llama 2 7B should be approximately 3.95 GB after quantization to Q40. If it's only 2.3 GB, something might be wrong.
>
> Could you run any model on your Raspberry Pi (for example `python launch.py tinyllama_1_1b_3t_q40`)?

Yes, I can run the .m models you provided, so there must be something wrong in how I save and convert the weight files. Thanks sincerely! Great work, and very helpful to me.