b4rtaz / distributed-llama

Tensor parallelism is all you need. Run LLMs on weak devices or make powerful devices even more powerful by distributing the workload and dividing the RAM usage.
MIT License

chore: update macbeth.sh #49

Closed eltociear closed 1 month ago

eltociear commented 1 month ago

calender -> calendar

b4rtaz commented 1 month ago

Weird... which tokenizer are you using?

b4rtaz commented 1 month ago

I downloaded the tokenizer and the model again from here and ran the macbeth.sh script again. The result is correct on a MacBook Pro M1.

I'm closing this PR for now.