b4rtaz / distributed-llama
Tensor parallelism is all you need. Run LLMs on weak devices or make powerful devices even more powerful by distributing the workload and dividing the RAM usage.
MIT License · 1.03k stars · 69 forks
feat: avg tokens / second. #44
Closed — b4rtaz closed 1 month ago
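The issue title suggests the feature adds an average tokens-per-second readout during generation. A minimal sketch of such a metric is below; the function name, the timestamp-list interface, and the sample values are assumptions for illustration, not taken from the distributed-llama codebase.

```python
def average_tokens_per_second(token_timestamps):
    """Average generation throughput: tokens emitted divided by elapsed wall time.

    token_timestamps: monotonically increasing times (seconds), one per emitted token.
    Hypothetical interface — not the repository's actual API.
    """
    if len(token_timestamps) < 2:
        return 0.0
    elapsed = token_timestamps[-1] - token_timestamps[0]
    # Tokens produced after the first timestamp, over the elapsed interval.
    return (len(token_timestamps) - 1) / elapsed

# Hypothetical run: 5 tokens emitted at 0.5 s intervals.
stamps = [0.0, 0.5, 1.0, 1.5, 2.0]
print(average_tokens_per_second(stamps))  # → 2.0
```

In practice a runtime would record a timestamp as each token is decoded and report this average alongside the generated text.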