b4rtaz / distributed-llama
Tensor parallelism is all you need. Run LLMs on an AI cluster at home using any device. Distribute the workload, divide RAM usage, and increase inference speed.
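The tagline's claim — distributing the workload and dividing RAM usage — comes down to sharding each weight matrix across devices so every node stores and multiplies only its slice. A minimal sketch of column-wise tensor parallelism for one linear layer (hypothetical helper names, not distributed-llama's actual API):

```python
# Sketch: column-parallel linear layer. Each node stores only a column
# slice of the weight matrix, so RAM per node is divided by the node count.

def matmul(x, w):
    # x: input vector of length d_in; w: d_in x d_out matrix (list of rows).
    d_out = len(w[0])
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(d_out)]

def split_columns(w, n_nodes):
    # Partition the output dimension: node k keeps columns [k*step, (k+1)*step).
    step = len(w[0]) // n_nodes
    return [[row[k * step:(k + 1) * step] for row in w] for k in range(n_nodes)]

def parallel_forward(x, shards):
    # Each shard is computed independently (in parallel on a real cluster);
    # concatenating the partial outputs reproduces the full result.
    out = []
    for shard in shards:
        out.extend(matmul(x, shard))
    return out
```

Because the shards touch disjoint output columns, no reduction is needed for this layer type; each node only needs the (small) input vector broadcast to it.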
MIT License · 1.4k stars · 94 forks
rope slice. #38
Closed · b4rtaz closed this 4 months ago
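The issue body is not shown, so the following is only an assumption about what "rope slice" concerns: when a head's dimensions are sliced across nodes under tensor parallelism, each node must apply RoPE with the correct global pair index, or its slice is rotated by the wrong frequencies. A sketch of offset-aware RoPE (hypothetical helpers, not distributed-llama's actual code):

```python
import math

def rope_freqs(head_dim, base=10000.0):
    # One inverse frequency per rotation pair (2i, 2i+1).
    return [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]

def apply_rope(vec, pos, freqs, pair_offset=0):
    # Rotate consecutive pairs of vec. pair_offset is the global pair index
    # of vec[0], so a node holding only a slice of the head still picks the
    # frequency that the full, unsliced head would use at that position.
    out = list(vec)
    for i in range(len(vec) // 2):
        theta = pos * freqs[pair_offset + i]
        c, s = math.cos(theta), math.sin(theta)
        a, b = vec[2 * i], vec[2 * i + 1]
        out[2 * i] = a * c - b * s
        out[2 * i + 1] = a * s + b * c
    return out
```

With the offset, applying RoPE to two half-slices on two nodes and concatenating matches applying it to the whole head on one node.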