Tensor parallelism is all you need. Run LLMs on an AI cluster at home using any device. Distribute the workload, divide RAM usage, and increase inference speed.
[Feature Suggest] Support for AVX instruction set #107
I have a couple of VMs, but they only support the plain AVX instruction set, not AVX2. Is this project compatible with AVX?
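For anyone in a similar situation, a minimal sketch (not part of this project) for checking which SIMD feature levels a VM actually exposes to the guest, using the GCC/Clang `__builtin_cpu_supports` built-ins; the output simply tells you whether plain AVX or AVX2 is available:

```cpp
// cpu_features.cpp - illustrative check of AVX/AVX2 availability on the host or VM.
// Build: g++ -O2 cpu_features.cpp -o cpu_features   (GCC or Clang on x86-64)
#include <cstdio>

int main() {
    __builtin_cpu_init();  // initialize the compiler's CPU feature detection
    std::printf("AVX:  %s\n", __builtin_cpu_supports("avx")  ? "yes" : "no");
    std::printf("AVX2: %s\n", __builtin_cpu_supports("avx2") ? "yes" : "no");
    return 0;
}
```

On Linux you can get the same information without compiling anything by looking for the `avx` and `avx2` flags in `/proc/cpuinfo`.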