b4rtaz / distributed-llama
Tensor parallelism is all you need. Run LLMs on weak devices or make powerful devices even more powerful by distributing the workload and dividing the RAM usage.
MIT License · 1.02k stars · 68 forks
source link
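The project's tagline describes tensor parallelism: each layer's weight matrices are sliced across devices, so every node holds only a fraction of the parameters (dividing RAM usage) and computes only its slice of each matrix multiply (distributing the workload). A minimal NumPy sketch of the idea follows; the function names and the column-wise split are illustrative assumptions, not the project's actual implementation:

```python
import numpy as np

def split_columns(W, n_slices):
    # Each worker stores only 1/n_slices of the weight matrix;
    # this is what divides the RAM usage across devices.
    return np.array_split(W, n_slices, axis=1)

def parallel_matmul(x, slices):
    # Every worker multiplies the shared input by its own slice;
    # concatenating the partial outputs recovers the full result.
    return np.concatenate([x @ W_i for W_i in slices], axis=-1)

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 8))     # activations, broadcast to all workers
W = rng.standard_normal((8, 16))    # full weight matrix

slices = split_columns(W, 4)        # 4 hypothetical devices, 4 columns each
out = parallel_matmul(x, slices)
assert np.allclose(out, x @ W)      # matches the single-device matmul
```

In a real deployment the slices live on separate machines and only the (much smaller) activations cross the network, which is why synchronization cost, not compute, tends to dominate as node count grows.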
Issues (sorted by newest)
#45 · JSONDecodeError("Expecting value", s, err.value) from None · unclemusclez · opened 1 month ago · 10 comments
#44 · feat: avg tokens / second. · b4rtaz · closed 1 month ago · 0 comments
#43 · fix: support max kv cache length. · b4rtaz · closed 1 month ago · 0 comments
#42 · feat: support for any number of threads. · b4rtaz · closed 1 month ago · 0 comments
#40 · Unknown header keys while converting Llama 3 70B to distributed format · DifferentialityDevelopment · opened 1 month ago · 1 comment
#39 · Fleshing out API mode · DifferentialityDevelopment · closed 1 month ago · 13 comments
#38 · rope slice. · b4rtaz · closed 1 month ago · 0 comments
#37 · sync pos. · b4rtaz · closed 2 months ago · 0 comments
#36 · revert qkv. · b4rtaz · closed 2 months ago · 0 comments
#35 · Will this awesome project consider supporting GPU acceleration? · galenyu · opened 2 months ago · 1 comment
#32 · sync qkv. · b4rtaz · closed 2 months ago · 0 comments
#31 · funcs-test. · b4rtaz · closed 2 months ago · 0 comments
#30 · Support for Hugging Face models · hyperbolic-c · closed 1 month ago · 10 comments
#29 · [Feature Suggestion] Tensor Parallelism for Accelerating LLM · zhengpeirong · opened 2 months ago · 22 comments
#28 · llamafile sgemm. · b4rtaz · closed 2 months ago · 0 comments
#26 · Assertion `d % nSlices == 0' failed. · joelewing · closed 2 months ago · 2 comments
#25 · Compiling error related to include of <ctime> · joelewing · closed 2 months ago · 1 comment
#24 · arch builder. · b4rtaz · closed 2 months ago · 0 comments
#22 · mixtral 8x22B support. · b4rtaz · closed 2 months ago · 0 comments
#21 · Need help setting up all the devices · MarcuXu · opened 2 months ago · 0 comments
#20 · Why does the synchronization time suddenly increase going from 4 Pis to 8 Pis? · yuezhan0721 · opened 2 months ago · 15 comments
#19 · How about multi-core support for stand-alone dual-socket motherboards? · win10ogod · opened 3 months ago · 4 comments
#18 · grok-1 support. · b4rtaz · closed 2 months ago · 1 comment
#16 · Can I use an Ollama model? · liyimeng · closed 3 months ago · 1 comment
#15 · WebAssembly version · pathquester · closed 3 months ago · 1 comment
#13 · Fix typo · VIS-WA · closed 2 months ago · 1 comment
#11 · Add distributed-llama Docker container test · weedge · opened 4 months ago · 1 comment
#10 · Turing RK1 compute module results · segabor · closed 3 months ago · 4 comments
#8 · Master process crashes, running out of memory on an 8 GB RPi 5 · segabor · closed 5 months ago · 15 comments
#4 · converter.py OOM while converting llama-2-7b weights on my Raspberry Pi 5 · segabor · closed 5 months ago · 2 comments
#2 · feat: avx2 · b4rtaz · closed 5 months ago · 0 comments