b4rtaz / distributed-llama
Tensor parallelism is all you need. Run LLMs on weak devices or make powerful devices even more powerful by distributing the workload and dividing the RAM usage.
MIT License · 1.02k stars · 68 forks
Issues
#96 · [New Feature] Add new route for dllama api for embedding models · testing0mon21 · opened 1 day ago · 1 comment
#95 · refactor. · b4rtaz · closed 3 days ago · 0 comments
#94 · Support for GGUF files? · ravor-org · opened 4 days ago · 1 comment
#93 · Hugging Face models without tokenizer.model file · EntusiastaIApy · closed 3 days ago · 2 comments
#92 · Exception: max_seq_len is required, please update params.json with convert-llama.py on Meta-Llama-3-8B-Instruct · unclemusclez · closed 4 days ago · 1 comment
#91 · feat: vulkan. · b4rtaz · closed 3 days ago · 2 comments
#90 · feat: accelerator structure. · b4rtaz · closed 2 weeks ago · 0 comments
#89 · What about mobile phones? · dcale · opened 2 weeks ago · 4 comments
#88 · fix: windows wsa startup. · b4rtaz · closed 4 weeks ago · 0 comments
#87 · what(): Cannot create socket · Slaghton · opened 4 weeks ago · 1 comment
#86 · dllama-api invokes "what(): Invalid tokenizer file" · unclemusclez · closed 1 month ago · 2 comments
#85 · feat: update readme, add model. · b4rtaz · closed 1 month ago · 0 comments
#84 · feat: optional weights float type argument. · b4rtaz · closed 1 month ago · 0 comments
#83 · feat: tokenizer v1. · b4rtaz · closed 1 month ago · 0 comments
#82 · dllama-api hosted on 127.0.0.1 · unclemusclez · opened 1 month ago · 2 comments
#81 · float-type f32 will not start · unclemusclez · opened 1 month ago · 2 comments
#80 · master and worker started but with problems · fabgat · opened 1 month ago · 8 comments
#79 · support multi nvidia jetson agx orin? · WangFengtu1996 · opened 1 month ago · 3 comments
#78 · convert into .bin · fabgat · closed 1 month ago · 2 comments
#77 · Request: Community Discord? · unclemusclez · closed 1 month ago · 1 comment
#76 · feat: add to tokenizer chat configuration. · b4rtaz · closed 1 month ago · 5 comments
#75 · feat: naive cache. · b4rtaz · closed 1 month ago · 0 comments
#74 · fix: windows fseek. · b4rtaz · closed 1 month ago · 0 comments
#73 · Add additional chat templates to dllama-api · DifferentialityDevelopment · closed 1 month ago · 8 comments
#72 · chore: refactor http request a bit. · b4rtaz · closed 1 month ago · 0 comments
#71 · [Feature Suggest] Config File alternative to Command Line Arguments · DifferentialityDevelopment · closed 1 month ago · 2 comments
#70 · Support nSlices > nKvHeads · b4rtaz · opened 1 month ago · 0 comments
#69 · [Feature Suggest] From All-Reduce to Ring-All-Reduce · zhengpeirong · opened 1 month ago · 1 comment
#68 · Support for other models (ollama models) · testing0mon21 · opened 1 month ago · 3 comments
#67 · [Setup] Multiple Apple Silicon Macs: Questions · s04 · opened 1 month ago · 1 comment
#66 · chore: dllama-api tiny clean up. · b4rtaz · closed 1 month ago · 0 comments
#65 · fix: chunked stream, close stream without econnreset. · b4rtaz · closed 1 month ago · 0 comments
#64 · feat: speed up synchronization of mlp. · b4rtaz · closed 1 month ago · 1 comment
#63 · feat: windows support · DifferentialityDevelopment · closed 1 month ago · 20 comments
#62 · feat: convert-hf.py · b4rtaz · closed 1 month ago · 0 comments
#61 · fix: use non-blocking sockets. · b4rtaz · closed 1 month ago · 0 comments
#59 · (Crashing on Low Memory SBC) main invoked oom-killer: gfp_mask=0x1100dca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0 · unclemusclez · closed 1 month ago · 51 comments
#58 · network utilization · zhengpeirong · opened 1 month ago · 3 comments
#57 · feat: use avx2 to speedup dotProduct · b4rtaz · closed 1 month ago · 0 comments
#56 · feat: use avx2 to speedup matmulF32 · b4rtaz · closed 1 month ago · 0 comments
#55 · How To Add Supported Model · hyperbolic-c · opened 1 month ago · 2 comments
#54 · Use AVX2 to speedup matmulQ40 · DifferentialityDevelopment · closed 1 month ago · 3 comments
#53 · Use AVX2 to speedup matmulQ40 · DifferentialityDevelopment · closed 1 month ago · 2 comments
#52 · Add safe tensor support to convert-llama.py · DifferentialityDevelopment · closed 1 month ago · 10 comments
#51 · fix: convert-llama.py supports different max_seq_len. · b4rtaz · closed 1 month ago · 0 comments
#50 · Vulkan Acceleration · DifferentialityDevelopment · opened 1 month ago · 35 comments
#49 · chore: update macbeth.sh · eltociear · closed 1 month ago · 2 comments
#48 · terminate called after throwing an instance of 'ReadSocketException' · unclemusclez · opened 1 month ago · 35 comments
#47 · API Server · DifferentialityDevelopment · closed 1 month ago · 3 comments
#46 · feat: splitting multihead attention into all nodes. · b4rtaz · closed 1 month ago · 5 comments