lyogavin airllm issues - Githubissues

lyogavin / airllm

AirLLM 70B inference with single 4GB GPU

Apache License 2.0

5.28k stars, 423 forks

issues

Is AirLLM faster than llama.cpp?

#206 Lizonghang opened 10 hours ago
0
docker based or BareMetal serving

#205 dhandhalyabhavik opened 1 week ago
0
Quantization Not Working as Expected

#204 sdt03 opened 1 week ago
0
not support prefetching for compression for now. loading with no prepetching mode.

#203 gokulcoder7 opened 1 week ago
0
try setting attn impl to sdpa...

#202 gokulcoder7 opened 1 week ago
0
how to add support for bolt.new-any-llm

#201 rahulmr opened 1 week ago
0
No module named 'sentencepiece' when following install instructions

#200 drdozer opened 2 weeks ago
1
Integration with ollama server

#199 drdozer opened 2 weeks ago
0
unsloth/Llama-3.1-Nemotron-70B-Instruct-bnb-4bit

#198 werruww opened 2 weeks ago
0
AttributeError: 'dict' object has no attribute 'T' (Mac)

#197 shakedzy opened 2 weeks ago
0
support for https://huggingface.co/nvidia/Nemotron-4-340B-Instruct ?

#196 mahald opened 3 weeks ago
2
Update README.md

#195 moresearch opened 3 weeks ago
0
Is there any practical use case for this project?

#194 Greatz08 opened 3 weeks ago
2
Are multi-gpu supported?

#193 wedobetter opened 3 weeks ago
0
it is run

#192 werruww opened 3 weeks ago
11
errors

#191 werruww opened 3 weeks ago
1
No supported model list

#190 rudiservo opened 4 weeks ago
0
Support for Vision and Language models

#188 versae opened 1 month ago
0
airllm/utils.py:302 list index out of range

#187 fvisconti opened 1 month ago
1
Taking about 40 minutes to generate one sentence. Is this speed normal?

#186 kingdoom1 opened 1 month ago
2
Issue `model.safetensors.index.json should exist` with loading model in safetensors format

#185 LeMoussel opened 1 month ago
3
docs: update README.md

#184 eltociear closed 1 month ago
0
How to alter the default save path of the downloaded LLM?

#183 fengnex opened 1 month ago
1
B70 need

#182 ayttop opened 2 months ago
0
How to set system prompt

#181 OKHand-Zy opened 2 months ago
1
unsloth/Meta-Llama-3.1-405B-Instruct-bnb-4bit

#180 kendiyang opened 2 months ago
2
delete_original

#179 ayttop opened 2 months ago
4
RuntimeError: shape '[1, 5, 8, 128]' is invalid for input of size 10240 LLama 405B 4-bit on Layer 1

#178 TitleOS opened 2 months ago
3
Compression does not work with MLX / Apple Silicon

#177 sammcj opened 2 months ago
0
Fix pip not found when install in Jupyter

#176 chinkan closed 2 months ago
0
CUDA Out of memory RTX 4060TI 16G

#175 1272870698 opened 2 months ago
0
Fixing mlx model load

#174 Razikus closed 2 months ago
1
added delete_original support for single modelfiles

#173 NavodPeiris closed 3 months ago
0
RuntimeError: shape '[1, 13, 8, 128]' is invalid for input of size 26624

#172 zhuojun1024 opened 3 months ago
6
#169: fixed error when running on cpu and added post install command to upgrade transformers

#170 NavodPeiris closed 3 months ago
0
Error when running on CPU device and rope_scaling error when using old version of transformers

#169 NavodPeiris closed 3 months ago
1
mlx Linear weight arrays were loaded with a dict of arrays

#168 shiwanlin closed 3 months ago
2
mlx embedding indexing failure - ValueError: Cannot index mlx array using the given type.

#167 shiwanlin closed 3 months ago
2
how to increase speed of inference

#166 Tdrinker opened 3 months ago
1
Position Embedding with Seq > 512

#165 Codys12 opened 3 months ago
1
Data Parallel across multiple GPUs?

#164 Codys12 opened 3 months ago
0
name 'dynamically_import_QuantLinear' is not defined

#163 gyyixr opened 3 months ago
1
layer_name is not defined before use

#162 yjleo17 opened 3 months ago
4
Circular import error in importing partially initialised module airllm

#161 samarthpusalkar closed 3 months ago
1
AssertionError: model.safetensors.index.json should exist

#160 huangyifu opened 3 months ago
0
I can’t run llama-3.1-405B-Instruct-bnb-4bit because of a ValueError: rope_scaling must be a dictionary with two fields.

#159 LCG22 opened 3 months ago
1
can not run llama 3.1 405B

#158 taozhiyuai opened 3 months ago
2
docs: add Japanese README

#157 eltociear closed 3 months ago
0
AttributeError: 'AirLLMLlama2' object has no attribute '_supports_cache_class'

#156 Source61 opened 4 months ago
2
Ramdisk

#155 HennethAnnun opened 4 months ago
0