lyogavin/airllm
AirLLM 70B inference with single 4GB GPU
Apache License 2.0
4.01k stars
332 forks
issues
How to set system prompt
#181
OKHand-Zy
opened
6 days ago
1
unsloth/Meta-Llama-3.1-405B-Instruct-bnb-4bit
#180
kendiyang
opened
1 week ago
2
delete_original
#179
ayttop
opened
1 week ago
4
RuntimeError: shape '[1, 5, 8, 128]' is invalid for input of size 10240 LLama 405B 4-bit on Layer 1
#178
TitleOS
opened
1 week ago
3
Compression does not work with MLX / Apple Silicon
#177
sammcj
opened
1 week ago
0
Fix pip not found when install in Jupyter
#176
chinkan
closed
1 week ago
0
CUDA Out of memory RTX 4060TI 16G
#175
1272870698
opened
2 weeks ago
0
Fixing mlx model load
#174
Razikus
closed
2 weeks ago
1
added delete_original support for single modelfiles
#173
NavodPeiris
closed
2 weeks ago
0
RuntimeError: shape '[1, 13, 8, 128]' is invalid for input of size 26624
#172
zhuojun1024
opened
3 weeks ago
6
#169: fixed error when running on cpu and added post install command to upgrade transformers
#170
NavodPeiris
closed
3 weeks ago
0
Error when running on CPU device and rope_scaling error when using old version of transformers
#169
NavodPeiris
closed
2 weeks ago
1
mlx Linear weight arrays were loaded with a dict of arrays
#168
shiwanlin
closed
3 weeks ago
1
mlx embedding indexing failure - ValueError: Cannot index mlx array using the given type.
#167
shiwanlin
closed
1 month ago
2
how to increase speed of inference
#166
Tdrinker
opened
1 month ago
1
Position Embedding with Seq > 512
#165
Codys12
opened
1 month ago
1
Data Parallel across multiple GPUs?
#164
Codys12
opened
1 month ago
0
name 'dynamically_import_QuantLinear' is not defined
#163
gyyixr
opened
1 month ago
1
layer_name is not defined before use
#162
yjleo17
opened
1 month ago
4
Circular import error in importing partially initialised module airllm
#161
samarthpusalkar
closed
1 month ago
1
AssertionError: model.safetensors.index.json should exist
#160
huangyifu
opened
1 month ago
0
I can’t run llama-3.1-405B-Instruct-bnb-4bit because of a ValueError: rope_scaling must be a dictionary with two fields.
#159
LCG22
opened
1 month ago
1
can not run llama 3.1 405B
#158
taozhiyuai
opened
1 month ago
2
docs: add Japanese README
#157
eltociear
closed
1 month ago
0
AttributeError: 'AirLLMLlama2' object has no attribute '_supports_cache_class'
#156
Source61
opened
1 month ago
2
Ramdisk
#155
HennethAnnun
opened
1 month ago
0
how to use Qwen2-72B-Instruct
#154
shenhai-ran
opened
1 month ago
2
AssertionError: Torch not compiled with CUDA enabled
#153
smartdawg
opened
2 months ago
1
Some grammar suggested fixes in README.md
#152
TheTechOddBug
closed
1 month ago
0
No english readme for rlhf
#151
drawnwren
opened
2 months ago
0
How?
#150
nonetrix
closed
2 months ago
1
AttributeError: 'list' object has no attribute 'absmax' when I load Qwen-72B-Chat with 8-bit compression with AirLLMQWen
#149
Yang-bug-star
opened
2 months ago
0
I want to use in-context learning in qwen1.5-72b-chat inference and thus use tokenizer.apply_chat_template as in the official tutorial, however ValueError: max() arg. Doesn't airllm support the official inference way ?
#148
Yang-bug-star
opened
2 months ago
0
I want to use in-context learning in qwen1.5-72b-chat inference and thus use tokenizer.apply_chat_template as in the official tutorial, however ValueError: max() arg is an empty sequence
#147
Yang-bug-star
closed
2 months ago
0
Add support for Mistral model inference
#146
kunling-cxk
opened
2 months ago
0
ImportError: cannot import name 'AutoModel' from partially initialized module 'airllm' (most likely due to a circular import)
#145
leobilocastro
closed
3 months ago
0
Linear(in_features=28672, out_features=8192, bias=False) does not have a parameter or a buffer named qweight.
#144
luzacao
opened
3 months ago
0
WeChat QR Code out of date
#143
zixianwang2022
opened
3 months ago
0
air_llm: README fix MacOS typo
#142
hiemal
closed
3 months ago
0
safetensors_rust.SafetensorError: Error while deserializing header: MetadataIncompleteBuffer
#137
chuangzhidan
opened
4 months ago
0
Insufficient disk space
#136
ulisesbussi
opened
4 months ago
3
CPU ram offload
#135
NicolasMejiaPetit
opened
4 months ago
0
error in apple mac m3
#134
mustangs0786
opened
4 months ago
5
Does airllm support quantized gguf/gptq/awq models ?
#133
robik72
opened
4 months ago
0
COMPILED_WITH_CUDA error requires libcuda.so
#132
nickums
opened
4 months ago
0
Error with Llama3: ValueError: Trying to set a tensor of shape torch.Size([1024, 8192]) in "weight" (which has shape torch.Size([8192, 8192])), this look incorrect.
#131
Cangshanqingshi
closed
4 months ago
0
Cannot get chatglm3 to run; please advise.
#130
ZiQiangXie
opened
4 months ago
2
segmentation fault python3 airllm2.py
#129
taozhiyuai
opened
4 months ago
3
to run llama3-70b,but fail to import. why?
#128
taozhiyuai
closed
4 months ago
0
Any CoreML implementation plans?
#127
Proryanator
opened
4 months ago
0