mit-han-lab/TinyChatEngine
TinyChatEngine: On-Device LLM Inference Library
https://mit-han-lab.github.io/TinyChatEngine/
MIT License
622 stars, 57 forks
Issues (newest first)
#113 Fix VILA model version name for MacOS (arpitjain2811, opened 1 week ago, 0 comments)
#112 Fix CUDA implementation (RaymondWang0, closed 1 week ago, 0 comments)
#111 No EOS when prompt "exit"? (MoonBlvd, opened 1 week ago, 0 comments)
#110 Block size = 32 assertion fails (rukshankr, opened 2 weeks ago, 0 comments)
#109 Buffer overflow with Llama 3 8B (renepeinl, opened 3 weeks ago, 0 comments)
#108 Compilation error in Gelu (renepeinl, opened 3 weeks ago, 0 comments)
#107 Support Llama-3 and Mistral models (RaymondWang0, closed 1 month ago, 0 comments)
#106 No such file or directory during compilation (saeid93, opened 1 month ago, 1 comment)
#105 No such file or directory (Alpslee, opened 1 month ago, 2 comments)
#104 Support server chat mode (hyperbolic-c, opened 2 months ago, 0 comments)
#103 Fix bugs for ARM platforms (RaymondWang0, closed 2 months ago, 0 comments)
#102 Update models (RaymondWang0, closed 2 months ago, 0 comments)
#101 Unable to deploy models in Android (sqzhang-jeremy, closed 2 months ago, 2 comments)
#100 Upload model checkpoints on the Hugging Face Hub. (Vaibhavs10, opened 3 months ago, 1 comment)
#99 Assets for tests (julian-q, opened 3 months ago, 0 comments)
#98 Voice chat availability (Dudu014, opened 4 months ago, 0 comments)
#97 Update VILA and UI (RaymondWang0, closed 4 months ago, 0 comments)
#96 Jetson Nano Orin 8GB running out of memory on LLaMA2_7B_chat_awq_int4 (Dudu014, closed 4 months ago, 1 comment)
#95 make chat undefined reference to `LLaVAGenerate (cuu, opened 4 months ago, 1 comment)
#94 Support VILA (RaymondWang0, closed 4 months ago, 0 comments)
#93 MetalGPU branch not buildable? (CoryXie, closed 4 months ago, 1 comment)
#92 Windows CUDA Make chat problem (M0rtale, opened 4 months ago, 0 comments)
#91 Support VILA (RaymondWang0, closed 4 months ago, 0 comments)
#90 Error encountered during inference (plasm0r, opened 4 months ago, 0 comments)
#89 Support LLaVA (RaymondWang0, closed 5 months ago, 0 comments)
#88 Allocation of 'float inputs_embeds_buf[]' in Int4llamaDecoder::forward() causes Segmentation Fault for inputs longer than 511 tokens (paulleo13, opened 5 months ago, 0 comments)
#87 metal gpu matrix3D addition test (DerrickYLJ, opened 5 months ago, 2 comments)
#86 Converting an AWQ model to TinyChatEngine format example (ylhsieh, opened 5 months ago, 0 comments)
#85 problem while running make chat (Imran2708, opened 6 months ago, 3 comments)
#84 problem with - Loading model... Killed (ecliipt, opened 6 months ago, 0 comments)
#83 Error while running make chat. (s-swathib, opened 6 months ago, 1 comment)
#82 containerized as a Dockerfile (bhpayne, opened 6 months ago, 0 comments)
#81 fix matrix3d int type error for windows (xieqihui, closed 4 months ago, 0 comments)
#80 StarCoder model and AWQ file formats (167rgc911, closed 7 months ago, 1 comment)
#79 Quality of life fixes for GPU users and future development (Jiminator, closed 7 months ago, 0 comments)
#78 Support StarCoder on CPU (RaymondWang0, closed 7 months ago, 0 comments)
#77 Python extension for Metal kernels (casper-hansen, opened 7 months ago, 0 comments)
#76 Using CMakeLists to compile the code? (dt1729, opened 7 months ago, 0 comments)
#75 Installing nlohmann-json3-dev is required (dt1729, closed 7 months ago, 0 comments)
#74 CPU Optimization (RaymondWang0, closed 8 months ago, 0 comments)
#73 The program crashed when input long context on windows CPU (Laeglaur, opened 8 months ago, 1 comment)
#72 Make Error: Nvidia Jetson Orin using arch=compute_87,code=sm_87 (sumedhreddy90, closed 8 months ago, 1 comment)
#71 Assistant spitting out non-readable characters on RTX 4060 (zhefciad, opened 8 months ago, 1 comment)
#70 Support CodeLLaMA (RaymondWang0, closed 8 months ago, 0 comments)
#69 Create httpchat.cc (omjee, opened 8 months ago, 2 comments)
#68 Revised CUDA support (RaymondWang0, closed 8 months ago, 0 comments)
#67 Support new features (RaymondWang0, closed 8 months ago, 0 comments)
#66 Unable to maintain chat history and continuous chat (Rkyzzy, opened 9 months ago, 1 comment)
#65 Cleaned up output and support more models for voicechat (Jiminator, closed 9 months ago, 0 comments)
#64 LLaMA2_7B_chat_awq_int4.zip Empty File (tuobulatuo, closed 9 months ago, 2 comments)