microsoft/T-MAC
Low-bit LLM inference on CPU with lookup table
MIT License
588 stars, 44 forks
Merge latest llama.cpp with OpenMP for better multi-threading performance and more models such as qwen2.
#54
Closed by kaleid-liner 1 month ago