mit-han-lab/TinyChatEngine
TinyChatEngine: On-Device LLM Inference Library
https://mit-han-lab.github.io/TinyChatEngine/
MIT License
634 stars, 59 forks
Add llama2 and clean up codebase #17
Closed (meenchen closed 11 months ago)
meenchen commented 11 months ago
This PR includes the following changes:
Add Llama 2 support.
Clean up the model download script and docs.
Refine the generate function.
Fix capitalization in file names.
Remove unused files and docs.
Add the token bin and vocab JSON by default.
Improve the performance of the bmm op to reduce latency on long sequences.
Remove the metal-cpp source from the codebase.
Move source and header files for nn modules into separate directories.
Clean up the Metal kernels.
Reorganize the kernel code structure.