mit-han-lab / TinyChatEngine

TinyChatEngine: On-Device LLM Inference Library

https://mit-han-lab.github.io/TinyChatEngine/

MIT License

634 stars, 59 forks

Add llama2 and clean up codebase #17

Closed: meenchen closed this 11 months ago

meenchen commented 11 months ago

This PR includes the following changes:

- Add llama2 support.
- Clean up the model download script and docs.
- Refine the generate function.
- Fix the capitalization of file names.
- Remove unused files and docs.
- Add the token bin and vocab JSON by default.
- Improve the performance of the bmm op to reduce latency on long sequences.
- Remove the metal-cpp source from the codebase.
- Move source and header files related to nn modules into separate directories.
- Clean up the Metal kernels.
- Reorganize the kernel code structure.
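
For context on the bmm item: a batched matrix multiply (bmm) multiplies one pair of matrices per batch entry, and in attention its cost grows with the sequence length, which is why it matters for long sequences. The PR description does not say which optimization was applied, so the following is only a minimal reference sketch, not the code from this PR. The function name `bmm_naive` and the row-major `[batch, m, k]` layout are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only; not TinyChatEngine's API.
// Computes C[b] = A[b] * B[b] for each batch entry b.
// Assumed layouts (row-major): A is [batch, m, k], B is [batch, k, n],
// and the returned C is [batch, m, n].
std::vector<float> bmm_naive(const std::vector<float>& a,
                             const std::vector<float>& b,
                             std::size_t batch, std::size_t m,
                             std::size_t k, std::size_t n) {
    std::vector<float> c(batch * m * n, 0.0f);
    for (std::size_t bi = 0; bi < batch; ++bi) {
        const float* pa = a.data() + bi * m * k;
        const float* pb = b.data() + bi * k * n;
        float* pc = c.data() + bi * m * n;
        // The i-p-j loop order keeps the innermost accesses to B and C
        // contiguous in memory.
        for (std::size_t i = 0; i < m; ++i) {
            for (std::size_t p = 0; p < k; ++p) {
                const float av = pa[i * k + p];
                for (std::size_t j = 0; j < n; ++j) {
                    pc[i * n + j] += av * pb[p * n + j];
                }
            }
        }
    }
    return c;
}
```

A production kernel would typically add techniques such as tiling, SIMD, or multithreading on top of a loop nest like this one; which of those the PR uses is not stated here.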