mit-han-lab / TinyChatEngine
TinyChatEngine: On-Device LLM Inference Library
https://mit-han-lab.github.io/TinyChatEngine/
MIT License
minor fix #59
Closed — RaymondWang0 closed 1 year ago

RaymondWang0 commented 1 year ago
Summary of Changes
- Added a new feature to support shared memory for decoder layers, which could reduce memory usage. It is still an experimental feature and needs further optimization.
- Removed `offset` to improve memory usage for the CPU backend
- Updated README
  - Added instructions for Python packages
  - Added instructions for RPi & CUDA support
  - Minor fixes
- Updated download_model.py
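The shared-memory change above can be illustrated with a minimal sketch. The idea is that, since decoder layers run sequentially, they can all borrow one scratch buffer sized for the largest layer instead of each layer owning its own activation buffer. All names below (`SharedScratch`, `per_layer_bytes`, `shared_bytes`) are hypothetical and not taken from TinyChatEngine's actual API:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: one scratch region reused by every decoder layer.
// Names and structure are illustrative only, not TinyChatEngine internals.
struct SharedScratch {
    std::vector<float> buf;
    // Allocate once, sized for the largest per-layer activation need.
    explicit SharedScratch(std::size_t max_elems) : buf(max_elems) {}
    float* data() { return buf.data(); }
};

// Total activation memory if every layer allocates its own buffer.
std::size_t per_layer_bytes(std::size_t n_layers, std::size_t elems_per_layer) {
    return n_layers * elems_per_layer * sizeof(float);
}

// Total activation memory when one shared buffer serves all layers.
std::size_t shared_bytes(std::size_t elems_per_layer) {
    return elems_per_layer * sizeof(float);
}
```

Under this sketch, a 32-layer model pays for one layer's worth of scratch activations rather than 32, which is where the memory saving comes from; the trade-off is that layers can no longer keep per-layer intermediates alive across the forward pass.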