facebookresearch / chai

CHAI is a library for dynamic pruning of attention heads for efficient LLM inference.
GNU General Public License v3.0
9 stars, 0 forks