foundation-model-stack / fms-acceleration

🚀 Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models.
Apache License 2.0

Add MLP & QLoRA Fused Ops and Kernels, Mixtral #29

Closed fabianlim closed 1 month ago

fabianlim commented 1 month ago

Completing more items in #25.

Verified that we can reproduce the roughly 20% speedup using fused ops and kernels.

Verified that we can reproduce the 75% memory reduction using 4-bit base weights.
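
For reference, the 75% figure matches simple storage arithmetic if the baseline keeps base weights in 16-bit precision. That baseline is an assumption, since the dtype is not stated here. The sketch below is illustrative only: the helper names and the 7B parameter count are hypothetical, and it ignores quantization metadata (scales, zero points) as well as activations, optimizer state, and adapter weights.

```python
def weight_bytes(num_params: int, bits_per_param: int) -> float:
    """Bytes needed to store `num_params` weights at the given bit width."""
    return num_params * bits_per_param / 8


def memory_reduction(baseline_bits: int, quantized_bits: int) -> float:
    """Fractional saving in weight storage when moving to fewer bits."""
    return 1 - quantized_bits / baseline_bits


if __name__ == "__main__":
    num_params = 7_000_000_000  # hypothetical model size, for illustration only

    baseline = weight_bytes(num_params, 16)  # assumed 16-bit baseline
    quantized = weight_bytes(num_params, 4)  # 4-bit base weights

    print(f"16-bit weights: {baseline / 1e9:.1f} GB")
    print(f" 4-bit weights: {quantized / 1e9:.1f} GB")
    print(f"reduction:      {memory_reduction(16, 4):.0%}")
```

This covers base-weight storage only, so the end-to-end saving measured in a real run will depend on everything else that is resident in memory.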

fabianlim commented 1 month ago

Running a set of benchmarks now. Will merge once they complete.

fabianlim commented 3 weeks ago

@achew010 please update once you have the new benchmark results.