Closed fabianlim closed 1 month ago
Completing more items in #25.
Verified that we can reproduce the roughly 20% speedups using fused-ops and kernels
Verified that we can reproduce the 75% memory reduction using 4-bit base weights
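As a rough sanity check on the 75% figure, here is a minimal sketch of the arithmetic. It assumes the baseline stores base weights in 16 bits, which this PR does not state, and it ignores quantization metadata, activations, and optimizer state. The function names and the parameter count are hypothetical and used only for illustration.

```python
def weight_memory_bytes(num_params: int, bits_per_weight: int) -> float:
    """Bytes needed to store num_params weights at the given bit width."""
    return num_params * bits_per_weight / 8


def memory_reduction(baseline_bits: int, quantized_bits: int) -> float:
    """Fraction of weight memory saved by moving to quantized_bits."""
    return 1 - quantized_bits / baseline_bits


# Hypothetical model size, chosen only to make the numbers concrete.
num_params = 7_000_000_000

# Assumption: 16-bit baseline versus 4-bit base weights.
baseline = weight_memory_bytes(num_params, 16)
quantized = weight_memory_bytes(num_params, 4)

print(f"baseline:  {baseline / 1e9:.1f} GB")
print(f"quantized: {quantized / 1e9:.1f} GB")
print(f"reduction: {memory_reduction(16, 4):.0%}")
```

Under these assumptions the weight storage alone shrinks by three quarters; measured end-to-end savings can differ because the other memory consumers are not quantized.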
Running a set of benchmarks now. Will merge once they complete.
@achew010 pls update once you have obtained the new benchmarks.