ikawrakow closed 1 month ago
This PR adds the ability to use Q4_0, Q4_1 and Q8_0 for the kv-cache.
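For readers unfamiliar with these types, here is a rough sketch of the idea behind Q8_0-style block quantization, the simplest of the three. It is not code from this PR: it assumes the usual ggml layout of 32 values per block sharing one scale, keeps the scale as a plain Python float (the real format stores it in half precision), and the function names are hypothetical.

```python
BLOCK_SIZE = 32  # assumed block size for Q8_0-style quantization


def quantize_q8_0_block(values):
    """Quantize one block of floats to (scale, list of int8-range quants).

    Illustrative only: the scale stays a Python float here, whereas the
    real on-disk/in-memory format stores it as a 16-bit float.
    """
    if len(values) != BLOCK_SIZE:
        raise ValueError(f"expected {BLOCK_SIZE} values, got {len(values)}")
    amax = max(abs(v) for v in values)
    scale = amax / 127.0
    inv_scale = 1.0 / scale if scale else 0.0
    # Clamp to guard against tiny floating-point overshoot past +/-127.
    quants = [max(-127, min(127, round(v * inv_scale))) for v in values]
    return scale, quants


def dequantize_q8_0_block(scale, quants):
    """Reconstruct approximate floats from a quantized block."""
    return [scale * q for q in quants]


if __name__ == "__main__":
    block = [float(i - 16) for i in range(BLOCK_SIZE)]
    scale, quants = quantize_q8_0_block(block)
    restored = dequantize_q8_0_block(scale, quants)
    worst = max(abs(a - b) for a, b in zip(block, restored))
    print(f"scale={scale:.6f} worst abs error={worst:.6f}")
```

Applied to the kv-cache, the same idea trades a small reconstruction error per cached key/value entry for a much smaller memory footprint than 16-bit storage.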