Open znsoftm opened 1 year ago
ggml.ai can quantize a model to int4/int8, which can speed up the model's inference.
https://github.com/ggerganov/ggml
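To illustrate what int8 quantization means here, below is a minimal sketch of symmetric (absmax) quantization in plain Python. It is not ggml's code or file format; the function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical and only show the general idea of storing small integers plus one float scale.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale (assumed scheme)."""
    amax = max((abs(v) for v in values), default=0.0)
    # Avoid division by zero when every value is 0.
    scale = amax / 127.0 if amax > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize_int8(q: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the integers and the scale."""
    return [x * scale for x in q]


# Hypothetical weights, for illustration only.
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
print(q, scale)
print(restored)
```

Each weight is stored in 8 bits instead of 32, at the cost of a rounding error of at most half the scale per value. Real implementations typically quantize in small blocks with one scale per block and use integer kernels, which is where the memory savings and speedup come from.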