Open O-O1024 opened 1 year ago
If it supports int8 quantization, the model size can be reduced by 3/4 (int8 weights take 1 byte instead of 4 for fp32). It may also be faster.
Yes, it is half the size of the 16-bit model, but the quality is terrible, I must admit.
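For reference, the size arithmetic behind the "reduced by 3/4" claim comes from storing each weight as one int8 byte instead of four fp32 bytes. The thread does not name a framework, so this is only a minimal stdlib sketch of symmetric per-tensor int8 quantization; the function names and toy weights are illustrative:

```python
# Minimal sketch of symmetric per-tensor int8 quantization (stdlib only).
# Illustrative only: the thread does not specify the model or framework.

def quantize_int8(weights):
    """Map float weights to int8 range [-127, 127] with one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [v * scale for v in q]

weights = [0.8, -1.2, 0.05, 0.33]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Storage: 4 bytes per fp32 weight vs 1 byte per int8 weight, i.e. a ~3/4
# reduction (plus one scale per tensor). Each restored weight differs from
# the original by at most scale/2, which is where the quality loss comes from.
```

The quality complaint above is consistent with this scheme: the coarser the quantization grid (larger `scale`), the larger the rounding error per weight.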