Open eisneim opened 3 months ago
The compression ratio is impressive! Nice work 👍🎉, but I didn't find any inference speed comparison in the paper.
It would be nice to show how much the generation time is reduced by this quantization. Thanks!