snap-research / BitsFusion

https://snap-research.github.io/BitsFusion/

Inference speed comparison? #4

Open eisneim opened 3 months ago

eisneim commented 3 months ago

The compression ratio is impressive, nice work! 👍🎉 However, I didn't find any inference speed comparison in the paper.

It would be nice to show how much the generation time is reduced by this quantization. Thanks!