yzzzwd opened 11 months ago
@chu-tianxiang
I have not yet run a comprehensive experiment comparing perplexity against bit-width. However, in my preliminary results with LLaMA-2-7B, the 2.5-bit model shows a significantly higher perplexity.
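For reference, here is a minimal sketch (not from this repo; the model name and eval-text path are placeholders) of how perplexity could be computed the same way for the fp16 and quantized checkpoints so the numbers stay comparable:

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def perplexity(model, tokenizer, text, chunk_len=2048, device="cuda"):
    """Chunked perplexity over one long string (non-overlapping chunks)."""
    ids = tokenizer(text, return_tensors="pt").input_ids.to(device)
    total_nll, total_tokens = 0.0, 0
    for start in range(0, ids.size(1), chunk_len):
        chunk = ids[:, start:start + chunk_len]
        if chunk.size(1) < 2:                      # need at least one label after the shift
            break
        with torch.no_grad():
            out = model(chunk, labels=chunk)       # loss = mean next-token cross-entropy
        n = chunk.size(1) - 1                      # labels are shifted inside the model
        total_nll += out.loss.item() * n
        total_tokens += n
    return math.exp(total_nll / total_tokens)

# Placeholder names; loading the quantized checkpoint depends on how it was exported.
tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
fp16 = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.float16).to("cuda")
text = open("wikitext2_test.txt").read()           # held-out eval text (placeholder path)
print("fp16 ppl:", perplexity(fp16, tok, text))
```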
Could you reply in Chinese? I have trouble understanding the English.
In your chart, the 4-bit performance looks close to 16-bit, yet it deteriorates sharply at 2.5-bit. Why is that?
@chu-tianxiang
As the title says, I'd like to know how much accuracy different models lose after quantization. Ideally, could you also share a speed comparison before and after quantization?
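In case it helps, a minimal sketch (assumptions: both the original and quantized checkpoints load via `AutoModelForCausalLM`; the names below are placeholders) of how one could measure the before/after generation speed:

```python
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def tokens_per_second(model, tokenizer, prompt, new_tokens=128, device="cuda"):
    """Greedy-decode a fixed number of tokens and report throughput."""
    ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
    torch.cuda.synchronize()
    start = time.perf_counter()
    model.generate(ids, max_new_tokens=new_tokens, min_new_tokens=new_tokens,
                   do_sample=False)
    torch.cuda.synchronize()
    return new_tokens / (time.perf_counter() - start)

# Placeholder checkpoints; the quantized one may need the quantization library installed.
for name in ["meta-llama/Llama-2-7b-hf", "path/to/quantized-llama-2-7b"]:
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(
        name, torch_dtype=torch.float16, device_map="cuda")
    print(name, f"{tokens_per_second(model, tok, 'Hello, my name is'):.1f} tok/s")
```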