chu-tianxiang / exl2-for-all

EXL2 quantization generalized to other models.

Could you share the relevant evaluation data? #1

Open yzzzwd opened 11 months ago

yzzzwd commented 11 months ago

As the title says, I'd like to know how much accuracy different models lose after quantization. Ideally, could you also share a speed comparison before and after quantization?

yzzzwd commented 11 months ago

@chu-tianxiang

yzzzwd commented 11 months ago

@chu-tianxiang

chu-tianxiang commented 11 months ago

I have not yet conducted a comprehensive experiment comparing perplexity vs. bit-width. However, based on my preliminary results with LLaMA-2-7B, the 2.5-bit model exhibits significantly higher perplexity.

[attached chart]
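For context on the metric being compared in the chart: perplexity is simply the exponential of the average per-token negative log-likelihood, so higher values mean the model is less confident about the true next token. A minimal sketch (the function name and the uniform-vocabulary example are illustrative, not part of the repo):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the average negative log-likelihood per token.

    token_logprobs: log-probabilities the model assigned to each
    ground-truth token in the evaluation text.
    """
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Sanity check: a model that spreads probability uniformly over a
# 32-token vocabulary assigns each token log(1/32), so its perplexity
# is exactly the vocabulary size, 32.
uniform_logprobs = [math.log(1 / 32)] * 10
print(perplexity(uniform_logprobs))  # 32.0
```

This is why small increases in perplexity after quantization matter: the metric is exponential in the average log-loss, so a jump like the one reported at 2.5 bits reflects a substantial loss of predictive accuracy per token.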

yzzzwd commented 11 months ago

Could you reply in Chinese? I can't quite understand English.

yzzzwd commented 11 months ago

In your chart, it seems that the performance of 4-bit is similar to that of 16-bit, yet it deteriorates significantly at 2.5-bit. Why is that?

yzzzwd commented 10 months ago

@chu-tianxiang