yzzzwd opened 11 months ago
@chu-tianxiang
I have not yet run a comprehensive experiment comparing perplexity against bit-width. However, in my preliminary results with LLaMA-2-7B, the 2.5-bit model shows a significantly higher perplexity.
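For reference, here is a minimal sketch (not from this repo; the model name and eval-text path are placeholders) of how perplexity could be computed the same way for the fp16 and quantized checkpoints so the numbers stay comparable:

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def perplexity(model, tokenizer, text, chunk_len=2048, device="cuda"):
    """Chunked perplexity over one long string (non-overlapping chunks)."""
    ids = tokenizer(text, return_tensors="pt").input_ids.to(device)
    total_nll, total_tokens = 0.0, 0
    for start in range(0, ids.size(1), chunk_len):
        chunk = ids[:, start:start + chunk_len]
        if chunk.size(1) < 2:                      # need at least one label after the shift
            break
        with torch.no_grad():
            out = model(chunk, labels=chunk)       # loss = mean next-token cross-entropy
        n = chunk.size(1) - 1                      # labels are shifted inside the model
        total_nll += out.loss.item() * n
        total_tokens += n
    return math.exp(total_nll / total_tokens)

# Placeholder names; loading the quantized checkpoint depends on how it was exported.
tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
fp16 = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.float16).to("cuda")
text = open("wikitext2_test.txt").read()           # held-out eval text (placeholder path)
print("fp16 ppl:", perplexity(fp16, tok, text))
```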
Could you reply in Chinese? I have trouble understanding the English.
In your chart, the 4-bit performance looks close to 16-bit, yet it deteriorates sharply at 2.5-bit. Why is that?
@chu-tianxiang
As the title says, I'd like to know how much accuracy different models lose after quantization. Ideally, could you also share a speed comparison before and after quantization?
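In case it helps, a minimal sketch (assumptions: both the original and quantized checkpoints load via `AutoModelForCausalLM`; the names below are placeholders) of how one could measure the before/after generation speed:

```python
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def tokens_per_second(model, tokenizer, prompt, new_tokens=128, device="cuda"):
    """Greedy-decode a fixed number of tokens and report throughput."""
    ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
    torch.cuda.synchronize()
    start = time.perf_counter()
    model.generate(ids, max_new_tokens=new_tokens, min_new_tokens=new_tokens,
                   do_sample=False)
    torch.cuda.synchronize()
    return new_tokens / (time.perf_counter() - start)

# Placeholder checkpoints; the quantized one may need the quantization library installed.
for name in ["meta-llama/Llama-2-7b-hf", "path/to/quantized-llama-2-7b"]:
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(
        name, torch_dtype=torch.float16, device_map="cuda")
    print(name, f"{tokens_per_second(model, tok, 'Hello, my name is'):.1f} tok/s")
```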