Intelligent-Computing-Lab-Yale / TesseraQ

Apache License 2.0

Missing run_awq_llama.sh #1

Open RanchiZhao opened 6 days ago

RanchiZhao commented 6 days ago

Hi, thanks for your work. However, I cannot find the 'run_awq_llama.sh' script. Am I missing something?

yhhhli commented 6 days ago

Hi, we just uploaded the scripts.

RanchiZhao commented 5 days ago

Hello, I've completed the AWQ setup following the tutorial, but when I moved on to step two to obtain the TesseraQ model, I found that using save_fp, save_autogptq, or save_lightllm didn't work as expected.

yhhhli commented 5 days ago

Are you trying to export a quantized model? Consider following the tutorials of the original LLMC framework.
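
For reference, the export switches mentioned above are set in an LLMC-style YAML config. A minimal sketch (the option names are the ones discussed in this thread; their exact semantics and the `save_path` value are assumptions, so check the LLMC tutorials for the authoritative layout):

```yaml
# Sketch only: option names come from this thread; the save_path
# value and the idea of enabling one backend at a time are assumptions.
save:
    save_fp: False
    save_autogptq: True
    save_lightllm: False
    save_path: ./tesseraq_export  # hypothetical output directory
```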

RanchiZhao commented 3 hours ago

Sorry for the delay in replying. I noticed that when I use TesseraQ for W4A16 quantization, the reconstruction loss in the last few layers is surprisingly high. For example, the second-to-last layer's loss dropped from 100 to 70 after applying TesseraQ, but the last layer's only dropped from 1300 to 650, while the initial layers' losses are only around 2. Do you have any suggestions? Does this indicate that the quantization is not going smoothly?