Same question here. Also, does it matter that a sequence length of 512 is used? If we have a RAM constraint, will it work the same way with 128 or 256 tokens?
Hi, we have not extensively ablated the use of calibration sets. Feel free to try it out and compare the performance!
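For what it's worth, the calibration sequence length is mostly a trade-off between memory and context coverage per sample. Here is a minimal sketch (illustrative only, not the exact `calib_data.py` API) of how the block size enters when building the calibration set:

```python
import torch

def build_calib_blocks(texts, tokenizer, n_samples=128, block_size=512):
    """Tokenize, concatenate, and slice a text corpus into fixed-length blocks."""
    ids = []
    for t in texts:
        ids.extend(tokenizer(t).input_ids)
        if len(ids) >= n_samples * block_size:
            break
    # Keep only whole blocks; each row is one calibration sample of exactly
    # block_size tokens, so no padding or attention mask is needed.
    n_blocks = min(n_samples, len(ids) // block_size)
    ids = torch.tensor(ids[: n_blocks * block_size])
    return ids.view(n_blocks, block_size)
```

With `block_size=128` or `256`, the activation tensors collected during calibration shrink proportionally, at the cost of shorter context per sample; whether that matches the quality of 512 is exactly the kind of ablation mentioned above.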
For a small LM (7B) or NMT encoder-decoder models (< 1B), I found that scaling/clipping was unnecessary; quantizing directly worked just as well. Does that make sense?
The scaling/clipping works for Llama-7B models. I am not sure about the smaller ones or enc-dec models.
I am not saying it does not work :) I am saying that WITHOUT it, it works too ...
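For readers following the thread: "quantizing directly" here means plain round-to-nearest (RTN) group quantization with no activation-aware scaling. A rough sketch of the two variants in pseudo-quantization form (illustrative only, not the repo's implementation; the real AWQ folds the inverse scale into the preceding operator rather than back into the weight):

```python
import torch

def rtn_quantize(w, n_bits=4, group_size=128):
    """Plain round-to-nearest asymmetric quantization per group of input channels.
    Assumes w has shape (out_features, in_features) with in_features divisible
    by group_size."""
    out_f, in_f = w.shape
    wg = w.reshape(out_f, in_f // group_size, group_size)
    w_max = wg.amax(dim=-1, keepdim=True)
    w_min = wg.amin(dim=-1, keepdim=True)
    scale = (w_max - w_min).clamp(min=1e-5) / (2 ** n_bits - 1)
    zero = (-w_min / scale).round()
    q = ((wg / scale).round() + zero).clamp(0, 2 ** n_bits - 1)
    return ((q - zero) * scale).reshape(out_f, in_f)

def awq_style_quantize(w, act_scale, n_bits=4, group_size=128, alpha=0.5):
    """Scale salient input channels up before quantization, then fold the
    inverse scale back so the layer output stays (approximately) unchanged.
    act_scale is a per-input-channel activation magnitude of shape (in_features,)."""
    s = act_scale.clamp(min=1e-5) ** alpha
    w_q = rtn_quantize(w * s, n_bits, group_size)  # quantize the scaled weight
    return w_q / s                                 # fold 1/s back in
```

The observation above is that, for these smaller models, the first variant was already good enough on its own.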
Just curious: how small are the NMT encoder-decoder models that you tried? Do they have transformers in either encoder or decoder?
It was a base Transformer.
Hi, is there any new progress on this issue? I also encountered the same problem.
Hi, I have a question about the calibration data. In calib_data.py, you reorganize the calibration data so that every batch has the same sequence length and no padding is needed. Will this affect the positional embeddings, and in turn the data distribution during calibration? And if I do have to pad the data to a common length, will the padding tokens (with an attention mask) affect the calibration process?
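Not one of the authors, but on the padding part of the question: if you do have to pad, one option is to exclude the padded positions (via the attention mask) when collecting the per-channel activation statistics, so that they do not skew the computed scales. A rough sketch (hypothetical helper, not the repo's code):

```python
import torch

def masked_channel_absmean(hidden, attention_mask):
    """Mean absolute activation per channel, ignoring padded positions.

    hidden:          (batch, seq_len, hidden_dim) activations captured by a hook
    attention_mask:  (batch, seq_len) with 1 for real tokens, 0 for padding
    """
    mask = attention_mask.unsqueeze(-1).to(hidden.dtype)   # (B, T, 1)
    total = (hidden.abs() * mask).sum(dim=(0, 1))          # sum over real tokens only
    count = mask.sum().clamp(min=1)                        # number of real tokens
    return total / count                                   # (hidden_dim,)
```

Whether the concatenated fixed-length blocks shift the positional distribution relative to your deployment inputs is harder to answer in general; it seems like something worth ablating on your own task.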