kiucho · closed 1 year ago
Hi, I have a question about the calibration data (128 samples of 2048 tokens each).
Is there a particular reason to use 2048 tokens for each sample? I traced this back through SparseGPT and GPTQ, but I couldn't find an explanation. I hope I can get some insight.
Thank you.
2048 is the maximum context size of the OPT and LLaMA models. In other words, these LLMs can process a sequence of up to 2048 tokens, so each calibration sample is simply a full-length context window.
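For illustration, here is a minimal sketch of how such a calibration set could be built: 128 random windows of 2048 tokens each, drawn from a text corpus. This is not the repo's exact code; the dataset (WikiText-2), the model/tokenizer name, and the variable names are assumptions chosen for the example, using the Hugging Face `datasets` and `transformers` libraries.

```python
# Sketch (assumed setup, not the repo's exact code): sample 128 random
# windows of seqlen = 2048 tokens from a tokenized corpus.
import random
import torch
from datasets import load_dataset
from transformers import AutoTokenizer

nsamples, seqlen = 128, 2048          # 2048 = max context of OPT / LLaMA
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m", use_fast=False)
data = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")

# Concatenate the corpus and tokenize it once.
enc = tokenizer("\n\n".join(data["text"]), return_tensors="pt")

calib = []
for _ in range(nsamples):
    # Pick a random window of exactly `seqlen` consecutive tokens.
    i = random.randint(0, enc.input_ids.shape[1] - seqlen - 1)
    calib.append(enc.input_ids[:, i:i + seqlen])

calib = torch.cat(calib, dim=0)       # shape: (128, 2048)
```

Each row of `calib` can then be fed through the model to collect the activation statistics used for pruning or quantization.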