vllm-project / llm-compressor

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

I encountered the same issue: torch._C._LinAlgError: linalg.cholesky #823

Open · okwinds opened this issue 2 weeks ago

okwinds commented 2 weeks ago
I encountered the same issue as reported in:

https://github.com/vllm-project/llm-compressor/issues/109
https://github.com/vllm-project/llm-compressor/issues/142

torch._C._LinAlgError: linalg.cholesky: The factorization could not be completed because the input is not positive-definite (the leading minor of order 17915 is not positive-definite).
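For context, here is a toy sketch (not llm-compressor internals) of why this error can appear and why dampening helps: GPTQ Cholesky-factorizes a Hessian-like matrix built from calibration activations, and an input channel the calibration data never exercises leaves that matrix singular. The matrix sizes and dampening value below are purely illustrative.

```python
import torch

# Hessian-like matrix from "calibration activations"; channel 3 is never
# activated, so H has a zero row/column and is not positive-definite.
X = torch.randn(64, 8)
X[:, 3] = 0.0
H = X.T @ X

try:
    torch.linalg.cholesky(H)
except torch._C._LinAlgError as err:
    print("Cholesky failed:", err)

# Adding a small multiple of the identity, scaled by the mean diagonal
# (roughly what dampening_frac controls), restores positive-definiteness.
damp = 0.01 * torch.diag(H).mean()
torch.linalg.cholesky(H + damp * torch.eye(8))
print("Cholesky succeeded after dampening")
```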

What I tried:

  1. Using enough calibration data (≥ 5k samples; model: Qwen2.5-7B-Instruct).
  2. Shuffling the data so the samples are in a different order.
  3. Adjusting dampening_frac, which allowed the quantization to complete, but running vLLM inference then crashed WSL (Ubuntu 22.04) with no exception information captured.

In the end, my solution was to switch back to the older version 0.1.0, which resolved the issue, although the quantization process is quite slow.

Datasets: belle_resampled_78_k_cn-train, ultrachat_200k, open-platypus, AI-MO_NuminaMath-CoT
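For reference, a minimal sketch of preparing a calibration split from one of the datasets listed above, matching the "enough samples" and "shuffle" steps; the Hub id, split name, and sample count are assumptions, not the exact setup used here.

```python
from datasets import load_dataset

# Illustrative calibration-set preparation (dataset id and split are assumptions).
ds = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")
ds = ds.shuffle(seed=42).select(range(5000))  # shuffle, then keep ~5k samples
print(ds)
```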

Unfortunately, without being able to reproduce the setup, I can't provide more help beyond these few suggestions to try:

  1. Use enough calibration data (≥ 512 samples; if possible, please try 2k, 3k, or 4k as well).
  2. Once you have enough calibration data, try shuffling it so the samples are in a different order.
  3. If steps 1 and 2 don't help, try gradually increasing dampening_frac. Be aware that this should be the last option, as increasing dampening_frac makes the GPTQ algorithm more similar to round-to-nearest quantization, which negatively impacts accuracy. (See the recipe sketch after this quote.)

Originally posted by @okwinds in https://github.com/vllm-project/llm-compressor/issues/142#issuecomment-2395811942
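For concreteness, a minimal one-shot sketch along the lines of the suggestions quoted above, assuming the GPTQModifier/oneshot interface shown in the llm-compressor examples; the scheme, sample count, dataset name, output path, and dampening_frac value are illustrative, not recommended settings.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor.transformers import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier

MODEL_ID = "Qwen/Qwen2.5-7B-Instruct"
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Raise num_calibration_samples (and shuffle the data) first; only nudge
# dampening_frac upward as a last resort if Cholesky still fails.
recipe = GPTQModifier(
    targets="Linear",
    scheme="W4A16",
    ignore=["lm_head"],
    dampening_frac=0.01,
)

oneshot(
    model=model,
    dataset="ultrachat_200k",  # assumes the built-in ultrachat_200k registration
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
)

model.save_pretrained("./Qwen2.5-7B-Instruct-W4A16", save_compressed=True)
tokenizer.save_pretrained("./Qwen2.5-7B-Instruct-W4A16")
```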

markurtz commented 3 days ago

Hi @okwinds, can you provide the exact sample dataset so we can attempt to reproduce with the Qwen model? The dampening fraction is the correct pathway to trace down for issues like these. Did you test whether the quantized model was runnable as-is through Hugging Face before vLLM, and whether it produced sensible answers? It sounds like quantization completed correctly, so this may have been a different, unrelated crash that happened in vLLM rather than a Cholesky-decomposition problem.
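One quick way to answer that question is to load the quantized checkpoint with Transformers and generate a short completion before serving it with vLLM; a hedged sketch, assuming an output directory like the one in the recipe sketch above (the path and prompt are hypothetical).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_DIR = "./Qwen2.5-7B-Instruct-W4A16"  # hypothetical quantized output path
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForCausalLM.from_pretrained(MODEL_DIR, torch_dtype="auto", device_map="auto")

inputs = tokenizer("Briefly explain what GPTQ quantization does.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```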