fail to apply on llama-13b

AIoT-MLSys-Lab / SVD-LLM

Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression"

https://arxiv.org/abs/2403.07378

Apache License 2.0


fail to apply on llama-13b #6

Closed 33answer33 closed 2 months ago

33answer33 commented 2 months ago

Hello, I am having trouble reproducing the results on llama-13b. The error "scaling_matrix_inv = torch.linalg.inv(scaling_diag_matrix) torch._C._LinAlgError: linalg.inv: The diagonal element 6940 is zero, the inversion could not be completed because the input matrix is singular" occurs on line 203, in the whitening function. How can I solve this problem? Thanks.

tuidan commented 2 months ago

Hi, thank you for your feedback! In this case, it is better to run the Cholesky decomposition in double precision (double()) rather than single precision (float()), like this: scaling_diag_matrix = torch.linalg.cholesky(raw_scaling_diag_matrix.double())

I have updated the code to fix this potential problem. You can compress LLaMA-13b using the following command: python SVDLLM.py --model jeffwan/llama-13b-hf --step 1 --save_path "./"

If you still hit the same problem when running the new code, please reopen this issue.
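
For readers following along, the double-precision suggestion above can be sketched roughly as below. This is a minimal illustration, not the repository's actual whitening code: the helper name `whitening_factors` and the random stand-in matrix `gram` are assumptions made for the example; only `torch.linalg.cholesky`, `torch.linalg.inv`, and `.double()` come from the thread.

```python
import torch


def whitening_factors(raw_scaling_diag_matrix: torch.Tensor):
    """Hypothetical helper: Cholesky factor and its inverse, both in float64."""
    # Upcast before factorizing so small pivots are less likely to round to zero.
    scaling_diag_matrix = torch.linalg.cholesky(raw_scaling_diag_matrix.double())
    scaling_matrix_inv = torch.linalg.inv(scaling_diag_matrix)
    return scaling_diag_matrix, scaling_matrix_inv


# Stand-in for the accumulated activation statistics (an assumption for this demo).
torch.manual_seed(0)
x = torch.randn(8, 64)
gram = x @ x.T  # 8x8, well conditioned because there are 64 samples for 8 channels

S, S_inv = whitening_factors(gram)
print(S.dtype)  # the factor stays in double precision
```

Note that upcasting only helps when the matrix is nearly singular because of rounding; a matrix that is exactly singular will still fail to invert.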

33answer33 commented 2 months ago

I still get the same error:
Traceback (most recent call last):
File "/home/xxx/SVD-LLM/SVDLLM_new.py", line 193, in whitening
scaling_matrix_inv = torch.linalg.inv(scaling_diag_matrix)
torch._C._LinAlgError: linalg.inv: The diagonal element 6940 is zero, the inversion could not be completed because the input matrix is singular.
My Python environment is built from requirements.txt, and I am running the code on two 3090 GPUs.
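
When the error persists even in double precision, one common workaround is to regularize the matrix before factorizing it. The sketch below is an assumption, not a fix confirmed in this thread or taken from the SVD-LLM code: `robust_cholesky` and the `eps` value are hypothetical, and the "dead channel" in the demo is only one possible way a zero diagonal entry could arise.

```python
import torch


def robust_cholesky(raw: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Hypothetical helper: float64 Cholesky with a diagonal shift if needed."""
    m = raw.double()
    m = (m + m.T) / 2  # enforce exact symmetry before the eigenvalue check
    min_eig = torch.linalg.eigvalsh(m)[0]  # eigenvalues are returned in ascending order
    if min_eig < eps:
        # Shift the spectrum so the smallest eigenvalue is roughly eps.
        eye = torch.eye(m.shape[0], dtype=m.dtype, device=m.device)
        m = m + (eps - min_eig) * eye
    return torch.linalg.cholesky(m)


# Demo: a channel that is always zero makes the Gram matrix exactly singular.
torch.manual_seed(0)
acts = torch.randn(3, 16, dtype=torch.float64)
acts[1] = 0.0
gram = acts @ acts.T  # row and column 1 are all zeros

L = robust_cholesky(gram)
L_inv = torch.linalg.inv(L)  # now invertible: every diagonal entry of L is positive
```

The shift slightly changes the matrix being whitened, so `eps` trades numerical stability against fidelity; an absolute value such as 1e-6 may need rescaling for matrices with very large entries.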