Open ccccj opened 9 months ago
I mean, is it possible that re-running the AWQ search would give better performance?
Hi, I have a similar issue. Have you made any progress?
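For readers landing here: the "AWQ search" in question is a grid search over a per-channel weight-scaling exponent chosen from calibration activations. Below is a minimal NumPy sketch of that idea (a toy illustration under my own assumptions, not the repo's actual implementation; all matrices and names are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize_rows(w, n_bits=4):
    # Per-output-row symmetric round-to-nearest quantization.
    q_max = 2 ** (n_bits - 1) - 1
    scale = np.abs(w).max(axis=1, keepdims=True) / q_max
    return np.round(w / scale) * scale

W = rng.normal(size=(8, 16))    # toy weight matrix (out_features x in_features)
X = rng.normal(size=(64, 16))   # toy calibration activations (tokens x in_features)
X[:, 0] *= 10.0                 # one salient input channel with large activations

ref = X @ W.T                   # full-precision reference output
act_mag = np.abs(X).mean(axis=0)

# AWQ-style search: scale input channels by s = act_mag ** alpha before
# quantizing, divide the activations back by s, and pick the alpha that
# minimizes output error. alpha = 0 recovers plain round-to-nearest (RTN);
# larger alpha protects salient channels more aggressively.
errs = {}
for alpha in np.linspace(0.0, 1.0, 11):
    s = act_mag ** alpha
    err = np.linalg.norm(ref - (X / s) @ quantize_rows(W * s).T)
    errs[alpha] = err

best_alpha = min(errs, key=errs.get)
print(f"RTN error (alpha=0): {errs[0.0]:.3f}")
print(f"best alpha = {best_alpha:.1f}, error = {errs[best_alpha]:.3f}")
```

Since alpha = 0 is in the grid, the searched result can never be worse than plain RTN on the calibration data; whether a second search helps a fine-tuned model depends on how representative the calibration set is.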
Hello, I have a llama model that was fine-tuned on my own dataset. Is it possible for me to compress the fine-tuned model, and will the accuracy drop because of my fine-tuning?
No, I used another method of compressing the model (SpQR), so I didn't use this one.
Did SpQR achieve good results? By the way, I heard that SpQR doesn't have a corresponding CUDA kernel implementation.
SpQR looks OK judging by the scores (my application scenario doesn't require a CUDA kernel implementation). Maybe we can communicate by e-mail or WeChat.
Sure. My WeChat ID is my GitHub nickname in all lowercase.