Closed Ph0rk0z closed 1 year ago
I have tried to load GPTQv2 models that do not use groupsize. It appears to be impossible, because groupsize is used to switch between V1 and V2.
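To illustrate the problem, here is a minimal sketch of a format check that does not depend on groupsize at all. It assumes that V2 checkpoints carry a per-layer `g_idx` tensor and V1 checkpoints do not. The helper name `detect_gptq_layout` and the tensor names in the example are hypothetical and are not taken from this project's loader.

```python
from typing import Iterable


def detect_gptq_layout(tensor_names: Iterable[str]) -> str:
    """Guess the checkpoint layout from its tensor names (hypothetical helper).

    Assumption: a V2 checkpoint stores a `g_idx` tensor for each quantized
    layer, so its presence can stand in for a groupsize-based switch.
    """
    for name in tensor_names:
        # Match both a bare "g_idx" key and a dotted "<layer>.g_idx" key.
        if name == "g_idx" or name.endswith(".g_idx"):
            return "v2"
    return "v1"


# Example: a V2-style checkpoint quantized without groupsize.
names = ["layers.0.q_proj.qweight", "layers.0.q_proj.scales", "layers.0.q_proj.g_idx"]
print(detect_gptq_layout(names))
```

With a check like this, a model quantized without groupsize would still be routed to the V2 path, as long as the assumption about `g_idx` holds for the checkpoint.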
Updated the CUDA implementation to support g_idx. I think it'll work now.
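For context, a rough sketch of what g_idx looks like with and without groupsize. It assumes the common GPTQ convention where `g_idx[i]` is the quantization group of input row `i`, and where a groupsize of -1 means a single group spanning all rows. The function name `build_g_idx` is hypothetical and this is not the kernel code itself.

```python
from typing import List


def build_g_idx(in_features: int, groupsize: int = -1) -> List[int]:
    """Map each input row to its quantization group (illustrative only).

    Assumption: groupsize == -1 means "no grouping", which is treated as
    one group covering every row.
    """
    effective = in_features if groupsize == -1 else groupsize
    return [i // effective for i in range(in_features)]


print(build_g_idx(8, 4))   # [0, 0, 0, 0, 1, 1, 1, 1]
print(build_g_idx(8, -1))  # [0, 0, 0, 0, 0, 0, 0, 0]
```

Under that convention a model without groupsize simply has an all-zero g_idx, so a kernel that looks up scales and zeros through g_idx can handle both cases with the same code path.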
Working great now that I updated the CUDA kernel too :)
Thank you!