Closed: brucethemoose closed this issue 7 months ago
Found it: it's 99f6ac30373c29d3fae2bccb846e45497153008d that breaks quantizing.
89885be0feee057d0ac4b29c9b23458ae88328e3 works.
Specifically this change here?
On a separate note, the new optimization works great. I used to OOM at the very end with this command (and had to go in and edit the gpu flag for the job), but now it completes with the same command :+1:
I imagine you've got the exllamav2 package installed and you've pulled the latest changes but not rebuilt the extension. `pip uninstall exllamav2` should do it; then it should use the JIT-built version while quanting, which has the updated function prototype for rope_.
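To confirm whether Python is picking up a stale installed copy rather than the source checkout (with its JIT-built extension), you can check where the module would be imported from. This is a generic sketch; `installed_location` is a hypothetical helper, not part of exllamav2:

```python
import importlib.util

def installed_location(module_name):
    """Report the file a module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec else None

# After `pip uninstall exllamav2`, this should point into your source
# checkout (or return None outside it), not into site-packages.
print(installed_location("exllamav2"))
```

If the path still points into site-packages, the prebuilt extension with the old `rope_` prototype is the one being loaded.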
That sounds like exactly what happened, thanks.
I am getting an error quantizing a model with a command that worked about 9 days ago, using the same measurements JSON created 9 days ago.
I'm going to roll back the commits and try to find the one that breaks it.
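Rolling back commit-by-commit is a linear scan; since the breakage is monotone here (quantizing works at 89885be, broken from 99f6ac3 onward), the search is really a binary chop over the commit range, which is what `git bisect` automates. A minimal sketch of the idea, where the `is_good` predicate stands in for running the quant job at a given checkout:

```python
def first_bad(commits, is_good):
    """Binary-search for the first bad commit in an ordered list.

    Assumes commits[0] is known good, commits[-1] is known bad, and
    that once a commit is bad, every later commit is also bad.
    """
    lo, hi = 0, len(commits) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid  # breakage is after mid
        else:
            hi = mid  # mid is already broken
    return commits[hi]

# Toy example: commits 0..9, breakage introduced at commit 6.
print(first_bad(list(range(10)), lambda c: c < 6))
```

With `git bisect run <test-script>` the same search takes log2(N) quant attempts instead of N rollbacks.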
Command used:
Log: