I tried to use the 4bit-plugin branch and add it myself, but I hardly know Python. I made the changes to match what it looked like you were doing in that other repo of yours (I forget its name right now). The modules built, but I can't get them to work. I think it also needs exllama from outside the tree, and I'm not sure what to do about that either.
If you don't have time to do it yourself, please briefly explain what I have to do (assuming the branch is actually working, heh).
You can find it here: https://github.com/eugenepentland/landmark-attention-qlora.git
Thanks.