pratyushasharma / laser

The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
https://pratyushasharma.github.io/laser/

Rank-reduced models? #8

Open turboderp opened 10 months ago

turboderp commented 10 months ago

Do you publish the rank-reduced models anywhere?

dkmisra commented 10 months ago

We don't release them, but one can recover them by taking the standard Hugging Face models and applying the right LASER intervention. The list of optimal interventions is in Table 3 of the paper: https://arxiv.org/pdf/2312.13558.pdf.
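
A minimal sketch of what that recovery could look like, using a plain PyTorch SVD rather than the repo's own API; the checkpoint name, the target module, and the rank fraction `rho` below are illustrative placeholders, not the actual values from Table 3:

```python
import torch
from transformers import AutoModelForCausalLM

# Load a standard Hugging Face checkpoint ("gpt2" is just an example).
model = AutoModelForCausalLM.from_pretrained("gpt2")

def laser_reduce_(weight: torch.Tensor, rho: float) -> None:
    """Replace `weight` in place with its low-rank SVD approximation,
    keeping the top rho fraction of singular values."""
    U, S, Vh = torch.linalg.svd(weight.float(), full_matrices=False)
    k = max(1, int(rho * S.numel()))
    approx = U[:, :k] @ torch.diag(S[:k]) @ Vh[:k, :]
    weight.copy_(approx.to(weight.dtype))

# Hypothetical intervention: reduce one MLP weight matrix of the last block
# to 1% of its maximum rank. The real (layer, matrix, rho) triples to apply
# come from Table 3 of the paper.
with torch.no_grad():
    laser_reduce_(model.transformer.h[-1].mlp.c_fc.weight, rho=0.01)
```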

I suppose it would be a good feature to add: a way to get the model with the chosen LASER hyperparameters already applied, so people can reproduce our results. I am making a list of features for the upcoming refactoring and will add this to it.

Let me know if you have more questions.

dkmisra commented 10 months ago

Related to #9

YixinSong-e commented 10 months ago

How much can we reduce the memory requirement if we store the weights in rank-reduced format?

dkmisra commented 10 months ago

If you have an m x n matrix, then your memory is mn entries. If you reduce the rank down to k using SVD, then you store mk + k^2 + kn entries (the factors U, Sigma, and V^T). For k = 1 this comes down to just m + n + 1. If you reduce to 1% of the maximum rank (say m >= n, so the max rank is n), then k = n/100 and you store mn/100 + (n/100)^2 + n^2/100 <= mn/50 + mn/10000 ~ mn/50, so roughly 50x shrinkage.
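
A quick back-of-the-envelope check of that arithmetic (plain Python, with illustrative dimensions):

```python
m, n = 4096, 1024                    # m x n matrix with m > n, so max rank is n
dense = m * n                        # entries in the full matrix

k = n // 100                         # reduce to 1% of the maximum rank
factored = m * k + k * k + k * n     # U (m x k) + Sigma (k x k) + V^T (k x n)

# mn/50 is the worst case (m = n); with m > n the shrinkage is even larger.
print(f"shrinkage: {dense / factored:.1f}x")  # ~81.8x here, at least ~50x in general
```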