microsoft / TransformerCompression

Code related to compression methods for transformers, accompanying our publications
MIT License

Add QuaRot (no quantization yet) #142

Closed · nailimixaM closed 4 months ago

nailimixaM commented 4 months ago

This PR adds the minimum required to apply QuaRot to a Llama-2 7B model so that it is activation-and-weight quantizable, without actually performing any quantization with RTN/GPTQ.
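
For context, here is a minimal sketch of the core QuaRot idea: fuse a random orthogonal (Hadamard-based) rotation into the weights offline, which leaves the network's output unchanged while spreading out activation outliers so they quantize better. This is an illustration of the technique only, not this repository's API; `random_hadamard` is a hypothetical helper.

```python
import torch


def random_hadamard(n: int, device=None) -> torch.Tensor:
    """Build a normalized Hadamard matrix of size n (n must be a power of two).

    Sylvester construction: H_{2k} = [[H_k, H_k], [H_k, -H_k]] / sqrt(2),
    which yields an orthonormal matrix with entries +/- 1/sqrt(n).
    """
    assert n & (n - 1) == 0, "n must be a power of two"
    H = torch.ones(1, 1, device=device)
    while H.shape[0] < n:
        H = torch.cat(
            [torch.cat([H, H], dim=1), torch.cat([H, -H], dim=1)], dim=0
        ) / (2 ** 0.5)
    return H


# Computational invariance: for orthogonal Q,
#   (x @ Q) @ (Q.T @ W) == x @ W   (up to floating-point error),
# so Q can be fused into the weights offline, and the rotated
# weights/activations (with fewer outliers) are what get quantized.
d = 8
Q = random_hadamard(d)
W = torch.randn(d, d)
x = torch.randn(2, d)
assert torch.allclose(x @ W, (x @ Q) @ (Q.T @ W), atol=1e-5)
```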

nailimixaM commented 4 months ago

@jameshensman I pushed our fixes from last week (refactoring Hadamard), which fixed the PR build. Let me know what you think; if it's all good, I'll merge into quarot main.

jameshensman commented 4 months ago

Approved.
