microsoft / TransformerCompression

For releasing code related to compression methods for transformers, accompanying our publications
MIT License

Remove monkeypatch from QuaRot source #134

Closed nailimixaM closed 4 months ago

nailimixaM commented 4 months ago

We want to add a method that is called immediately after RoPE (apply_rotary_pos_emb) in Llama's attention module.
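As a rough illustration of the idea, and not the code in this repository, one way to avoid monkeypatching is to give the attention class an explicit no-op hook that runs right after the rotary embedding, and to override that hook in a subclass. Everything below is a hypothetical toy: `ToyAttention`, `ScaledKeyAttention`, `post_rope` and `rotate` are invented names, the `apply_rotary_pos_emb` here is a simplified pure-Python stand-in for the real helper (one token, one head, a single angle), and the key scaling is only a placeholder for whatever post-RoPE processing is needed.

```python
import math
from typing import List, Tuple

Vector = List[float]


def rotate_half(x: Vector) -> Vector:
    """Map (x1, x2) -> (-x2, x1), where x1 and x2 are the two halves of x."""
    half = len(x) // 2
    return [-v for v in x[half:]] + x[:half]


def apply_rotary_pos_emb(
    q: Vector, k: Vector, cos: Vector, sin: Vector
) -> Tuple[Vector, Vector]:
    """Toy stand-in for the RoPE helper (assumption: one token, one head)."""

    def rot(x: Vector) -> Vector:
        rx = rotate_half(x)
        return [a * c + b * s for a, b, c, s in zip(x, rx, cos, sin)]

    return rot(q), rot(k)


class ToyAttention:
    """Hypothetical attention module exposing an explicit post-RoPE hook."""

    def post_rope(self, q: Vector, k: Vector) -> Tuple[Vector, Vector]:
        # Default hook: leave queries and keys untouched.
        return q, k

    def rotate(self, q: Vector, k: Vector, position: int) -> Tuple[Vector, Vector]:
        # Simplification: one shared angle instead of per-dimension frequencies.
        angle = float(position)
        cos = [math.cos(angle)] * len(q)
        sin = [math.sin(angle)] * len(q)
        q, k = apply_rotary_pos_emb(q, k, cos, sin)
        # The hook is called immediately after RoPE, with no patching involved.
        return self.post_rope(q, k)


class ScaledKeyAttention(ToyAttention):
    """Hypothetical subclass that overrides the hook instead of patching RoPE."""

    def __init__(self, scale: float) -> None:
        self.scale = scale

    def post_rope(self, q: Vector, k: Vector) -> Tuple[Vector, Vector]:
        # Placeholder transformation applied to the keys after RoPE.
        return q, [self.scale * v for v in k]
```

With this shape, swapping `ToyAttention` for `ScaledKeyAttention` changes the post-RoPE behaviour while leaving the module-level `apply_rotary_pos_emb` untouched, so other code that imports the helper is unaffected.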

Key considerations:

nailimixaM commented 4 months ago

Superseded by #137.