microsoft / TransformerCompression

For releasing code related to compression methods for transformers, accompanying our publications
MIT License
354 stars 31 forks

QuaRot: DeepSeek-V2 Support #174

Open RanchiZhao opened 1 month ago

RanchiZhao commented 1 month ago

In particular, how should Q be applied to MLA (Multi-head Latent Attention)?