Open chenwanqq opened 1 month ago
Hi @chenwanqq, I don't have any plans for this at the moment as I'm focusing on adding new quants (#467, #546) for a bit.
Those techniques sound super interesting, and I'd be happy to merge any contributions for this.
@chenwanqq I just merged GPTQ support!
Sorry, I've had a lot of job interviews lately, so progress has been slow. I will pick this up again very soon!
@chenwanqq no worries!
Hi, I'm wondering if you have any plans for KV cache compression methods like SnapKV and PyramidKV. These methods reduce the memory used by the KV cache, which makes inference more feasible on low-memory machines. I may be able to contribute to this.
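For a rough sense of scale, here is a back-of-the-envelope sketch of how KV cache memory grows with the number of cached tokens. The model shape, the 2,048-token budget, and the `kv_cache_bytes` helper are all hypothetical, chosen only for illustration. The sketch assumes 16-bit cache entries and a single sequence, and it is not how SnapKV or PyramidKV choose which tokens to keep.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Approximate KV cache size in bytes for one sequence."""
    # The factor of 2 covers the separate key and value tensors.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_elem


# Hypothetical model shape, for illustration only.
LAYERS, KV_HEADS, HEAD_DIM = 32, 8, 128

# Full cache for a 32,768-token context vs. a cache capped at 2,048 tokens.
full = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, seq_len=32_768)
compressed = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, seq_len=2_048)

print(f"full cache:       {full / 2**30:.2f} GiB")        # 4.00 GiB
print(f"compressed cache: {compressed / 2**30:.2f} GiB")  # 0.25 GiB
```

Under these assumptions, the cache size scales linearly with the number of retained tokens, so capping the retained tokens is where the savings would come from.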