-
### Prerequisite
- [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expe…
-
Recently, we have seen several awesome works focusing on KV cache compression, which report 1.7~2.3x speedups over FlashInfer. Could you please consider supporting such features?
Same layer KV…
-
### Motivation
In current large-model inference, the KV cache occupies a significant portion of GPU memory, so reducing its size is an important direction for improvement. Recently, severa…
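To make the memory pressure concrete, here is a back-of-the-envelope calculation of fp16 KV cache size. The model configuration below (32 layers, 32 KV heads, head dim 128, a Llama-2-7B-like shape) is an illustrative assumption, not taken from this issue.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, batch, bytes_per_elem=2):
    """Total KV cache size in bytes; the leading 2 accounts for keys and values."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * batch * bytes_per_elem

# Hypothetical 7B-class config, batch 8, 4k context, fp16 (2 bytes/element)
size = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128,
                      seq_len=4096, batch=8)
print(f"{size / 2**30:.0f} GiB")  # prints "16 GiB"
```

At this scale the cache alone rivals the weights of the model, which is why compression and quantization of the KV cache pay off.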
-
Study SOTA approaches and modern papers:
1. [SmoothQuant](https://arxiv.org/pdf/2211.10438.pdf) [github](https://github.com/mit-han-lab/smoothquant)
2. [AWQ](https://arxiv.org/pdf/2306.00978.pdf) [gi…
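As a rough illustration of the SmoothQuant idea referenced above, the toy NumPy sketch below migrates activation outliers into the weights with a per-channel scale so that both tensors become easier to quantize. This is a simplified assumption-laden sketch, not the paper's implementation; `alpha=0.5` mirrors the paper's default migration strength.

```python
import numpy as np

def smooth(X, W, alpha=0.5):
    """Toy SmoothQuant-style smoothing for Y = X @ W.

    Divides activations and multiplies weights by a per-input-channel
    scale s, so (X / s) @ (s * W) == X @ W mathematically, while the
    scaled activations have a flatter dynamic range.
    """
    act_max = np.abs(X).max(axis=0)                # per-channel activation range
    w_max = np.abs(W).max(axis=1)                  # per-channel weight range
    s = act_max ** alpha / np.maximum(w_max, 1e-8) ** (1 - alpha)
    s = np.maximum(s, 1e-8)                        # avoid division by zero
    return X / s, W * s[:, None]

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
X[:, 0] *= 50                                      # simulate one outlier channel
W = rng.normal(size=(8, 3))
Xs, Ws = smooth(X, W)
print(np.allclose(X @ W, Xs @ Ws))                 # prints "True"
```

After smoothing, the outlier channel's range is shared between activations and weights, which is what makes plain per-tensor int8 quantization viable in the paper.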
-
### Is your feature request related to a problem? Please describe.
GroupChat uses a nested conversation between two agents. Currently it does not utilise the recent TransformMessages capability nor…
-
### Describe the bug
When using Langchain ContextualCompressionRetriever, a "run not found" error was raised.
```
Traceback (most recent call last):
File "/lib/python3.11/site-packages/langfuse/cal…
-
## List
- tutorials
- [ ] #4 - @seochan99
- [ ] #5 - @seochan99
- [ ] #6 - @seochan99
- [ ] #17 - @bananana0118
- [ ] graph.mdx
- [ ] index.mdx
- [ ] llm_chain.mdx
- [ ]…
-
Hello,
First and foremost, I want to thank you for your incredible work!
I'd like more information on how to reproduce your results. I followed the instructions in your README, but I am unabl…
-
Sorry to raise this problem without giving a systematic analysis.
A more complete investigation of the "compression" ability of LLMs may take me more time, as many may support "compressio…
-
### Describe the issue
First of all, thank you for your great contributions.
I have a similar question to the [issue 146](https://github.com/microsoft/LLMLingua/issues/146), I cannot reproduce the…