-
### Describe the issue
I have an LLM fine-tuned for a downstream task using input-output pair data (`X_train`, `Y_train`).
Now I plan to use llmlingua2 to compress `X_test` --> `X_test_compre…
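The planned flow can be sketched as follows. This is a minimal sketch only: `compress` is a placeholder stand-in for llmlingua2's compressor, and `finetuned_model` is a hypothetical stub for the fine-tuned LLM's generate call.

```python
def compress(prompt: str, rate: float = 0.5) -> str:
    """Placeholder compressor that keeps roughly `rate` of the words.
    Stand-in only; in practice this would be llmlingua2."""
    words = prompt.split()
    keep = max(1, int(len(words) * rate))
    step = max(1, len(words) // keep)
    return " ".join(words[::step][:keep])

def finetuned_model(prompt: str) -> str:
    """Hypothetical stub for the fine-tuned LLM's generate call."""
    return f"answer for: {prompt}"

# Compress each test input, then run inference on the compressed prompts.
X_test = ["summarize the quarterly report in one sentence please"]
X_test_compressed = [compress(x, rate=0.5) for x in X_test]
Y_pred = [finetuned_model(x) for x in X_test_compressed]
```

The open question is whether a model fine-tuned on uncompressed `X_train` degrades when it sees compressed inputs at test time.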
-
Hi,
I tried your example from the main page:
```
git clone https://github.com/neuralmagic/sparseml
pip install -e "sparseml[transformers]"
wget https://huggingface.co/neuralmagic/TinyLlama-1.1B-Chat-v…
```
-
I noticed that when I set CHAT_SEARCH_KWARG_K too high, my embedding model cannot handle so many requests. However, I don't understand why this happens, as the chunks are already embedded and the question is sho…
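For context, in a typical retrieval setup a large `k` should not translate into more embedding requests: chunks are embedded once at index time, and only the question is embedded per query, with `k` merely controlling how many precomputed vectors are ranked and returned. A minimal sketch of that expected flow (the `embed` function here is a toy deterministic stand-in, not a real embedding model):

```python
import math

def embed(text: str) -> list[float]:
    """Toy deterministic embedding; each call stands for one request
    to the embedding service."""
    vec = [0.0] * 8
    for i, b in enumerate(text.encode()):
        vec[i % 8] += b
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# Index time: every chunk is embedded exactly once.
chunks = ["chunk about cats", "chunk about dogs", "chunk about llamas"]
index = [(c, embed(c)) for c in chunks]

def search(question: str, k: int) -> list[str]:
    """One embedding request per query, independent of k."""
    q = embed(question)  # the only embed() call at query time

    def dot(a: list[float], b: list[float]) -> float:
        return sum(x * y for x, y in zip(a, b))

    ranked = sorted(index, key=lambda cv: dot(q, cv[1]), reverse=True)
    return [c for c, _ in ranked[:k]]
```

If raising `k` really multiplies requests to the embedding model, something in the pipeline is re-embedding per retrieved chunk rather than reusing the stored vectors.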
-
Thank you very much for your work! I encountered a problem while compressing an 8x1B MoE model, and I want to know whether you encountered it on LLMs and, if so, how you solved it. Any reply would be appreciated.
…
-
### Describe the issue
Thanks for the interesting work. I tried to reproduce the results of LLMLingua on the MeetingBank QA dataset with Mistral-7B as the target LLM.
The small LLM I use is https…
-
### Prerequisite
- [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expe…
-
Study SOTA approaches and modern papers:
1. [SmoothQuant](https://arxiv.org/pdf/2211.10438.pdf) [github](https://github.com/mit-han-lab/smoothquant)
2. [AWQ](https://arxiv.org/pdf/2306.00978.pdf) [gi…
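As a baseline for studying these papers, it helps to have the naive round-to-nearest int8 weight quantization in hand, since both SmoothQuant and AWQ are refinements of it (rescaling activations/channels before quantizing). A minimal sketch of that baseline only, not either paper's algorithm:

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric round-to-nearest int8 quantization over one tensor.
    SmoothQuant/AWQ improve on this by rescaling channels first so that
    outliers waste less of the int8 range."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate fp values from int8 codes."""
    return [v * scale for v in q]

w = [0.12, -0.5, 0.33, 0.9]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)  # close to w, up to quantization error
```

The key weakness the papers address is visible here: a single large-magnitude outlier inflates `scale`, so all other weights land on only a few int8 levels.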
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a sim…
-
### System Info / 系統信息
Python: Python 3.10.14
os:
```
DISTRIB_ID=Kylin
DISTRIB_RELEASE=V10
DISTRIB_CODENAME=kylin
DISTRIB_DESCRIPTION="Kylin V10 SP1"
DISTRIB_KYLIN_RELEASE=V10
DISTRIB_VER…
-
### Motivation
In current large-model inference, the KV cache occupies a significant portion of GPU memory, so reducing its size is an important direction for improvement. Recently, severa…
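To make the footprint concrete: the KV cache holds one K and one V tensor per layer, so its size is 2 × layers × kv_heads × head_dim × seq_len × batch × bytes_per_element. A quick sketch with illustrative 7B-class numbers (assumed shapes, not taken from any specific model card):

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, batch: int, dtype_bytes: int = 2) -> int:
    """KV-cache memory: 2x for the K and V tensors; fp16 -> 2 bytes/elem."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * dtype_bytes

# Illustrative 7B-class shape: 32 layers, 32 KV heads, head_dim 128.
gib = kv_cache_bytes(32, 32, 128, seq_len=4096, batch=8) / 2**30  # 16.0 GiB
```

At batch 8 and 4K context this already rivals the fp16 weights of a 7B model, which is why KV-cache quantization and grouped/multi-query attention (fewer `kv_heads`) are attractive.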