llm-compression Search Results

614 results for llm-compression


terratensor/regular-library #3

Search by [1] title, [2] author, [3] text of the selected book…

> It occurred to me that we could apply the same regular expressions, and even the callkewords and qcallsuggest methods from Manticore, and, for example, launch a new parsing process that will, every …

audetv updated 1 month ago
23
vllm-project/llm-compressor #107

SmoothQuant doesn't work with cpu offloading

**Describe the bug** When using a SmoothQuantModifier and cpu offloading there is a conflict of tensors not being on the right device. **Expected behavior** cpu offloading should work w/ SmoothQu…

anmarques updated 3 days ago
2
hashicorp/terraform-provider-aws #34795

[Enhancement]: aws_sagemaker_model resource missing ModelAcc…

### Terraform Core Version 1.5.7 ### AWS Provider Version 5.29.0 ### Affected Resource(s) resource: aws_sagemaker_model ### Expected Behavior I'm attempting to create an AWS Sagem…

akirfman updated 1 week ago
9
huggingface/alignment-handbook #16

Memory Issue with 7b Model Fine-Tuning on 6 H100 GPUs

Hello everyone, I'm encountering a memory issue while fine-tuning a 7b model (such as Mistral) using a repository I found. Despite having 6 H100 GPUs at my disposal, I run into out-of-memory errors wh…

apt-team-018 updated 9 months ago
4
microsoft/MInference #7

[Question]: Hope to supplement the situation of increasing H…

### Describe the issue In fact, there are currently numerous works that expand the context, but as the context expands, the KV cache increases, leading to a sharp rise in HBM usage. Therefore, whet…

Arcmoon-Hu updated 3 months ago
2
microsoft/TaskWeaver #351

Is it possible to use ollama embedding model while using Ope…

**Is your feature request related to a problem? Please describe.** Is it possible to use an Ollama embedding model for plugin selection while using an OpenAI model for agents? See my config file below: { …

SingTeng updated 5 months ago
2
pytorch/pytorch #137779

Flex attention with mask depending on queries and keys lengt…

### 🐛 Describe the bug I tried to implement the `causal_lower_right` masking in flex attention. This requires the masking function to know the difference in lengths of keys and queries: ```python …

janchorowski updated 1 week ago
2
langchain-ai/langchainjs #6926

PineconeStore namespace filter not working PineconeArgumentE…

### Checked other resources - [X] I added a very descriptive title to this issue. - [X] I searched the LangChain.js documentation with the integrated search. - [X] I used the GitHub search to find a …

lynicis updated 2 weeks ago
1
openvinotoolkit/nncf #2755

IndexError: list index out of range When I try to quantize …

### 🐛 Describe the bug Hello, I'm trying to quantize llama models using `OVQuantizer` but I'm facing an error: ``` IndexError: list index out of range ``` I tried llama3 and llama2 ###…

Alwahsh updated 4 months ago
2
run-llama/llama_index #13769

[Bug]: MilvusVectorStore failed to connect to the database w…

### Bug Description The MilvusVectorStore fails to connect when enable_sparse is True. When I set it to false, it can connect. ### Version 0.10.38 ### Steps to Reproduce You just have to do: ```…

osafaimal updated 4 months ago
5
