-
### Feature request
This request aims to introduce functionality to delete specific adapter layers integrated with PEFT (Parameter-Efficient Fine-Tuning) within the Hugging Face Transformers librar…
-
![image](https://github.com/user-attachments/assets/107e5738-8b12-42a1-8229-d33c1f35dc3d)
There is no official quantized model. Will the team officially support a quantized version? As you can s…
-
Unable to run performance analyzer on my model
I am using a SageMaker wrapper image of Triton server and am able to serve the model with requests and even validate that it is up, all ports for gRPC, …
-
### Description
We are updating the model matrix testing and need the main page updated here to reflect performance tests: https://www.elastic.co/guide/en/security/current/llm-performance-matrix.html…
-
### What is the issue?
I've been using llama.cpp recently to run large models, some of which exceed my GPU's VRAM capacity. With llama.cpp, when I run models that are too large to fully fit in VRAM, …
-
### Proposal to improve performance
_No response_
### Report of performance regression
For the same query, the stats are:
**Llama2-7b-hf model:** 47.52s/it, est. speed input: 4.76 toks/s, output…
-
Performance Management Processes
- Performance Management Assessments
- Performance Improvement Plan (PIP)
## 5 Most Important Performance Management Models
1. Traditional
The annual performanc…
-
Hi, thanks for your great work. I found that the results on DocVQA and TextVQA are much better than those of other open-source models. Can you share any insights or instructions for improving the model's OCR performance?
-
Hi,
Thank you for the kind introduction to your code!
By the way, I've reproduced 3D Diffuser Actor on RLBench and obtained performance slightly different from the results in the paper.
Well, it…
-
Hello,
We noticed that, for bigger menus, the GraphQL response is too slow.
After some investigation we found that this method https://github.com/SnowdogApps/magento2-menu/blob/develop/Model/GraphQl…