-
**Is your feature request related to a problem? Please describe.**
Large input volumes have to be processed via a sliding-window algorithm; otherwise OOMs can happen quickly. There are two constraini…
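For context, here is a minimal sketch of the sliding-window processing pattern being requested; the `window_size` and `stride` parameters and the per-window work are illustrative placeholders, not taken from the original request:
```python
from typing import Iterator, Sequence

def sliding_windows(items: Sequence[int], window_size: int, stride: int) -> Iterator[Sequence[int]]:
    """Yield overlapping windows so only one window is resident at a time."""
    if len(items) <= window_size:
        yield items
        return
    # A tail shorter than window_size is omitted here for brevity.
    for start in range(0, len(items) - window_size + 1, stride):
        yield items[start:start + window_size]

# Peak memory stays proportional to window_size rather than the full input,
# since each window is consumed and released before the next is materialized.
total = 0
for window in sliding_windows(range(10_000), window_size=1024, stride=512):
    total += sum(window)  # stand-in for the real per-window work
print(total)
```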
-
This is a vision model based on phi-2. I believe OpenVINO should be able to support it and perform efficient inference on personal computers; it would be very useful.
[TinyGPT-V: Efficient Multi…
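As a rough feasibility sketch: the phi-2 language backbone already runs through optimum-intel's OpenVINO export, so only the vision tower would need additional conversion work. This is an assumption about how support might be staged, not a confirmed plan:
```python
# Sketch under an assumption: TinyGPT-V's language backbone is phi-2, and that
# backbone alone already converts via optimum-intel. Full multimodal support
# would additionally require exporting the vision encoder.
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = OVModelForCausalLM.from_pretrained(model_id, export=True)  # convert on the fly

inputs = tokenizer("OpenVINO says", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=8)[0]))
```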
-
Ovis1.6-Llama3.2-3B-GPTQ-Int4: how can inference be run on the CPU?
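Not an authoritative answer, but a sketch of what a plain-transformers CPU load would look like. The caveat matters here: GPTQ int4 kernels generally target CUDA, so a CPU-only run may fail at load time or require falling back to a non-quantized checkpoint; the model id follows the question, everything else is an assumption:
```python
import torch
from transformers import AutoModelForCausalLM

# Assumption: the Ovis repo exposes a causal-LM entry point via trust_remote_code.
# GPTQ int4 backends are typically CUDA-only, so this may error on CPU-only
# hosts; a non-quantized Ovis checkpoint is the safer CPU path in that case.
model = AutoModelForCausalLM.from_pretrained(
    "AIDC-AI/Ovis1.6-Llama3.2-3B-GPTQ-Int4",  # model id as given in the question
    torch_dtype=torch.float32,                # CPU-friendly dtype
    device_map="cpu",
    trust_remote_code=True,
)
model.eval()
```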
-
### Feature request
Optimize Transformers' image_processors to decrease image processing time and reduce inference latency for vision models and VLMs.
### Motivation
The Transformers library relie…
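One lever that already exists in Transformers is the torch/torchvision-backed "fast" image processor variants, selected with `use_fast=True`; a small sketch (the model id is only an example):
```python
from transformers import AutoImageProcessor

# `use_fast=True` selects the torch/torchvision-backed "Fast" processor class
# where one exists, which typically cuts preprocessing time versus the
# PIL/NumPy implementation.
processor = AutoImageProcessor.from_pretrained(
    "facebook/detr-resnet-50",  # example model id
    use_fast=True,
)
print(type(processor).__name__)  # e.g. DetrImageProcessorFast when available
```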
-
@dusty-nv thanks for NanoLLM for CUDA 12.6 - works well!!
However, when I invoke it with:
```
sudo jetson-containers run $(autotag nano_llm) \
python3 -m nano_llm.agents.video_query --api=…
-
# OPEA Inference Microservices Integration for LangChain
This RFC proposes the integration of OPEA inference microservices (from GenAIComps) into LangChain [extensible to other frameworks], enabli…
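To make the integration surface concrete, here is a minimal sketch of a LangChain wrapper around an OPEA inference microservice. The endpoint URL, path, and payload shape are assumptions for illustration; the actual GenAIComps service contract should come from the RFC itself:
```python
from typing import Any, List, Optional

import requests
from langchain_core.language_models.llms import LLM

class OPEAMicroserviceLLM(LLM):
    """Hypothetical LangChain wrapper for an OPEA inference microservice."""

    endpoint: str = "http://localhost:9000/v1/chat/completions"  # assumed URL

    @property
    def _llm_type(self) -> str:
        return "opea-microservice"

    def _call(self, prompt: str, stop: Optional[List[str]] = None, **kwargs: Any) -> str:
        resp = requests.post(
            self.endpoint,
            json={"messages": [{"role": "user", "content": prompt}]},
            timeout=60,
        )
        resp.raise_for_status()
        # Assumed OpenAI-style response shape.
        return resp.json()["choices"][0]["message"]["content"]

llm = OPEAMicroserviceLLM()
# print(llm.invoke("Hello"))  # requires a running OPEA service
```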
-
Notice: In order to resolve issues more efficiently, please raise the issue following the template and fill in the details.
## 🐛 Bug
When I run demo.py, the error is:
```
Tracebac…
-
Dear DeepLabCut Team,
Thank you for your tremendous work on this invaluable open-source tool. I truly appreciate your efforts and dedication.
I have a suggestion regarding the use of pre-trained…
-
### 🚀 The feature, motivation and pitch
Gemma-2 and the new Ministral models use alternating sliding-window and full-attention layers to reduce the size of the KV cache.
The KV cache is a huge inferen…
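A back-of-the-envelope illustration of why the alternating pattern helps, using assumed shapes and a 4K window rather than the exact Gemma-2/Ministral configs:
```python
# Rough KV-cache sizing: 2 (K and V) * layers * kv_heads * head_dim * seq_len * bytes.
# All numbers below are assumed for illustration, not real model configs.
def kv_cache_gib(num_layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    return 2 * num_layers * kv_heads * head_dim * seq_len * bytes_per_elem / 2**30

layers, kv_heads, head_dim = 32, 8, 128
full = kv_cache_gib(layers, kv_heads, head_dim, seq_len=128_000)

# With half the layers using a 4K sliding window, those layers cap their
# cache at the window size instead of the full sequence length.
mixed = (kv_cache_gib(layers // 2, kv_heads, head_dim, seq_len=128_000)
         + kv_cache_gib(layers // 2, kv_heads, head_dim, seq_len=4_096))
print(f"full attention everywhere: {full:.1f} GiB; alternating: {mixed:.1f} GiB")
# ~15.6 GiB vs ~8.1 GiB at 128K context with these assumed shapes.
```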
-
Right now, in experiments I have been running, there is a significant bottleneck in retrieving and saving results during parallel batch inference. This hinders throughput considerably, as each worke…
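One common way to stop result I/O from blocking workers is to push results onto a queue drained by a single dedicated writer; a minimal sketch, where the output file name and result shape are placeholders:
```python
import json
import queue
import threading

results: "queue.Queue[dict | None]" = queue.Queue(maxsize=1024)

def writer(path: str) -> None:
    # A single writer drains the queue so inference workers never block on disk.
    with open(path, "w") as f:
        while (item := results.get()) is not None:
            f.write(json.dumps(item) + "\n")

t = threading.Thread(target=writer, args=("results.jsonl",), daemon=True)
t.start()

# Workers (threads, processes, or async tasks) just enqueue and move on:
for i in range(10):
    results.put({"batch": i, "output": f"placeholder-{i}"})

results.put(None)  # sentinel: no more results
t.join()
```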