-
- https://arxiv.org/pdf/2404.16710
- Diagram
![Screenshot 2024-10-30 at 9 29 59 PM](https://github.com/user-attachments/assets/425cf827-0a2d-4ac4-9884-1a454e0e6b04)
-
I'd like to explore the best approach for managing multi-client connections in both single and multi-GPU environments.
Often, GPUs are underutilized by a single client, especially when smaller mode…
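One common way to let multiple clients share a single underutilized GPU is request micro-batching: clients enqueue inputs, and a single worker drains the queue and runs one batched model call. The sketch below is a toy illustration of that idea in pure Python, assuming a hypothetical `model_fn` that accepts a batch; none of the names are an existing API.

```python
import threading
import queue
import time

class MicroBatcher:
    """Toy request batcher: many clients call submit(), one worker
    runs their inputs through the (stand-in) model as a single batch.
    Illustrative only; not an existing serving API."""

    def __init__(self, model_fn, max_batch=8, max_wait_s=0.01):
        self.model_fn = model_fn          # batched callable: list -> list
        self.max_batch = max_batch        # largest batch per model call
        self.max_wait_s = max_wait_s      # how long to wait to fill a batch
        self.requests = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, x):
        """Called by a client thread; blocks until its result is ready."""
        slot = {"input": x, "event": threading.Event()}
        self.requests.put(slot)
        slot["event"].wait()
        return slot["output"]

    def _worker(self):
        while True:
            batch = [self.requests.get()]  # block for the first request
            deadline = time.monotonic() + self.max_wait_s
            while len(batch) < self.max_batch:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    batch.append(self.requests.get(timeout=remaining))
                except queue.Empty:
                    break
            # One batched call instead of one call per client.
            outputs = self.model_fn([s["input"] for s in batch])
            for slot, out in zip(batch, outputs):
                slot["output"] = out
                slot["event"].set()
```

In a multi-GPU setting the same pattern applies per device, with a router assigning clients to per-GPU batchers.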
-
# TensorRT Model Optimizer - Product Roadmap
[TensorRT Model Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer) (ModelOpt)’s north star is to be the best-in-class model optimization toolki…
-
# Pruning Convolutional Neural Networks for Resource Efficient Inference #
- Authors: Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, Jan Kautz
- Origin: https://arxiv.org/abs/1611.06440
-…
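The paper's central idea is to rank feature maps by a first-order Taylor criterion, the absolute value of the mean of (cost gradient × activation) over a map, and prune the lowest-ranked maps first. A minimal sketch of that criterion, using plain Python lists as stand-ins for tensors:

```python
def taylor_criterion(activation, gradient):
    """First-order Taylor pruning criterion from Molchanov et al.:
    |mean over positions of (dC/dz * z)| for one feature map.
    `activation` and `gradient` are flat lists of the map's values
    and the cost gradient w.r.t. those values (toy tensor stand-ins)."""
    m = len(activation)
    return abs(sum(a * g for a, g in zip(activation, gradient)) / m)

def rank_feature_maps(acts, grads):
    """Return feature-map indices sorted by increasing saliency;
    the lowest-ranked maps are candidates for pruning."""
    scores = [taylor_criterion(a, g) for a, g in zip(acts, grads)]
    return sorted(range(len(scores)), key=lambda i: scores[i])
```

In the paper this score is computed from the activations and gradients already available during backprop, so ranking adds almost no cost to training.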
-
Error occurred when executing Yoloworld_ESAM_Zho:
The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: invalid vector sub…
-
## vLLM Virtual Open Office Hours
We enjoyed seeing everyone at the previous office hours and got great feedback. These office hours are a ~bi-weekly live event where you come to learn more about t…
mgoin updated 2 weeks ago
-
Add Stan PPL integration to use Stan models with Blackjax inference algorithms
With the [BridgeStan](https://roualdes.github.io/bridgestan/latest/) library, we can efficiently access log density an…
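The integration idea is that the sampler only needs a log-density callable (and its gradient), which is exactly the surface BridgeStan exposes for a compiled Stan model. The sketch below illustrates that separation with a hand-written Gaussian log density standing in for BridgeStan and a tiny random-walk Metropolis step standing in for a Blackjax kernel; none of this is the actual BridgeStan or Blackjax API.

```python
import math
import random

def gaussian_logdensity(theta):
    """Stand-in for a model's log_density(): an unnormalized
    standard-normal log density. A real integration would call
    into the compiled Stan model here instead."""
    return -0.5 * theta * theta

def random_walk_metropolis(logdensity, init, steps, scale=1.0, seed=0):
    """Toy Metropolis sampler standing in for an inference kernel.
    It touches the model only through the logdensity callable."""
    rng = random.Random(seed)
    theta, lp = init, logdensity(init)
    samples = []
    for _ in range(steps):
        prop = theta + rng.gauss(0.0, scale)
        lp_prop = logdensity(prop)
        if math.log(rng.random()) < lp_prop - lp:  # accept/reject
            theta, lp = prop, lp_prop
        samples.append(theta)
    return samples
```

Swapping the toy density for a BridgeStan-backed one (and the toy kernel for a gradient-based Blackjax algorithm) keeps exactly this shape, which is what makes the proposed integration lightweight.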
-
Hi team,
I'm running inference on a g5.24xlarge GPU instance. The data is currently structured in a Pandas dataframe, and I use the Pandas `apply` method to apply the predict_entities function. When the df g…
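Row-wise `df.apply` invokes the model once per row, which tends to leave the GPU idle between calls; a usual fix is to pull the column out, run the model once per chunk, and write the results back. A minimal sketch, where `predict_entities_batch` is a hypothetical batched counterpart of the per-row function (here faked with `str.upper`):

```python
def predict_entities_batch(texts):
    """Hypothetical batched stand-in for the per-row predict_entities():
    one call handles a whole list, which is where GPU inference wins."""
    return [text.upper() for text in texts]  # placeholder "model"

def batched_predict(texts, batch_size=32):
    """Run the model once per chunk instead of once per row
    (as df.apply would), then flatten results back in input order."""
    out = []
    for i in range(0, len(texts), batch_size):
        out.extend(predict_entities_batch(texts[i:i + batch_size]))
    return out
```

Usage against the dataframe would look like `df["entities"] = batched_predict(df["text"].tolist())`, with `batch_size` tuned to what fits in GPU memory.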
-
Config:
Windows 10 with RTX4090
All requirements incl. flash-attn build - done!
Server:
```
(venv) D:\PythonProjects\hertz-dev>python inference_server.py
Using device: cuda
Loaded tokeniz…
```
-
## Progress
- [ ] Integrate CPU executor to support the basic model inference (BF16/FP32) without TP.
- #3634
- #3824
- #4113
- #4971
- #5452
- #5446
- [ ] Support FP16 mo…