-
### Your current environment
My vLLM version is:
pip show vllm
Name: vllm
Version: 0.3.3+git3380931.abi0.dtk2404.torch2.1
Summary: A high-throughput and memory-efficient inference and serving eng…
-
**Is your feature request related to a problem? Please describe.**
I'm running some benchmarks with the Python SDK and profiling the code to understand more about its execution. Here's the profile report…
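For anyone wanting to reproduce this kind of report, a minimal sketch of how such a profile can be collected with the standard library's `cProfile`/`pstats` (the `run_benchmark` function here is a hypothetical stand-in for the actual SDK call being measured):

```python
import cProfile
import io
import pstats

def run_benchmark():
    # Hypothetical stand-in for the SDK call being profiled.
    return sum(i * i for i in range(10_000))

profiler = cProfile.Profile()
profiler.enable()
run_benchmark()
profiler.disable()

# Render the top entries sorted by cumulative time, as in a typical report.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(10)
print(buf.getvalue())
```

Sorting by `cumulative` surfaces the call paths where most wall time is spent, which is usually the first thing to look at in a serving benchmark.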
-
The `README.md` says "more efficient batch inference resulting in large-v2 with *60-70x REAL TIME speed (not provided in this repo)".
Will this eventually be integrated into this repo, too? That w…
-
![image](https://github.com/user-attachments/assets/c1a77b7e-049f-4d87-9ac9-ff71098462d1)
Thank you for your insightful work on γ-MoD. I have a question regarding Figure 4 in your paper.
Could y…
-
Hi! Hello! Peace be upon you!
Let's bring the documentation to all the Arabic-speaking community 🌏 (currently 0 out of 267 complete)
Would you like to translate? Please follow the 🤗 [TRANSLATING guid…
-
## Background
The `bandrobot` test, one of the demos in ONA, aims to test the multistep event inferencing/subgoaling of the ONA reasoner (via NAL-7 & NAL-8 temporal/procedural inference).
…
-
Add Stan PPL integration to use Stan models with Blackjax inference algorithms
With the [BridgeStan](https://roualdes.github.io/bridgestan/latest/) library, we can efficiently access log density an…
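To make the interface concrete, here is a minimal sketch of the contract this integration relies on: a `log_density_gradient(theta)` pair, which BridgeStan exposes for a compiled Stan model, driving a gradient-based MCMC kernel. The function below is a toy standard-normal stand-in (not the real BridgeStan API, which requires compiling a Stan program), and the one-step MALA kernel is illustrative of what a Blackjax-style sampler builds on top of it:

```python
import math
import random

def log_density_gradient(theta):
    # Toy stand-in for BridgeStan's log-density/gradient pair:
    # a standard-normal log density, log p(theta) = -theta^2 / 2 (+ const).
    return -0.5 * theta * theta, -theta

def mala_step(theta, step, rng):
    # One Metropolis-adjusted Langevin step using only the (lp, grad) interface.
    lp, grad = log_density_gradient(theta)
    prop = theta + 0.5 * step * grad + math.sqrt(step) * rng.gauss(0.0, 1.0)
    lp_p, grad_p = log_density_gradient(prop)
    # Forward/reverse proposal log-densities for the Metropolis correction.
    fwd = -((prop - theta - 0.5 * step * grad) ** 2) / (2 * step)
    rev = -((theta - prop - 0.5 * step * grad_p) ** 2) / (2 * step)
    if math.log(rng.random()) < lp_p - lp + rev - fwd:
        return prop
    return theta

rng = random.Random(0)
theta, samples = 0.0, []
for _ in range(5000):
    theta = mala_step(theta, 0.5, rng)
    samples.append(theta)

mean = sum(samples) / len(samples)  # should be close to 0 for N(0, 1)
```

Because the kernel only ever touches `log_density_gradient`, swapping the toy function for a BridgeStan model handle is exactly the kind of drop-in this feature request is about.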
-
- Description:
- The autoregressive decoding mode of LLMs means that tokens can only be decoded serially, which limits inference speed. Speculative decoding techniques can be used to decode L…
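The idea above can be sketched with toy deterministic models: a cheap "draft" model proposes several tokens ahead, and the expensive "target" model verifies them, accepting the longest agreeing prefix plus one corrected token. Both models here are hypothetical stand-ins, not any real LLM API:

```python
def draft_model(context, k):
    # Hypothetical fast drafter: right for the first 2 guesses, then wrong.
    last = context[-1]
    return [(last + i + 1) % 10 if i < 2 else 0 for i in range(k)]

def target_model(context):
    # Hypothetical target model: the "correct" next token is last + 1 (mod 10).
    return (context[-1] + 1) % 10

def speculative_decode(context, num_tokens, k=4):
    out = list(context)
    while len(out) < len(context) + num_tokens:
        proposal = draft_model(out, k)
        accepted = []
        for tok in proposal:
            # Verify each drafted token against the target model.
            if target_model(out + accepted) == tok:
                accepted.append(tok)
            else:
                break
        if len(accepted) < len(proposal):
            # On rejection, emit the target's own token so progress is made.
            accepted.append(target_model(out + accepted))
        out.extend(accepted)
    return out[: len(context) + num_tokens]

print(speculative_decode([0], 6))  # → [0, 1, 2, 3, 4, 5, 6]
```

The output is identical to decoding serially with the target model alone; the speedup comes from verifying a whole drafted block per target-model pass instead of one token at a time.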
-
### Description
The Kibana team has requested that we add pagination and sorting options to the `GET _inference/_all` API so these operations can be handled efficiently in the backend. Currently, they have ad…
-
### Has this been supported or requested before?
- [X] I have checked [the GitHub README](https://github.com/QwenLM/Qwen2.5).
- [X] I have checked [the Qwen documentation](https://qwen.readthedocs…