efficient-llm Search Results

1000+ results for efficient-llm

Best match


mlc-ai/mlc-llm #2319

[Feature Request] Medusa support

## 🚀 Feature Please add Medusa decoding to mlc-llm in C++; we urgently need it to speed up LLM decoding on mobile devices. See: https://github.com/FasterDecoding/Medusa/tree/main Medusa adds …

EmilioZhao updated 2 months ago
8
huggingface/text-generation-inference #1977

[Feature]: Additional metrics to enable better autoscaling /…

### Feature request TGI provides some valuable metrics on model performance and load today. However, there are still a number of missing metrics, the absence of which poses a challenge for orchestr…

EandrewJones updated 2 weeks ago
12
open-webui/open-webui #2825

feat: tools (open webui native python function calling)

#798 #2175

tjbck updated 2 months ago
1
langchain-ai/langgraph #1084

ToolMessage `artifact` failed to save in memory

### Checked other resources - [X] I added a very descriptive title to this issue. - [X] I searched the [LangGraph](https://langchain-ai.github.io/langgraph/)/LangChain documentation with the integrat…

HaoXuAI updated 1 month ago
6
jmikedupont2/ai-ticket #9

First Version of the request_assistance calls being logged t…

``` { "messages": [ { "role": "system", "content": "You are painter, funny.\n\nYour decisions must always be made independently without seeking user assistance. Play to your str…

jmikedupont2 updated 10 months ago
446
foxcpu/Programming-Language-Trends #2

javascript weekly news

javascript weekly news

foxcpu updated 1 month ago
23
Borketh/hardqoi #4

Optimized decoder for WebAssembly

Feel free to simply close out this issue if you are not interested but we just implemented QOI image format for VNC to deliver lossless remote desktops using Rust WASM clientside here: https://githu…

thelamer updated 1 year ago
8
google/jax #15962

A100 8 GPUs extremely slow compared to a single A100

### Description TL;DR When I run a t5x script using an A100-8 GPU machine it is much slower compared to running the same script on a single A100 machine. There are many available configurations…

KeremTurgutlu updated 9 months ago
4
NVIDIA/TensorRT-LLM #1874

Question regarding the weird operation of GPT-J 6B's XQA on …

Hello TensorRT-LLM experts! I have a question regarding the weird operation of the XQA kernel function supported in NVIDIA's official MLPerf 4.0 version of TensorRT-LLM. First of all, I want to te…

bongwonjang updated 1 week ago
5
irthomasthomas/undecidability #919

microsoft/Everything-of-Thoughts-XoT

- [ ] [Everything-of-Thoughts-XoT/README.md at main · microsoft/Everything-of-Thoughts-XoT](https://github.com/microsoft/Everything-of-Thoughts-XoT/blob/main/README.md?plain=1) # Everything of Thou…

ShellLM updated 1 week ago
1

