-
In general, we don't have a clear picture of how well we support TensorRT-LLM. It would be great to assess the current state of that support and fix anything that needs fixing.
-
### System Info
- CPU: X86
- GPU: NVIDIA L20
- python:
  - tensorrt 10.3.0
  - tensorrt-cu12 10.3.0
  - tensorrt-cu12-bindings 10.3.0
  - tensorrt-cu12-libs 10…
-
Outlines currently supports the vLLM inference engine; it would be great if it could also support the TensorRT-LLM inference engine.
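For what it's worth, the core of such an integration is largely engine-agnostic: guided generation reduces to masking, at each decoding step, the logits of tokens that the guide's automaton disallows. Below is a minimal sketch of that idea; `GuideLogitsProcessor` and the `initial_state`/`allowed_tokens`/`next_state` guide interface are hypothetical names for illustration, not Outlines' or TensorRT-LLM's actual API.

```python
import numpy as np


class GuideLogitsProcessor:
    """Hypothetical sketch: mask logits so that only tokens allowed by a
    finite-state guide (as Outlines uses for regex/JSON-constrained
    generation) can be sampled at each step."""

    def __init__(self, guide):
        # `guide` is assumed to expose: initial_state,
        # allowed_tokens(state) -> list[int], next_state(state, token) -> state
        self.guide = guide
        self.state = guide.initial_state

    def __call__(self, logits: np.ndarray) -> np.ndarray:
        # Set disallowed tokens to -inf so their sampling probability is ~0.
        mask = np.full_like(logits, -np.inf)
        mask[self.guide.allowed_tokens(self.state)] = 0.0
        return logits + mask

    def update(self, sampled_token_id: int) -> None:
        # Advance the guide with the token the engine actually sampled.
        self.state = self.guide.next_state(self.state, sampled_token_id)
```

Supporting TensorRT-LLM would then mostly be a matter of hooking a processor like this into its per-step logits callback, the same way the vLLM integration does.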
-
We want to deploy https://huggingface.co/unsloth/Llama-3.2-1B-Instruct-bnb-4bit, which is a 4-bit quantized version of the Llama-3.2-1B model. It is quantized using bitsandbytes. Can we deploy this using ten…
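As far as I know, TensorRT-LLM cannot consume bitsandbytes checkpoints directly. One plausible path (a sketch, assuming a recent `transformers` release that provides `PreTrainedModel.dequantize()` for bnb models) is to dequantize back to fp16, save a plain Hugging Face checkpoint, and then re-quantize with TensorRT-LLM's own tooling:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unsloth/Llama-3.2-1B-Instruct-bnb-4bit"

# Load the bnb-4bit checkpoint (requires bitsandbytes and a CUDA GPU).
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Dequantize back to a plain fp16 model; assumes a transformers version
# that added PreTrainedModel.dequantize() for bitsandbytes models.
model = model.dequantize()

model.save_pretrained("llama-3.2-1b-instruct-fp16")
tokenizer.save_pretrained("llama-3.2-1b-instruct-fp16")
# The saved directory can then be converted and quantized with
# TensorRT-LLM's own checkpoint conversion scripts
# (e.g. examples/llama/convert_checkpoint.py, then trtllm-build).
```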
-
## Goal
- [ ] Support Llama 3.1 in the main TensorRT-LLM engine formats
- [ ] Upload to HF
## User Requests
-
## Problem Description
When trying to use pipeline parallelism in tensorrt-llm on 2+ NVIDIA GPUs, I encounter `AssertionError: Expected but not provided tensors: {'transformer.vocab_embedding.weig…`
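In case it helps triage: since the traceback is truncated this is only an assumption about the root cause, but this assertion typically means the parallel mapping used at checkpoint-conversion/build time does not match the ranks at launch time; TensorRT-LLM requires `world_size == tp_size * pp_size` consistently across conversion, `trtllm-build`, and the `mpirun` launch. A minimal sketch with `tensorrt_llm.Mapping`:

```python
from tensorrt_llm import Mapping

# The same tp_size/pp_size must be used for checkpoint conversion,
# trtllm-build, and the MPI launch (mpirun -n <world_size> ...).
# Mapping raises if world_size != tp_size * pp_size.
mapping = Mapping(world_size=2, tp_size=1, pp_size=2)

# With pp_size=2, only the first pipeline rank owns the vocab embedding;
# if the engines and ranks disagree, loading fails with
# "Expected but not provided tensors: {'transformer.vocab_embedding.weight', ...}".
print(mapping.pp_rank, mapping.is_first_pp_rank())
```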
-
Hello, `0.15.0.dev2024101500` introduced a new issue when using the Executor API with Whisper:
```
[TensorRT-LLM][ERROR] IExecutionContext::inferShapes: Error Code 7: Internal Error (WhisperEncoder/__add_…
```
-
### System Info
- Built tensorrtllm_backend from source using dockerfile/Dockerfile.trt_llm_backend
- tensorrt_llm 0.13.0.dev2024081300
- tritonserver 2.48.0
- Triton image: 24.07
- CUDA 12.5
### Wh…
-