-
### System Info
- tensorrtllm_backend built using Dockerfile.trt_llm_backend
- main branch TensorRT-LLM (0.13.0.dev20240813000)
- 8xH100 SXM
- Driver Version: 535.129.03
- CUDA Version: 12.5
…
-
### Is your enhancement related to a problem? Please describe
See
### Describe the solution you'd like
A mockup for the redesigned UI
### Describe alternatives you've considered
_No response_
#…
-
> **Please do not disclose security vulnerabilities as issues. See our [security policy](../../SECURITY.md) for responsible disclosures.**
### I have trained a yolov5m model and successfully deployed …
-
### Is your enhancement related to a problem? Please describe
While the inference server page lists this information, it is not easy to decipher, and we would like to introduce more sections …
-
Hi,
Can we use this with a Triton Inference Server model?
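For reference, this is roughly what calling a model already served by Triton looks like over its HTTP (KServe v2) API; the model name `my_model`, input name `INPUT0`, shape, and data below are placeholders, not details from the original question:

```shell
# Minimal sketch: query a Triton-served model via the KServe v2 HTTP API.
# Model name, input name, shape, and datatype are hypothetical.
curl -s -X POST localhost:8000/v2/models/my_model/infer \
  -H 'Content-Type: application/json' \
  -d '{
        "inputs": [
          {"name": "INPUT0", "shape": [1, 4], "datatype": "FP32",
           "data": [0.1, 0.2, 0.3, 0.4]}
        ]
      }'
```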
-
Description of problem:
I ran some experiments to measure timing performance, comparing standalone inference with a TensorRT model against Triton serving the same TensorRT model, using identical input on a …
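As a point of reference, one common way to run such a comparison with the stock tooling is `trtexec` for the standalone engine and `perf_analyzer` for the Triton-served path; the engine path and model name below are assumptions:

```shell
# Standalone TensorRT timing (engine path is hypothetical)
trtexec --loadEngine=/models/model.plan --warmUp=500 --iterations=1000

# Triton-served timing against the same engine
# (model name and gRPC endpoint are hypothetical)
perf_analyzer -m my_trt_model -u localhost:8001 -i grpc --concurrency-range 1
```

Note that the Triton path adds network, serialization, and scheduling overhead on top of raw engine execution, so some gap between the two numbers is expected.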
-
After launching the distribution server with `llama distribution start --name local-llama-8b --port 5000 --disable-ipv6`, running any inference example, for example `python examples/scripts/vacatio…`
-
Hi,
Where can I find documentation on how to build the Triton Inference Server TRT-LLM 24.06 image for SageMaker myself, so I can run it on SageMaker?
NVIDIA image I want to use: nvcr.io/nvidia/tritonserver:2…
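In the meantime, a minimal sketch of wrapping the NGC image for SageMaker, relying on Triton's built-in SageMaker mode (`--allow-sagemaker=true`, which serves `/ping` and `/invocations` on port 8080); the exact image tag and model-repository path here are assumptions, not a tested recipe:

```dockerfile
# Sketch only: image tag and paths are assumptions.
FROM nvcr.io/nvidia/tritonserver:24.06-trtllm-python-py3

# SageMaker starts the container as "docker run <image> serve", so provide a
# serve entrypoint that launches Triton in SageMaker mode. SageMaker mounts
# the model artifacts at /opt/ml/model.
RUN printf '#!/bin/bash\nexec tritonserver --allow-sagemaker=true --model-repository=/opt/ml/model\n' \
      > /usr/bin/serve && chmod +x /usr/bin/serve
```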
-
### Describe the problem you're trying to solve
Proof of Concept (PoC): a generic inference container that uses Triton as the inference engine and can download and utilize a ModelKit as efficiently as …
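A minimal sketch of the shape this could take, assuming the KitOps `kit` CLI and a ModelKit that packages a Triton-layout model repository; the registry reference and paths are placeholders:

```shell
# Sketch: pull model artifacts out of a ModelKit, then serve them with Triton.
# The ModelKit reference and directory layout are hypothetical.
kit unpack registry.example.com/acme/my-model:latest --dir /models
tritonserver --model-repository=/models
```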
-
### Description
We are using the images built in this repository as Inference Server images in the [AI Lab](https://github.com/containers/podman-desktop-extension-ai-lab) repository.
https://github.…