-
**Is your feature request related to a problem? Please describe.**
App SDK currently supports inference within the application process itself. This is simple and efficient for some use cases, though …
-
@wangg12 @shanice-l @Rainbowend @tzsombor95 I need your help.
The inference script runs successfully, without any errors, when executed as a standalone Python script. But when running it with ros2, i.e., …
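Since the excerpt is truncated here, it is not clear what changes when the script is launched through ros2. As a debugging aid, a minimal rclpy wrapper like the sketch below can help isolate whether the failure comes from the ROS 2 environment or from the inference code itself. The node name, topic names, message type, and the inference call are placeholders, not taken from the original report.

```python
import rclpy
from rclpy.node import Node
from std_msgs.msg import String  # placeholder message type


class InferenceNode(Node):
    def __init__(self):
        super().__init__("inference_node")
        # Hypothetical: load the model once at startup, not per callback.
        # self.model = load_model(...)
        self.sub = self.create_subscription(String, "input_topic", self.on_msg, 10)
        self.pub = self.create_publisher(String, "output_topic", 10)

    def on_msg(self, msg):
        # Hypothetical inference call; replace with real preprocessing + model.
        result = msg.data.upper()
        out = String()
        out.data = result
        self.pub.publish(out)


def main():
    rclpy.init()
    node = InferenceNode()
    rclpy.spin(node)
    node.destroy_node()
    rclpy.shutdown()


if __name__ == "__main__":
    main()
```

If this stripped-down node runs cleanly under ros2 while the real script does not, the difference is likely in the environment or launch configuration rather than in ROS 2 itself.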
-
Hello.
I am writing to inquire about the PyTorch version used in the Triton Inference Server 24.01 release.
Upon reviewing the documentation, I noticed that Triton 24.01 includes PyTorch version…
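The authoritative source for the bundled framework versions is NVIDIA's support matrix and release notes for the 24.01 container. As a quick sanity check, if the Python torch wheel happens to be installed in the image you are running, you can also print the version directly; this is only a sketch and is not guaranteed to work in every Triton server image, since the PyTorch backend ships as libtorch rather than the Python package.

```python
# Quick check inside the container; the torch Python package may not be
# present in every Triton server image, hence the guard.
try:
    import torch
    print("torch version:", torch.__version__)
except ImportError:
    print("torch Python package not installed in this image; "
          "check the NGC release notes / support matrix instead.")
```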
-
**Problem Description**
Many hospitals we work with have multiple servers (e.g., one with GPU for training and another without for inference). Right now, it's not possible to add multiple nodes from…
-
### System Info
TGI Docker Image: ghcr.io/huggingface/text-generation-inference:sha-11d7af7-rocm
MODEL: meta-llama/Llama-3.1-405B-Instruct-FP8
Hardware used:
Intel® Xeon® Platinum 8…
-
**Description**
I noticed that a model configured with several instances is slower than the same model with a single instance. I would not expect this, but the throughput and latency measurements say otherwise.
**Triton …
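A small client-side benchmark can make this comparison concrete. The sketch below times requests against a running Triton server at different client concurrencies using tritonclient; the model name, input name, shape, and datatype are placeholders and need to match your config.pbtxt. (perf_analyzer from the Triton SDK is the more rigorous tool for this, but the idea is the same: a single in-flight request cannot keep multiple model instances busy.)

```python
import time
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import tritonclient.http as httpclient

# Placeholders: adjust to match your model's config.pbtxt.
MODEL = "my_model"
INPUT_NAME = "INPUT__0"
SHAPE = [1, 3, 224, 224]
DTYPE = "FP32"

data = np.random.rand(*SHAPE).astype(np.float32)


def one_request():
    # One client per request keeps the sketch thread-safe, at the cost of
    # some connection overhead.
    client = httpclient.InferenceServerClient(url="localhost:8000")
    inp = httpclient.InferInput(INPUT_NAME, SHAPE, DTYPE)
    inp.set_data_from_numpy(data)
    start = time.perf_counter()
    client.infer(MODEL, inputs=[inp])
    latency = time.perf_counter() - start
    client.close()
    return latency


def benchmark(concurrency, total=200):
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda _: one_request(), range(total)))
    elapsed = time.perf_counter() - start
    print(f"concurrency={concurrency}: "
          f"throughput={total / elapsed:.1f} infer/s, "
          f"mean latency={1000 * sum(latencies) / total:.1f} ms")


# Compare a single in-flight request against enough parallelism to keep
# all configured model instances busy.
for c in (1, 2, 4, 8):
    benchmark(c)
```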
-
A test failed on a tracked branch
```
Error: Expected status code 200, got 500 with body '{"statusCode":500,"error":"Internal Server Error","message":"[status_exception\n\tCaused by:\n\t\tillegal_arg…
```
-
Tracking the second round of issues submitted to [triton-inference-server](https://github.com/triton-inference-server/server):
- [ ] https://github.com/triton-inference-server/server/issues/2018: Con…
-
In order to profile and optimize the current inference server architecture and best tune its hyper-parameters for various applications, it would be very useful for AlphaZero.jl to have a mode where it…
-
### Anything you want to discuss about vllm.
In my tests with vLLM, concurrent requests to the API server are faster than offline inference. I would like to ask if there are any pe…
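One way to make this comparison reproducible is to time both paths on the same prompts. The sketch below is an assumption about the setup being described: it times offline generation with the LLM class against concurrent requests to an already-running OpenAI-compatible vLLM server on localhost:8000. The model name, prompt set, sampling settings, and worker count are placeholders. Run the two halves separately (e.g., comment one out), otherwise the offline engine and the server will compete for GPU memory and skew the numbers.

```python
import time
from concurrent.futures import ThreadPoolExecutor

import requests
from vllm import LLM, SamplingParams

MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder model
PROMPTS = ["Explain continuous batching."] * 64
MAX_TOKENS = 128

# --- Offline inference: one engine in this process ---
llm = LLM(model=MODEL)
params = SamplingParams(max_tokens=MAX_TOKENS)
start = time.perf_counter()
llm.generate(PROMPTS, params)
print(f"offline: {time.perf_counter() - start:.1f}s for {len(PROMPTS)} prompts")

# --- Concurrent requests to a running OpenAI-compatible server ---
# Started separately, e.g.:
#   python -m vllm.entrypoints.openai.api_server --model <MODEL>
def one_request(prompt):
    r = requests.post(
        "http://localhost:8000/v1/completions",
        json={"model": MODEL, "prompt": prompt, "max_tokens": MAX_TOKENS},
        timeout=600,
    )
    r.raise_for_status()

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=32) as pool:
    list(pool.map(one_request, PROMPTS))
print(f"server:  {time.perf_counter() - start:.1f}s for {len(PROMPTS)} prompts")
```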