-
1. I used this command for inference but encountered an issue. Does anyone know how to fix this?
- command: `python launch.py --n_GPUs 1 main.py --batch_size 8 --precision single`
- error:
`[W socke…
-
### Motivation
Hello. I see in the documentation that input logprobs are supported in offline inference mode. Are they also supported when deploying via the API server? If not, is there a plan to add this in the near term?
### Related resources
#2041
### Additional context
_No response_
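For context, "input logprobs" here are the log probabilities the model assigns to each token of the prompt. A minimal numpy sketch of how they are derived from per-position logits (a hypothetical helper for illustration, not this project's API):

```python
import numpy as np

def prompt_token_logprobs(logits: np.ndarray, token_ids: list[int]) -> np.ndarray:
    """Log probability of each prompt token under the model's output distribution.

    logits:    array of shape [seq_len, vocab_size], one row per prompt position.
    token_ids: the actual prompt token id at each position.
    """
    # Numerically stable log-softmax over the vocabulary dimension.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Pick out the logprob of the token that actually appeared at each position.
    return log_probs[np.arange(len(token_ids)), token_ids]
```

Offline inference engines that expose input logprobs return exactly these per-token values; the question above is whether the API server deployment surfaces the same field.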
-
Notes from following the Scoring Server on AWS guide to set up an AWS AMI REST server from an H2O-3 MOJO.
The documentation version of [http://docs.h2o.ai/h2o/latest-stable/h2o-docs/productionizing.html?highl…
-
### System Info
tgi-gaudi docker container built from master branch (4fe871ffaaa62f1a203607078e868fcca962b017)
Ubuntu 22.04.3 LTS
Gaudi2
HL-SMI Version: hl-1.15.0-fw-48.2.1.1
Driver Version: 1…
-
Terraform apply fails.
When running `terraform apply`, it fails while deploying Kubernetes.
Used Branch: release-1.1
Logs:
````
module.inference-server.kubernetes_deployment.inference_deploymen…
-
Hello, I first ran `chatchat init`, then ran `chatchat kb -r` and got the following error:
2024-07-25 18:06:36.050 | INFO | chatchat.server.knowledge_base.kb_cache.faiss_cache:load_vector_store:109 - loading vector store in 'samples/vec…
-
**Description**
I'm using a simple client inference class based on the client example. My raw TensorRT inference with batch size 10 takes 150 ms, but my Triton server with the TensorRT backend took 1100 ms. This is my client:…
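When raw TensorRT and Triton latencies diverge this much, a useful first step is to time the client call itself with warm-up excluded, so one-time setup and request overhead can be separated from compute. A minimal sketch (a hypothetical helper; the callable passed in stands for whatever inference call is being benchmarked):

```python
import time

def mean_latency_ms(fn, n_iters=20):
    """Average wall-clock latency of fn() in milliseconds, after one warm-up call."""
    fn()  # warm-up: excludes one-time costs (connection setup, CUDA context, etc.)
    start = time.perf_counter()
    for _ in range(n_iters):
        fn()
    return (time.perf_counter() - start) / n_iters * 1000.0
```

Comparing this number for the raw TensorRT call and the Triton client call on identical inputs shows whether the gap lives in the server or in request handling (for example, a batch of 10 accidentally sent as ten separate requests).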
-
Hi team, quick question: does `lightseq` support the following?
- Convert HuggingFace BERT/RoBERTa models to `int8` precision directly
- If yes, can the converted model be exported to ONNX format directly?
- …
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussion…
-
Hello, I'm curious whether sglang can already be used as a backend for NVIDIA's Triton Server.
Amazing work with the library btw, love it!