-
When Whisper is run with the `language` option set and a different language is spoken, it returns the result translated into the language set in the option.
How can I make Whisper not translate, and instead return a blank result or ignore the request?
Thanks
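One possible workaround (a sketch, not a built-in Whisper option): run language identification first, then gate the output yourself so audio in an unexpected language yields an empty string instead of a translation. The helper below is hypothetical; the `detected_lang`/`probability` inputs are assumed to come from a language-ID step such as Whisper's own language detection run before transcription.

```python
def gate_transcript(transcript: str, detected_lang: str, probability: float,
                    expected_lang: str, min_prob: float = 0.5) -> str:
    """Return the transcript only when the detected language matches the
    expected one with enough confidence; otherwise return an empty string.

    `detected_lang` and `probability` are assumed to come from a
    language-ID step run before transcription (e.g. Whisper's language
    detection); this function itself is a hypothetical gate, not part of
    the Whisper API.
    """
    if detected_lang == expected_lang and probability >= min_prob:
        return transcript
    # Ignore audio in other languages instead of translating it.
    return ""
```

For example, `gate_transcript("hola mundo", "es", 0.95, "en")` returns `""`, while the same call with `detected_lang="en"` passes the transcript through.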
-
```
Traceback (most recent call last):
  File "inference.py", line 97, in <module>
  File "utils.py", line 45, in load_checkpoint
  File "torch\nn\modules\module.py", line 1672, in load_state_dict
    self.__…
```
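A failure inside `load_state_dict` at this point in the stack is usually a key or shape mismatch between the checkpoint and the model. A common workaround is to filter the checkpoint down to keys the model actually has before loading, then call `load_state_dict(..., strict=False)`. A minimal sketch, using plain dicts to stand in for the real dicts from `torch.load(...)` and `model.state_dict()`:

```python
def filter_checkpoint(checkpoint: dict, model_state: dict) -> dict:
    """Keep only checkpoint entries whose key also exists in the model's
    state dict, so `load_state_dict(filtered, strict=False)` can succeed.

    With real tensors you would additionally compare shapes
    (`v.shape == model_state[k].shape`) before keeping an entry; plain
    values stand in for tensors in this sketch.
    """
    kept = {k: v for k, v in checkpoint.items() if k in model_state}
    dropped = sorted(set(checkpoint) - set(kept))
    if dropped:
        print(f"dropping unmatched checkpoint keys: {dropped}")
    return kept
```

Whether dropping mismatched keys is acceptable depends on the model: silently skipping a head or embedding layer can load "successfully" but predict garbage, so the printed list of dropped keys is worth reading.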
-
Problem: We're trying to evaluate the C API vs. Python for online inference, and the C API does not have an analogous [thread count](https://github.com/catboost/catboost/blob/master/catboost/python-package/cat…
-
### System Info
I have searched this repo and the main server repo but don't see any information on either a) support for Safetensors (many models on HF are saved that way) or b) whether th…
-
When running the Gradio cookbook, I run into this error while trying to execute the very last prompt in the cookbook.
Error message shown in the editor:
`Error Exception: ffmpeg was not found b…
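The message points at a missing `ffmpeg` binary on the machine rather than a bug in the cookbook code. A small stdlib check (a hypothetical helper, not part of Gradio) can confirm whether the executable is actually on `PATH` before running the prompt:

```python
import shutil

def require_binary(name: str) -> str:
    """Return the full path of an executable found on PATH, or raise a
    clear error telling the user to install it first (e.g. on Debian/Ubuntu:
    `sudo apt install ffmpeg`)."""
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(
            f"{name} was not found; install it and make sure it is on PATH"
        )
    return path
```

Calling `require_binary("ffmpeg")` before the audio step fails fast with an actionable message instead of surfacing the error deep inside the cookbook.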
-
**Is your feature request related to a problem? Please describe.**
I would like to use the Intel oneDNN Execution Provider (EP) in ONNX Runtime built for Triton Inference Server ONNX Backend.
**De…
-
**Is your feature request related to a problem? Please describe.**
Currently, when Triton Inference Server is running in `--model-control-mode=explicit` and a `load_model` request is sent from the cl…
-
**Describe the bug**
I have followed the CUDA installation instructions, and CUDA is also installed in my container.
But when I try to run `ilab generate`, it complains.
```
(venv) …
```
-
**Description**
I use a model ensemble with three models: a pre-processor, an inference model, and a post-processor. I want to send one image to the server and generate **n** patches of the given image in the pr…
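For context, a minimal sketch of what such a three-step pipeline looks like in the ensemble's `config.pbtxt` (all model names, tensor names, and dims here are hypothetical, and the patch-generation logic itself would live inside the pre-processor model):

```
name: "image_ensemble"
platform: "ensemble"
max_batch_size: 0
input [ { name: "RAW_IMAGE", data_type: TYPE_UINT8, dims: [ -1, -1, 3 ] } ]
output [ { name: "RESULT", data_type: TYPE_FP32, dims: [ -1 ] } ]
ensemble_scheduling {
  step [
    {
      model_name: "preprocessor"
      model_version: -1
      input_map { key: "RAW_IMAGE" value: "RAW_IMAGE" }
      output_map { key: "PATCHES" value: "patches" }
    },
    {
      model_name: "inference_model"
      model_version: -1
      input_map { key: "PATCHES" value: "patches" }
      output_map { key: "LOGITS" value: "logits" }
    },
    {
      model_name: "postprocessor"
      model_version: -1
      input_map { key: "LOGITS" value: "logits" }
      output_map { key: "RESULT" value: "RESULT" }
    }
  ]
}
```

In each `input_map`/`output_map`, the key is the step model's own tensor name and the value is the ensemble-level tensor it connects to; the intermediate tensors (`patches`, `logits`) never leave the server.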
-
I installed PyTorch for CPU on a Debian server and got this error:
Pipelines loaded with `torch_dtype=torch.float16` cannot run with `cpu` or `mps` device. It is not recommended to move them to `cpu` or `mps` …
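The message is the library refusing to run a half-precision pipeline on CPU. The usual fix is to choose the dtype from the target device instead of hard-coding `torch.float16`. A minimal sketch of that selection logic (the helper name is hypothetical, and the returned strings stand in for `torch.float16` / `torch.float32` passed to `from_pretrained`):

```python
def pick_dtype(device: str) -> str:
    """Half precision is only used on CUDA here; per the error above,
    pipelines loaded with float16 cannot run on `cpu` or `mps`, so those
    devices fall back to full precision. Strings stand in for the real
    torch dtypes in this sketch."""
    return "float16" if device == "cuda" else "float32"
```

So on a CPU-only Debian server the pipeline would be loaded with `torch_dtype=torch.float32` (or the `torch_dtype` argument left out entirely).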