inference-server Search Results

1000+ results for inference-server

intel-analytics/ipex-llm #8998

Optimize the model used in Ant group for inference

There is a BERT based model used in Ant group for inference on geo similarity compare. https://modelscope.cn/models/damo/mgeo_geographic_entity_alignment_chinese_base/summary https://modelscope.cn/m…

qzheng527 updated 1 year ago, 3 comments

triton-inference-server/server #7007

Triton ensemble pipeline high CPU usage

**Description** I have a 5 steps ensemble pipeline for triton. * 3 steps are torchscript artifacts * 2 steps are tensorrt compiled models in pbtxts files I have ``` instance_group [{ kind: KIN…

sergeevii123 updated 7 months ago, 2 comments

openvinotoolkit/openvino_notebooks #2442

mLLAMA 3.2 Output Issue and Gradio Error on Intel SPR Machin…

I'm running mLLAMA 3.2 on two machines: Intel 4th Gen SPR server NVIDIA GPU machine On the NVIDIA GPU machine, everything works fine. But on the Intel SPR machine, the output is strange (!!!!!!!!…

saranyabalakumar updated 2 weeks ago, 4 comments

triton-inference-server/server #7222

Unable to use pytoch library with libtorch backend when usin…

**Description** A clear and concise description of what the bug is. I am trying to use the newly introduced [triton inference server In-Process python API](https://github.com/triton-inference-server…

sivanantha321 updated 5 months ago, 10 comments

milvus-io/milvus #33124

[Bug]: datacoord list_index error

### Is there an existing issue for this? - [X] I have searched the existing issues ### Environment ```markdown - Milvus version:2.4.1 - Deployment mode(standalone or cluster): cluster - MQ type(r…

JamesBonddu updated 4 days ago, 7 comments

NVIDIA/TensorRT-LLM #2371

How to integrate Multi-LoRA Setup at Inference with NVIDIA T…

I built the engine, and had two separate LoRA layers with the base llama3.1 model. The output from the build is rank0.engine, config.json, and then a lora folder with the following structure: lora | |…

JoJoLev updated 1 week ago, 9 comments

Ironclad/rivet #390

[Feature]: Start debugging server from UI

### Feature Request [As documented](https://rivet.ironcladapp.com/docs/api-reference/remote-debugging), it is possible to start a server to use remote rivet projects, but the current method require…

okdewit updated 7 months ago, 1 comment

djmango/obsidian-transcription #16

Add support for OpenAI API

The OpenAI API now supports inference with Whisper. I think it would be good if you add the option to use that service instead of only the web server. That way you don't have to set up any server what…

johannesCmayer updated 1 year ago, 1 comment

triton-inference-server/server #7236

Cant build python+onnx+ternsorrtllm backends r24.04

Im trying https://github.com/triton-inference-server/server/blob/main/docs/customization_guide/compose.md to build onnx+python+tensorrtllm backends. 1) as mention in doc i do ```bash git clone …

gulldan updated 6 months ago, 3 comments

BerriAI/litellm #5853

[Bug]: Proxy: Constant "Provider NOT provided" errors in log…

### What happened? In the proxy admin UI (v1.44.23 stable), I added an invalid model by mistake*, and now I'm getting constant error messages in the logs with no way I can see to stop them. The er…

liffiton updated 1 month ago, 7 comments
