-
There is a BERT-based model used at Ant Group for inference on geographic similarity comparison.
https://modelscope.cn/models/damo/mgeo_geographic_entity_alignment_chinese_base/summary
https://modelscope.cn/m…
-
**Description**
I have a 5-step ensemble pipeline for Triton.
* 3 steps are TorchScript artifacts
* 2 steps are TensorRT-compiled models
In the pbtxt files I have:
```
instance_group [{ kind: KIN…
-
I'm running mLLAMA 3.2 on two machines:
Intel 4th Gen SPR server
NVIDIA GPU machine
On the NVIDIA GPU machine, everything works fine. But on the Intel SPR machine, the output is strange (!!!!!!!!…
-
**Description**
I am trying to use the newly introduced [triton inference server In-Process python API](https://github.com/triton-inference-server…
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Environment
```markdown
- Milvus version:2.4.1
- Deployment mode(standalone or cluster): cluster
- MQ type(r…
-
I built the engine, and had two separate LoRA layers with the base llama3.1 model. The output from the build is rank0.engine, config.json, and then a lora folder with the following structure:
lora
|
|…
-
### Feature Request
[As documented](https://rivet.ironcladapp.com/docs/api-reference/remote-debugging), it is possible to start a server to use remote rivet projects, but the current method require…
-
The OpenAI API now supports inference with Whisper. I think it would be good if you added an option to use that service instead of only the web server. That way you don't have to set up any server what…
-
I'm following https://github.com/triton-inference-server/server/blob/main/docs/customization_guide/compose.md
to build the onnx+python+tensorrtllm backends.
1)
As mentioned in the doc, I run:
```bash
git clone …
-
### What happened?
In the proxy admin UI (v1.44.23 stable), I added an invalid model by mistake*, and now I'm getting constant error messages in the logs with no way I can see to stop them.
The er…