-
Using this model from Intel:
https://docs.openvino.ai/2024/omz_models_model_age_gender_recognition_retail_0013.html
I can't get good results. (Or does this model offer really good accuracy in the …
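For reference, the OMZ page linked above specifies that `age-gender-recognition-retail-0013` expects a `1x3x62x62` BGR input blob, and a common cause of poor results is feeding the wrong layout or color order. A minimal preprocessing sketch (the function name is mine, and I'm assuming no extra mean/scale normalization; double-check the model docs for your pipeline):

```python
import numpy as np

def preprocess_face(face_bgr: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 BGR uint8 face crop into the NCHW float blob
    the model expects. The crop must already be resized to 62x62."""
    assert face_bgr.shape == (62, 62, 3), "resize the face crop to 62x62 first"
    blob = face_bgr.astype(np.float32)
    blob = np.transpose(blob, (2, 0, 1))   # HWC -> CHW
    return blob[np.newaxis, ...]           # add batch dim -> (1, 3, 62, 62)

dummy = np.zeros((62, 62, 3), dtype=np.uint8)
print(preprocess_face(dummy).shape)  # (1, 3, 62, 62)
```

If results are still poor after fixing layout and color order, the issue is more likely the quality of the face detection/cropping feeding this model than the model itself.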
-
I know that 2.18+ supports PyTorch, and we want to use that.
The Jetson Nano has JetPack 4.6.4, the latest version. Can this version install Triton 2.20?
We need PyTorch and Python backend support.
-
### **Problem:**
When using model-analyzer with `--triton-launch-mode=remote`, I encounter connectivity issues.
### **Context:**
I have successfully started Triton Inference Server on the same ser…
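For what it's worth, in remote mode model-analyzer does not launch Triton itself; it only connects to an already-running server, so the configured endpoints must be reachable from wherever model-analyzer runs. A config sketch under that assumption (the repository path, model name, and addresses are placeholders):

```yaml
# config.yaml for `model-analyzer profile` (hypothetical values)
model_repository: /path/to/model_repository
profile_models:
  - my_model                            # placeholder model name
triton_launch_mode: remote
triton_http_endpoint: localhost:8000    # address of the already-running server
triton_grpc_endpoint: localhost:8001
```

Run with `model-analyzer profile -f config.yaml`; if the server listens on non-default ports, inside a container, or on another host, adjust the endpoints accordingly.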
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) and f…
-
### System Info
- CPU architecture: x86_64
- GPU: A100-80GB
- CUDA version: 11
- TensorRT-LLM version: 0.9.0
- Triton Server version: 2.46.0
- Model: Llama3-7b
### Who can help?
_No respo…
-
Hi guys
In the repo tensorrt_backend's code:
https://github.com/triton-inference-server/tensorrt_backend/blob/main/src/instance_state.h#L377
This seems to mean: if the model uses dynamic shapes, then the CUDA …
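For context, the instance-state code linked above concerns CUDA graph handling in the TensorRT backend, which is controlled from the model's `config.pbtxt`. A sketch with placeholder names and dims (`-1` marks a dynamic dimension); this is illustrative, not the actual model under discussion:

```
# config.pbtxt (hypothetical model)
name: "my_trt_model"
platform: "tensorrt_plan"
max_batch_size: 8
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, -1, -1 ]    # dynamic height/width
  }
]
optimization {
  cuda {
    graphs: true           # enable CUDA graph capture
  }
}
```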
-
I would like to use this as a Python backend within `triton-inference-server`, to bring my production parameters into better alignment with training/validation.
Are there plans…
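If it helps, this is the general shape of a Python-backend `model.py`. The tensor names `INPUT0`/`OUTPUT0` and the identity passthrough are placeholders for illustration, not the actual model discussed here:

```python
import json

try:
    # Available only inside the Triton Python-backend runtime.
    import triton_python_backend_utils as pb_utils
except ImportError:
    pb_utils = None  # lets the file be imported outside the server


class TritonPythonModel:
    def initialize(self, args):
        # args["model_config"] is a JSON string of the config.pbtxt contents.
        self.model_config = json.loads(args["model_config"])

    def execute(self, requests):
        responses = []
        for request in requests:
            in0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            # Identity passthrough; real pre/post-processing goes here.
            out0 = pb_utils.Tensor("OUTPUT0", in0.as_numpy())
            responses.append(pb_utils.InferenceResponse(output_tensors=[out0]))
        return responses

    def finalize(self):
        pass
```

The file lives at `<model_repository>/<model_name>/1/model.py` alongside a `config.pbtxt` declaring the matching inputs and outputs.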
-
### System Info
Docker image: nvcr.io/nvidia/tritonserver:24.07-trtllm-python-py3
Device: 8x H100
trt-llm backend: v0.11.0
### Who can help?
@byshiue @schetlur-nv
### Information
- [ ] The off…
-
## Description
When requesting token metrics from an endpoint running a LMI container using a vLLM engine, **non-zero** values are returned for tokenThroughput, totalTokens, and tokenPerRequest (**as…
-
Seeing failures in the sm-python-sdk tests, related to either missing credentials or no Docker daemon:
- `botocore.exceptions.NoCredentialsError: Unable to locate credentials`
- FileNotFoundError: [Er…