-
### Your current environment
```text
The output of `python collect_env.py`
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch…
```
-
In the demo, the VAE model is accelerated with:
```shell
# Accelerating VAE with TensorRT
trtexec --onnx=vae.onnx --saveEngine=vae.plan --minShapes=latent_sample:1x4x64x64 --optShapes=latent_sample:4x4x64x64 --m…
```
-
### Description
```shell
Following the tutorial, running the command `CUDA_VISIBLE_DEVICES=1,2 mpirun -n 1 --allow-run-as-root /opt/tritonserver/bin/tritonserver --model-repository=${WORKSPACE}/all_models/bert/` produces an error:
E0412 06:53:22.3687…
```
-
**Description**
The model repo is an object detection ensemble, which consists of a preprocessor written with the Python backend, and the main model in TensorRT plan. The Python backend uses CuPy t…
-
I'd like to add the triton machine driver to Rancher (rancher.com).
Docs: http://rancher.com/docs/rancher/v1.3/en/configuration/machine-drivers/
Where can I find the "machine driver binary 64-bit…
-
**Description**
Istio is deprecated on GKE. You can no longer create a cluster with the Istio add-on, so the one-click deployer cannot be used.
**Triton Information**
Not applicable
GKE version 1.…
-
Return the leaf index of each tree, just like the original LightGBM C API parameter `predict_type=C_API_PREDICT_LEAF_INDEX`.
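For readers unfamiliar with leaf-index prediction: instead of returning the summed leaf values, the model returns, for each tree in the ensemble, the index of the leaf the sample falls into. A minimal sketch of that idea using hand-built toy trees (this is an illustration of the semantics, not LightGBM's implementation):

```python
# Toy illustration of per-tree leaf-index prediction (not LightGBM itself):
# each tree is a nested dict; predicting returns, for each tree, the index
# of the leaf the sample lands in rather than the leaf's value.

def leaf_index(tree, x):
    """Walk a tiny tree dict and return the index of the leaf x lands in."""
    node = tree
    while "leaf" not in node:
        node = node["left"] if x[node["feature"]] <= node["threshold"] else node["right"]
    return node["leaf"]

def predict_leaf_indices(ensemble, x):
    """Analogue of predict_type=C_API_PREDICT_LEAF_INDEX: one index per tree."""
    return [leaf_index(tree, x) for tree in ensemble]

# Two hand-built stumps; leaves are numbered 0 and 1 within each tree.
ensemble = [
    {"feature": 0, "threshold": 0.5, "left": {"leaf": 0}, "right": {"leaf": 1}},
    {"feature": 1, "threshold": 2.0, "left": {"leaf": 0}, "right": {"leaf": 1}},
]

print(predict_leaf_indices(ensemble, [0.3, 5.0]))  # → [0, 1]
```

In LightGBM's Python API, the equivalent of this C API option is `booster.predict(X, pred_leaf=True)`, which returns one leaf index per tree per sample.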
-
Hi, I ran the CenterFace ONNX model and found that, although faces are detected, the bounding-box sizes differ considerably between the ONNX model and the TensorRT model.
![out](https://user-images.githubusercontent.com/3515…
-
We are using cloudevents/sdk-go (v1.2.0) to forward payloads to our request logger.
Triton server recently (sometime between 20.08 and 21.08) started to respect the `Accept-Encoding` header and now re…
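When a server honors `Accept-Encoding: gzip`, the response body arrives compressed and must be decoded before it can be forwarded downstream. A minimal Python sketch of that consumer-side handling (the function name is illustrative; the project in question actually uses cloudevents/sdk-go, where the Go HTTP client handles this differently):

```python
import gzip

def decode_body(body, content_encoding):
    """Decompress an HTTP body if the server marked it Content-Encoding: gzip."""
    if content_encoding == "gzip":
        return gzip.decompress(body)
    return body

# Simulate a server that honored `Accept-Encoding: gzip`:
payload = b'{"model_name": "detector", "outputs": []}'
compressed = gzip.compress(payload)

print(decode_body(compressed, "gzip") == payload)  # → True
print(decode_body(payload, None) == payload)       # → True
```

Alternatively, a client that cannot handle compressed payloads can send `Accept-Encoding: identity` to ask the server not to compress at all.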
-
Hello, I have run into an issue where a model quantized with AWQ shows more performance degradation than expected.
I know that ModelOpt provides optimized kernels and quantization algorithms for fast quanti…