-
### System Info
- CPU architecture: x86_64
- GPU: NVIDIA H100
- Docker image: `nvcr.io/nvidia/tritonserver:24.02-trtllm-python-py3`
- TensorRT-LLM tag: v0.9.0
- tensorrtllm_backend tag: v0.9.0
- Ubu…
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report.
### Ultralytics YOLO Component
_No …
-
**Docker:** nvcr.io/nvidia/pytorch:24.06-py3
pip uninstall nvidia-modelopt
pip install nvidia-modelopt==0.13.0
**command:**
python demo_txt2img_xl.py "enchanted winter forest, soft diffuse li…
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report.
### Ultralytics YOLO Component
…
-
## Bug Description
When performing post-training quantization with the INT8 calibration API, the model exports fine when using `ptq.DataLoaderCalibrator`, but there is a runtime error when loa…
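For context, what a max-style INT8 calibrator boils down to is a per-tensor scale derived from the calibration stream, which is then used to quantize and dequantize activations. The sketch below is a minimal pure-Python illustration of that arithmetic; the function names and the toy calibration data are illustrative only, not the actual `ptq.DataLoaderCalibrator` API:

```python
def max_calibrate_scale(batches):
    """Per-tensor INT8 scale via max calibration: amax of all observed values / 127."""
    amax = max(abs(v) for batch in batches for v in batch)
    return amax / 127.0

def quantize_int8(x, scale):
    """Symmetric quantization: round to the nearest step, clamp to int8 range."""
    q = round(x / scale)
    return max(-128, min(127, q))

def dequantize(q, scale):
    """Map the int8 code back to the real-valued grid."""
    return q * scale

# Toy calibration stream standing in for batches from a real DataLoader.
calib = [[-0.8, 0.1, 0.5], [0.3, -1.27, 0.9]]
scale = max_calibrate_scale(calib)   # amax = 1.27, so scale = 0.01
q = quantize_int8(0.5, scale)        # 50
x_hat = dequantize(q, scale)         # ~0.5 after the quantize/dequantize round trip
```

Values outside the calibrated range simply saturate at ±127, which is why the choice of calibration data directly affects accuracy after PTQ.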
-
After successfully quantizing ResNet18 and exporting ONNX models in two different modes, `int8` and `fp8`, I am trying to convert these ONNX models to TensorRT, but no luck so far. It returns Error No sup…
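As background on the `fp8` mode: TensorRT's FP8 support targets the E4M3 format (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, maximum finite value 448). The sketch below is a rough pure-Python model of saturating round-to-nearest E4M3, assuming the E4M3FN variant; it is only meant to show what the quantizer's value grid looks like, not how TensorRT implements it:

```python
import math

def quantize_e4m3(x: float) -> float:
    """Round x to the nearest representable FP8 E4M3FN value (saturating sketch)."""
    MAX = 448.0          # largest finite E4M3FN value: 2**8 * 1.75
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    a = min(abs(x), MAX)          # saturate instead of overflowing to NaN
    if a < 2 ** -6:               # subnormal range: fixed spacing of 2**-9
        q = round(a / 2 ** -9) * 2 ** -9
    else:
        e = math.floor(math.log2(a))
        step = 2.0 ** (e - 3)     # 3 mantissa bits -> 8 steps per binade
        q = min(round(a / step) * step, MAX)
    return sign * q
```

With only eight steps per binade, relative precision is coarse (~6%), which is why FP8 flows typically keep per-tensor scaling factors alongside the raw E4M3 values.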
-
## Description
## Environment
**TensorRT Version**: 8.6
**NVIDIA GPU**: A10
**NVIDIA Driver Version**: 525.147.05
**CUDA Version**: 12.0
**CUDNN Version**: 8.9
Operating Sy…
-
### Search before asking
- [X] I have searched the question and found no related answer.
### Please ask your question
How can SOLOv2 be compressed, or accelerated with TensorRT?
Referring to the … in PaddleSlim
-
Note: the PaddleDetection version used here is release 2.0, with paddlepaddle 2.0.0.
I downloaded the official pretrained model weights yolov3_mobilenet_v3_large_270e_coco.pdparams.
Export the model with export_model.py: python tools/export_model.py -c configs/yolov3/yolo…
-
I have `INT8` quantized a `BERT` model for binary text classification and am only getting a marginal improvement in speed over `FP16`.
I am using the `transformer-deploy` library that utilizes Tens…
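One way to make the FP16-vs-INT8 comparison concrete is to time both engines with the same harness (warm-up runs first, then the median of repeated timed runs). The sketch below is a generic latency harness; `fp16_infer` and `int8_infer` are hypothetical stand-ins for the real engine calls, not anything from `transformer-deploy`:

```python
import time

def median_latency_ms(fn, warmup=5, iters=20):
    """Median wall-clock latency of fn() in milliseconds."""
    for _ in range(warmup):      # warm-up hides one-time costs (allocation, caches)
        fn()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1e3)
    samples.sort()
    return samples[len(samples) // 2]

# Hypothetical stand-ins for the two engines under comparison.
def fp16_infer():
    sum(i * i for i in range(10_000))

def int8_infer():
    sum(i * i for i in range(5_000))

speedup = median_latency_ms(fp16_infer) / median_latency_ms(int8_infer)
print(f"INT8 speedup over FP16: {speedup:.2f}x")
```

Using the median rather than the mean makes the number robust to scheduler hiccups; per-layer profiling would then show whether the INT8 engine actually runs its GEMMs in INT8 or falls back to FP16 kernels.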