This pull request aims to add support for TensorRT integration.

Summary of Changes

1) CLI Support for TensorRT
The TensorRT path is configured through two environment variables:
- TRT_ENGINE_DIR: specifies the directory for storing TensorRT engines.
- ONNX_DIR: specifies the directory for ONNX model exports.

2) Supported Precision
bf16 (fp16 and fp8 coming soon).
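As a reference for item 1, here is a minimal sketch of how the two directory variables might be consumed; the fallback defaults are placeholders, not values used by the actual CLI:

```python
import os

# Illustrative only: read the directories from item 1; the defaults here
# are assumptions for this sketch, not the CLI's real defaults.
trt_engine_dir = os.environ.get("TRT_ENGINE_DIR", "engines")
onnx_dir = os.environ.get("ONNX_DIR", "onnx")

os.makedirs(trt_engine_dir, exist_ok=True)
os.makedirs(onnx_dir, exist_ok=True)
```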
3) TensorRT Exporter
Added the flux/trt/exporter package, containing code to export PyTorch models to ONNX and build TensorRT engines.
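To make the export flow concrete, here is a rough sketch of exporting a module to ONNX and building an engine with the TensorRT Python API; the function name, paths, and opset are illustrative, and this is not the actual flux/trt/exporter code:

```python
import torch
import tensorrt as trt

def export_and_build(model: torch.nn.Module, example_inputs: tuple,
                     onnx_path: str, engine_path: str) -> None:
    # 1) Export the PyTorch model to ONNX.
    torch.onnx.export(model, example_inputs, onnx_path, opset_version=17)

    # 2) Parse the ONNX file and build a serialized TensorRT engine.
    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.BF16)  # bf16 build, per item 2 above
    engine_bytes = builder.build_serialized_network(network, config)
    with open(engine_path, "wb") as f:
        f.write(engine_bytes)
```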
4) TensorRT Engine Execution
Added the flux/trt/engine package, which is responsible for executing inference with the TensorRT engines.
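A minimal sketch of engine execution with the TensorRT Python runtime, assuming hypothetical tensor names ("latent", "output") and a static-shape engine; this is not the actual flux/trt/engine interface:

```python
import tensorrt as trt
import torch

def run(engine_path: str, latent: torch.Tensor) -> torch.Tensor:
    logger = trt.Logger(trt.Logger.WARNING)
    with open(engine_path, "rb") as f:
        engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
    context = engine.create_execution_context()

    output = torch.empty_like(latent)  # assumes output matches the input shape
    # Bind device pointers by tensor name and launch on the current CUDA stream.
    context.set_tensor_address("latent", latent.data_ptr())
    context.set_tensor_address("output", output.data_ptr())
    context.execute_async_v3(torch.cuda.current_stream().cuda_stream)
    torch.cuda.synchronize()
    return output
```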
5) TensorRT Mixin Classes
Added the flux/trt/mixin package with mixin classes to share parameters between the model building and inference phases.
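The idea can be illustrated with a small hypothetical mixin; the class names, fields, and shapes below are assumptions, not the actual flux/trt/mixin classes:

```python
class ClipParamsMixin:
    # Parameters defined once and inherited by both the build-time exporter
    # and the runtime engine wrapper.
    text_maxlen: int = 77
    hidden_size: int = 768

    def get_io_shapes(self, batch_size: int) -> dict:
        # Used to declare shapes at build time and allocate buffers at inference time.
        return {
            "input_ids": (batch_size, self.text_maxlen),
            "pooled_output": (batch_size, self.hidden_size),
        }


class ClipExporter(ClipParamsMixin):
    """Build phase: would hold the ONNX export / engine build logic."""


class ClipEngine(ClipParamsMixin):
    """Inference phase: would hold the engine loading / execution logic."""
```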
6) TensorRT Manager
Introduced flux/trt/trt_manager.py as the main TensorRT management class. It handles the conversion of PyTorch models to TensorRT engines and manages the TensorRT context for inference.
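For illustration, here is a stripped-down manager in the same spirit; the class name, file layout, and ".plan" extension are assumptions rather than the actual trt_manager.py interface:

```python
import tensorrt as trt

class EngineManager:
    """Illustrative only: keeps deserialized engines and their execution
    contexts around so models are converted/loaded once and reused."""

    def __init__(self, engine_dir: str):
        self.engine_dir = engine_dir
        self.logger = trt.Logger(trt.Logger.WARNING)
        self.runtime = trt.Runtime(self.logger)
        self.engines = {}   # model name -> ICudaEngine
        self.contexts = {}  # model name -> IExecutionContext

    def load(self, name: str) -> None:
        # Deserialize a previously built engine and create its execution context.
        with open(f"{self.engine_dir}/{name}.plan", "rb") as f:
            engine = self.runtime.deserialize_cuda_engine(f.read())
        self.engines[name] = engine
        self.contexts[name] = engine.create_execution_context()
```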
Additional changes are required to address numerical stability issues within the Flux-Transformer model.