Description
I am encountering issues converting a Mask R-CNN model, trained with Detectron2, to TensorRT at VGA resolution (640x480). I followed the standard conversion process outlined in the ./samples/python/detectron2 directory, but modified the augmentation settings to suit the VGA resolution.
Steps to Reproduce
1. Trained the Mask R-CNN model with Detectron2 at a resolution of 640x480.
2. Modified the augmentation settings in the conversion script (a shape sanity check for this change is sketched after this list) from:
aug = T.ResizeShortestEdge(
    [1344, 1344], 1344
)
to
aug = T.ResizeShortestEdge(
    [480, 480], 640
)
3. Converted the model to ONNX format (the ONNX graph can be seen here); a quick check of the graph's input shape is included after this list.
4. Converted the ONNX model to a TensorRT engine (build sketch after this list).
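
Here is a minimal sketch of how I understand the preprocessing should line up with the new resolution. The image path, the fixed 480x640 input shape, and the padding step are my assumptions based on how I read the sample's preprocessing, not taken verbatim from my script:

```python
# Sketch: verify the modified augmentation produces an image that fits the
# fixed input shape the graph was exported for (assumed 480x640 here), and
# that both dimensions remain multiples of the FPN stride (32).
import cv2
import numpy as np
import detectron2.data.transforms as T

ENGINE_H, ENGINE_W = 480, 640          # assumed fixed input shape of the exported graph

aug = T.ResizeShortestEdge([480, 480], 640)

img = cv2.imread("sample.jpg")         # placeholder image path, BGR HxWxC
resized = aug.get_transform(img).apply_image(img)
h, w = resized.shape[:2]

assert ENGINE_H % 32 == 0 and ENGINE_W % 32 == 0, "input dims must be multiples of 32"
assert h <= ENGINE_H and w <= ENGINE_W, "resized image does not fit the engine input"

# Pad bottom/right to the fixed shape, as I understand the sample's preprocessing does
padded = np.zeros((ENGINE_H, ENGINE_W, 3), dtype=resized.dtype)
padded[:h, :w, :] = resized
print("resized:", (h, w), "padded:", padded.shape)
```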
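A quick check of the exported graph's input shape (the path is a placeholder), since a leftover 1344x1344 input would explain a mismatch:

```python
# Sketch: confirm the exported ONNX graph carries the 640x480 input shape
# rather than the sample's default 1344x1344.
import onnx

model = onnx.load("converted.onnx")    # placeholder path
for inp in model.graph.input:
    dims = [d.dim_value or d.dim_param for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)
```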
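For reference, the engine build follows this general pattern (a sketch against the TensorRT 8.6 Python API; the paths, workspace size, and precision flag are placeholders rather than my exact build script):

```python
# Sketch: build a TensorRT engine from the converted ONNX graph.
# Plugins are initialized because the converted graph relies on TRT plugin ops (as in the sample).
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
trt.init_libnvinfer_plugins(logger, "")

builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("converted.onnx", "rb") as f:          # placeholder path
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("failed to parse the ONNX model")

config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 4 << 30)  # 4 GB
# config.set_flag(trt.BuilderFlag.FP16)          # the drop happens in FP32 as well

serialized = builder.build_serialized_network(network, config)
if serialized is None:
    raise RuntimeError("engine build failed")
with open("model.engine", "wb") as f:
    f.write(serialized)
```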
Expected Behavior
I expected the converted TensorRT engine to maintain a level of accuracy and detection capability similar to that of the original Detectron2 model.
Observed Behavior
After conversion to TensorRT:
The accuracy of the model is significantly reduced.
Most images have no detections.
Some detections are incorrect or misplaced.
Environment
TensorRT Version: 8.6.1.6
NVIDIA GPU: A5000
NVIDIA Driver Version: 525.85.12
CUDA Version: 12.0
cuDNN Version: 8.9.0
Operating System: Ubuntu 20.04
Python Version (if applicable): 3.8.13
PyTorch Version (if applicable): 2.1
Questions and Requests for Help
Has anyone successfully converted a Mask R-CNN model to TensorRT for VGA resolution (640x480) without significant loss of accuracy?
What could be going wrong in my conversion process?
Any guidance or suggestions to improve the accuracy of the TensorRT model would be greatly appreciated.
Thank you in advance for any help or insights provided.