HorizonRobotics / Sparse4D

Deployment to TensorRT #98

Open CMSC740Student opened 1 month ago

CMSC740Student commented 1 month ago

Hi All,

Thank you for the amazing work! Can this model be exported to TensorRT for inference?

Thanks

shubhendu-ranadive commented 1 month ago

The complete model may be difficult to deploy to TensorRT because of the Deformable Aggregation Function, but I think parts of the model can be deployed on TensorRT, such as the ResNet50 backbone or the FPN neck.

You can try to use torch2trt from NVIDIA for that purpose.
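
For example, a minimal torch2trt sketch for converting just the backbone (the input shape and the use of torchvision's ResNet50 are illustrative, not Sparse4D's actual configuration):

```python
import torch
from torch2trt import torch2trt
from torchvision.models import resnet50

# Convert only the image backbone; the rest of the pipeline stays in PyTorch.
backbone = resnet50(pretrained=True).eval().cuda()

# Illustrative shape: multi-camera crops with the camera dimension
# flattened into the batch before the backbone.
x = torch.randn(6, 3, 256, 704).cuda()

backbone_trt = torch2trt(backbone, [x], fp16_mode=True)

# Sanity-check the converted module against the original.
with torch.no_grad():
    err = (backbone(x) - backbone_trt(x)).abs().max().item()
print(f"max abs error: {err:.6f}")
```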

CMSC740Student commented 1 month ago

Thank you for your response! What about converting it to ONNX? What is the recommended route: torch.onnx.export or MMDeploy?

shubhendu-ranadive commented 1 month ago

I haven't tried to convert the model to ONNX yet. So far I have only succeeded in converting the backbone, neck, and encoders to TensorRT using torch2trt without a significant loss in accuracy, so I'm not certain of the best approach for ONNX. I think MMDeploy would be a good fit, since you can define custom plugins to convert the Deformable Aggregation Function (I haven't tried it, so I'm not sure). Maybe you can try that and let me know how it goes?
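
For what it's worth, the usual pattern for exporting a custom CUDA op like deformable aggregation is to give its autograd.Function a symbolic staticmethod, so that torch.onnx.export emits a node in a custom domain instead of failing; a TensorRT plugin (or an MMDeploy custom op) with the same name then implements that node at runtime. A rough sketch, where the argument names and the extension import are illustrative rather than the repo's actual API:

```python
import torch
from torch.autograd import Function

class DeformableAggregationFunction(Function):
    @staticmethod
    def forward(ctx, feature_maps, spatial_shape, scale_start_index,
                sampling_points, weights):
        # In the real code this would call the compiled CUDA kernel;
        # the import below is a hypothetical placeholder.
        from deformable_aggregation_ext import deformable_aggregation
        return deformable_aggregation(
            feature_maps, spatial_shape, scale_start_index,
            sampling_points, weights)

    @staticmethod
    def symbolic(g, feature_maps, spatial_shape, scale_start_index,
                 sampling_points, weights):
        # torch.onnx.export calls this instead of tracing forward(),
        # emitting a custom-domain node for the plugin to implement.
        return g.op(
            "custom::DeformableAggregation",
            feature_maps, spatial_shape, scale_start_index,
            sampling_points, weights)
```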

CMSC740Student commented 1 month ago

@shubhendu-ranadive Thanks for the suggestions!

I am able to convert the model to ONNX, but it outputs the same results for different inputs, so something is wrong.

Here are the steps I performed:

  1. pip3 install onnx
  2. Update the forward definition in sparse4d.py, sparse4d_head.py, instance_bank.py, blocks.py & detection3d_blocks.py:

```python
# Change this:
def forward(self, img, **data):

# To this:
def forward(self, img, timestamp=None, projection_mat=None, image_wh=None):
```


  3. Use torch.onnx.export:

```python
with torch.no_grad():
    model.eval()
    torch.onnx.export(
        model,
        args,
        output_path,
        export_params=True,
        input_names=input_names,
        output_names=output_names,
        opset_version=opset_version,
        dynamic_axes=dynamic_axes,
        keep_initializers_as_inputs=keep_initializers_as_inputs,
        verbose=verbose,
    )
```
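
For reference, the args tuple has to line up positionally with the rewritten forward signature; something like this (all shapes below are placeholders, not the exact Sparse4D configuration):

```python
import torch

num_cams, height, width = 6, 256, 704  # placeholder values
img = torch.randn(1, num_cams, 3, height, width)
timestamp = torch.zeros(1)
projection_mat = torch.randn(1, num_cams, 4, 4)
image_wh = torch.tensor([[[width, height]] * num_cams], dtype=torch.float32)

# Positional order must match forward(self, img, timestamp, projection_mat, image_wh)
args = (img, timestamp, projection_mat, image_wh)
input_names = ["img", "timestamp", "projection_mat", "image_wh"]
```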



The issue right now is that the model outputs the same results for different inputs. Debugging is in progress...

shubhendu-ranadive commented 1 month ago

@CMSC740Student That's great! 👍 Can I ask what value you set for keep_initializers_as_inputs? If you haven't already tried it, maybe setting it to False would change the results?

CMSC740Student commented 1 month ago

@shubhendu-ranadive I did some debugging... I think the issue may be with InstanceBank.

InstanceBank caches intermediate results from previous inputs and passes them along with the next set of inputs.

The InstanceBank class therefore contains logic that is only triggered when it processes sequential batches of inputs, as opposed to a single batch.

When I try to export my model with dummy inputs for a single batch, the outputs are incorrect (likely because the InstanceBank logic is not traced/exported correctly via torch.onnx.export).

Do you know if it's possible to pass sequential batches of inputs so that the model is traced correctly with all inputs?
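
One workaround I'm considering is lifting the InstanceBank cache into explicit model inputs and outputs, so the exported graph is stateless and the deployment runtime carries the recurrence across frames. A rough sketch of the pattern (the wrapper and forward_with_state are hypothetical, not the repo's actual API):

```python
import torch
import torch.nn as nn

class StatelessSparse4D(nn.Module):
    """Sketch: the temporal cache becomes explicit I/O.

    Instead of InstanceBank reading/writing an internal cache, the
    previous frame's instance features and anchors come in as inputs
    and the updated ones go out as outputs, so torch.onnx.export
    sees a pure function.
    """

    def __init__(self, model):
        super().__init__()
        self.model = model  # the original Sparse4D module

    def forward(self, img, timestamp, projection_mat, image_wh,
                prev_instances, prev_anchors):
        # Hypothetical hook that accepts the cached state explicitly
        # instead of letting InstanceBank fetch it internally.
        outputs, new_instances, new_anchors = self.model.forward_with_state(
            img, timestamp, projection_mat, image_wh,
            prev_instances, prev_anchors)
        return outputs, new_instances, new_anchors

# At inference the runtime owns the loop across frames:
#   state = (initial_instances, initial_anchors)
#   for frame in clip:
#       outputs, *state = session.run(frame_inputs + state)
```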

shubhendu-ranadive commented 1 month ago

@CMSC740Student Thanks for your reply. That does look like a problem with how the ONNX graph is created. I don't think torch.onnx.export allows sequential inputs.

The only thing I found after searching is torch.jit.script. Scripting captures the model's dynamic control flow, and you can then try to export the scripted module to ONNX.

Edit: more on using torch.jit.script here
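
Something like this, assuming the model scripts cleanly (it may need TorchScript type annotations first):

```python
import torch

# Script the model so data-dependent branches (e.g. the InstanceBank
# path taken only on later frames) are captured as control flow,
# then export the scripted module.
scripted = torch.jit.script(model)

torch.onnx.export(
    scripted,
    args,                       # same dummy-input tuple as before
    "sparse4d_scripted.onnx",   # illustrative output path
    opset_version=13,
    input_names=input_names,
    output_names=output_names,
)
```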

shubhendu-ranadive commented 1 month ago

@CMSC740Student Did you get it working?

PonyAIjkz commented 2 weeks ago

@CMSC740Student @shubhendu-ranadive This repository may be useful for your deployment: https://github.com/ThomasVonWu/SparseEnd2End