NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0
10.87k stars 2.14k forks

Can TensorRT calculate the number of Params and FLOPs for the model? #4219

Open demuxin opened 1 month ago

demuxin commented 1 month ago

Description

I want to measure the performance of the model, so I want to know the number of parameters and FLOPs.

Is there any tool that can calculate the flops and params of the TensorRT engine?

lix19937 commented 1 month ago

Ref https://github.com/NVIDIA/TensorRT/issues/517#issuecomment-2433999764

BTW, the FLOPs should not change going from TF/Torch to TRT (assuming the network does not have redundant branches that don't contribute to the network outputs). Note that TRT actively uses horizontal and vertical fusion of different layers, so the final model can be computationally cheaper than the model you started with.

demuxin commented 1 month ago

> TRT actively uses horizontal and vertical fusion of different layers, so the final model can be computationally cheaper than the model you started with.

Doesn't that statement imply that the FLOPs of the model have changed? I just want to know the final FLOPs.

lix19937 commented 1 month ago

You can dump the EngineInspector output with trtexec --profilingVerbosity=detailed. It shows the weight/bias size for each layer, so you can sum them up with a script.

demuxin commented 1 month ago

Hi @lix19937, can you provide the specific commands? I'm not very familiar with trtexec. Thanks a lot.

lix19937 commented 1 month ago

Use trtexec --onnx=spec --dumpLayerInfo --profilingVerbosity=detailed --exportLayerInfo=layerinfo.json

demuxin commented 3 weeks ago

This is the output file. There is no weight/bias size in the file.

layerinfo.json

lix19937 commented 2 weeks ago

The entries look like the following:

  "Weights": {"Type": "Float", "Count": 18432},
  "Bias": {"Type": "Float", "Count": 64},
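
Summing those counts can be sketched roughly as below. This is a minimal sketch, not an official tool: it assumes the exported layerinfo.json contains per-layer objects with "Weights" and "Bias" entries shaped like the fragment above, and it walks the whole JSON tree rather than relying on a particular top-level layout. The function names, the "Layers" key, and the layer name "conv1" in the example are hypothetical. Other parameter-like fields, if your file has them, would need to be added to `keys`.

```python
import json


def count_params(node, keys=("Weights", "Bias")):
    """Recursively sum the "Count" fields found under the given keys."""
    total = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key in keys and isinstance(value, dict):
                # e.g. "Weights": {"Type": "Float", "Count": 18432}
                total += int(value.get("Count", 0))
            else:
                total += count_params(value, keys)
    elif isinstance(node, list):
        for item in node:
            total += count_params(item, keys)
    return total


def count_params_in_file(path):
    """Load an exported layer-info JSON file and return the summed counts."""
    with open(path) as f:
        return count_params(json.load(f))


# In-memory example built from the fragment above (layout is an assumption).
example = {
    "Layers": [
        {
            "Name": "conv1",
            "Weights": {"Type": "Float", "Count": 18432},
            "Bias": {"Type": "Float", "Count": 64},
        }
    ]
}
print(count_params(example))  # 18496
```

For a real file, call `count_params_in_file("layerinfo.json")`. The result reflects the built engine, so it may differ from the parameter count reported by the original framework if layers were fused or folded.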

demuxin commented 2 weeks ago

Thank you for your help! This method can calculate the number of params.

But calculating FLOPs feels too complicated, since different layers are calculated differently. Is there an easier way?
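
For reference, the per-layer arithmetic for the layer types that usually dominate the total is short. The following is a rough sketch under stated assumptions, not something TensorRT provides: it uses the common convention of 2 FLOPs per multiply-accumulate, batch size 1, and ignores activations, normalization, and other cheap element-wise layers. The function names are hypothetical, and the 3x3 convolution with 32 input and 64 output channels is only one shape consistent with the counts 18432 and 64 shown earlier. The 56x56 output size is made up for illustration.

```python
def conv2d_flops(c_in, c_out, k_h, k_w, h_out, w_out, groups=1, bias=False):
    """Approximate FLOPs of one 2D convolution at batch size 1."""
    # Each output element needs (c_in / groups) * k_h * k_w multiply-accumulates.
    macs = (c_in // groups) * k_h * k_w * c_out * h_out * w_out
    flops = 2 * macs  # 1 multiply + 1 add per MAC
    if bias:
        flops += c_out * h_out * w_out  # one extra add per output element
    return flops


def linear_flops(in_features, out_features):
    """Approximate FLOPs of one fully connected layer at batch size 1."""
    return 2 * in_features * out_features


# Hypothetical layer: 32 -> 64 channels, 3x3 kernel, 56x56 output.
weight_count = 64 * 32 * 3 * 3           # 18432, matches the "Weights" count
flops = conv2d_flops(32, 64, 3, 3, 56, 56)
# For a convolution this equals 2 * weight_count * h_out * w_out.
assert flops == 2 * weight_count * 56 * 56
print(flops)
```

Because a convolution's weight count is c_out * (c_in / groups) * k_h * k_w, its FLOPs can be estimated as 2 * Count * H_out * W_out when the layer info also gives the output dimensions.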