Can TensorRT calculate the number of Params and FLOPs for the model?

Can TensorRT calculate the number of Params and FLOPs for the model? #4219

Status: Open (opened by demuxin 6 days ago)

demuxin commented 6 days ago

Description

I want to measure the performance of my model, so I need its parameter count and FLOPs.

Is there a tool that can calculate the FLOPs and parameter count of a TensorRT engine?

lix19937 commented 5 days ago

Ref https://github.com/NVIDIA/TensorRT/issues/517#issuecomment-2433999764

BTW, FLOPs should not change going from TF/Torch to TRT (assuming the network has no redundant branches that don't contribute to the network outputs). Note that TRT actively applies horizontal and vertical fusion of layers, so the final engine can be computationally cheaper than the model you started with.

demuxin commented 5 days ago

"TRT actively applies horizontal and vertical fusion of layers, so the final engine can be computationally cheaper than the model you started with."

Doesn't that statement imply that the FLOPs of the model have changed? I just want to know the final FLOPs of the engine.

lix19937 commented 5 days ago

You can dump the EngineInspector output with trtexec --profilingVerbosity=detailed. It shows the weight/bias size for each layer, so you can sum them up with a script.

demuxin commented 4 days ago

Hi @lix19937, can you provide the specific commands? I'm not very familiar with trtexec. Thanks a lot.

lix19937 commented 3 days ago

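Use trtexec --onnx=spec --dumpLayerInfo --profilingVerbosity=detailed --exportLayerInfo=layerinfo.json (replace spec with the path to your ONNX model). The per-layer information is written to layerinfo.json.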
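
To sum the sizes from the exported file, something like the following may work. It is only a rough sketch: it assumes that each detailed layer entry in layerinfo.json stores its parameters under keys named "Weights" and "Bias", each holding an integer "Count" field. Those key names are an assumption here and can differ between TensorRT versions, so check your own layerinfo.json and adjust WEIGHT_KEYS if needed. The function name count_params is just illustrative.

```python
import json
import sys
from collections import Counter

# Assumed key names for per-layer parameter entries; verify against your
# own layerinfo.json and adjust if your TensorRT version uses other names.
WEIGHT_KEYS = ("Weights", "Bias")


def count_params(node, totals=None):
    """Recursively sum the "Count" field of every Weights/Bias entry in node."""
    if totals is None:
        totals = Counter()
    if isinstance(node, dict):
        for key, value in node.items():
            if (
                key in WEIGHT_KEYS
                and isinstance(value, dict)
                and isinstance(value.get("Count"), int)
            ):
                totals[key] += value["Count"]
            else:
                # Keep walking: parameter entries may be nested deeper.
                count_params(value, totals)
    elif isinstance(node, list):
        for item in node:
            count_params(item, totals)
    # Strings and numbers (e.g. layers dumped without detailed verbosity)
    # carry no parameter counts and are ignored.
    return totals


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_params.py layerinfo.json
    with open(sys.argv[1]) as f:
        totals = count_params(json.load(f))
    for key, count in totals.items():
        print(f"{key}: {count}")
    print(f"Total parameters: {sum(totals.values())}")
```

Note that this counts the parameters the built engine reports, which may not match the original ONNX model exactly: fused layers may not expose their weights individually. It gives parameter counts only, not FLOPs.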