dingfengshi / TriDet

[CVPR2023] Code for the paper, TriDet: Temporal Action Detection with Relative Boundary Modeling
MIT License

How to measure GMACs #15

Closed (shiyi-z closed this issue 1 year ago)

shiyi-z commented 1 year ago

Hello, sorry to bother you! Could you share the method used to measure the GMACs and latency reported in Table 6? I can't find it in the code. Thanks very much!

dingfengshi commented 1 year ago

Hi! We use fvcore to count the GMACs. Note that fvcore counts one fused multiply-add as one flop, so the "flops" it reports are actually MACs.

shiyi-z commented 1 year ago

Thanks for your reply! I know how to use fvcore: flops = FlopCountAnalysis(model, input). What should the tensor dimensions of the input to that function be for this model?

dingfengshi commented 1 year ago

Hi, on THUMOS14 we use the same input as ActionFormer. Specifically, each video feature sequence is padded to a length of 2304 in the temporal dimension, and the I3D features have 2048 dimensions, so the shape of a video input should be (1, 2048, 2304).
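To make the MAC convention and the input shape concrete, here is a minimal back-of-the-envelope sketch in plain Python. It counts the multiply-accumulates of a single 1D convolution applied to a (1, 2048, 2304) input. The layer itself (512 output channels, kernel size 3, stride 1 with "same" padding) is a hypothetical example chosen for illustration, not a layer taken from the TriDet code, and `conv1d_macs` is an illustrative helper, not part of fvcore.

```python
def conv1d_macs(c_in, c_out, kernel_size, t_out, groups=1):
    """Multiply-accumulate count of one Conv1d layer for batch size 1.

    Each output element needs (c_in / groups) * kernel_size multiply-adds,
    and there are c_out * t_out output elements. Bias additions are ignored.
    """
    return (c_in // groups) * kernel_size * c_out * t_out


# Values taken from the thread: I3D feature dimension and padded length.
C_IN, T = 2048, 2304

# Hypothetical layer settings (assumption, for illustration only).
C_OUT, KERNEL = 512, 3

# Stride 1 with "same" padding keeps the temporal length at T.
macs = conv1d_macs(C_IN, C_OUT, KERNEL, T)
gmacs = macs / 1e9

# Under the other common convention, one multiply-add counts as two FLOPs.
flops_two_per_mac = 2 * macs

print(f"MACs:  {macs:,} ({gmacs:.2f} GMACs)")
print(f"FLOPs if one MAC = 2 FLOPs: {flops_two_per_mac:,}")
```

Because fvcore reports one fused multiply-add as one flop, the number it prints for such a layer would correspond to `macs` here, not to `flops_two_per_mac`.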

shiyi-z commented 1 year ago

Thanks very much for your answer. But why do I get the following error after using the dimensions you described? [two screenshots of the error] Do you have any suggestions for fixing this?

dingfengshi commented 1 year ago

The input to the model is a list of dicts containing the video information. You can try measuring each component in the forward function in meta_archs.py.
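The thread does not say how the latency numbers were obtained, so the following is only a generic sketch of per-component timing using the Python standard library. The three stage functions are hypothetical stand-ins for the components called inside the forward function; with a real PyTorch model on a GPU you would also need to synchronize the device before reading the clock, which this sketch does not do.

```python
import time
from statistics import mean


def time_component(fn, *args, warmup=3, runs=10):
    """Return the mean wall-clock time in seconds of calling fn(*args)."""
    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        fn(*args)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - start)
    return mean(samples)


# Hypothetical stand-ins for model components (assumption, not TriDet code).
def backbone(x):
    return [v * 2 for v in x]


def neck(x):
    return [v + 1 for v in x]


def heads(x):
    return sum(x)


x = list(range(1000))
feats = backbone(x)      # output of one stage is the input of the next
pyramid = neck(feats)

timings = {
    "backbone": time_component(backbone, x),
    "neck": time_component(neck, feats),
    "heads": time_component(heads, pyramid),
}

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.3f} ms")
```

The same pattern of feeding each component the output of the previous one applies when counting MACs per component: each piece receives plain tensors rather than the list of dicts that the full model expects.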