calflops is designed to calculate FLOPs, MACs and parameters for all kinds of neural networks, such as Linear, CNN, RNN, GCN, and Transformer architectures (BERT, LLaMA and other large language models).
When I use `calculate_flops` to calculate the FLOPs of a local model (e.g. openai/clip-vit-large-patch14-336 downloaded locally), the result is smaller than the FLOPs I calculate manually (using the FLOPs calculation from mmcv plus the self-attention FLOPs).
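For reference, the call looks roughly like this (a minimal sketch, not my exact script; it assumes `calculate_flops` accepts the model's forward kwargs via its `kwargs` argument, and the local path and dummy input shapes are placeholders):

```python
import torch
from transformers import CLIPModel
from calflops import calculate_flops

# Local copy of openai/clip-vit-large-patch14-336; the path is a placeholder.
model = CLIPModel.from_pretrained("./clip-vit-large-patch14-336")
model.eval()

# Dummy inputs matching CLIP's forward signature: one 336x336 image and one
# max-length (77-token) caption.
kwargs = {
    "pixel_values": torch.randn(1, 3, 336, 336),
    "input_ids": torch.ones(1, 77, dtype=torch.long),
    "attention_mask": torch.ones(1, 77, dtype=torch.long),
}

flops, macs, params = calculate_flops(model=model, kwargs=kwargs,
                                      output_as_string=True)
print(flops, macs, params)
```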
I found that `modeling_clip.py` in the Hugging Face transformers package uses `torch.bmm` to compute the attention weights, but the `torch.mm` and `torch.bmm` patches are commented out in `_patch_tensor_methods` in `pytorch_ops.py`. When I uncomment those lines, the result matches my manual calculation.
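To quantify the gap, here is a back-of-the-envelope estimate (my own arithmetic, assuming the standard ViT-L/14-336 configuration: 24 vision layers with hidden size 1024 and 577 tokens, 12 text layers with hidden size 768 and 77 tokens) of the FLOPs contributed by the two `torch.bmm` matmuls per attention block, i.e. what is dropped while those patches stay commented out:

```python
def attention_bmm_flops(seq_len, hidden, layers):
    # Per layer, QK^T and attn@V each cost seq_len^2 * hidden MACs summed
    # over heads, so 2 * seq_len^2 * hidden MACs per layer; 1 MAC = 2 FLOPs.
    macs_per_layer = 2 * seq_len ** 2 * hidden
    return 2 * macs_per_layer * layers

vision = attention_bmm_flops(seq_len=577, hidden=1024, layers=24)  # 24*24 patches + CLS
text = attention_bmm_flops(seq_len=77, hidden=768, layers=12)
print(f"uncounted attention FLOPs ≈ {(vision + text) / 1e9:.2f} GFLOPs")  # ≈ 33 GFLOPs
```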
Is there a reason why they were removed? Commit 4bcf54c3f846d30fd483f40e3b39219f2454e801 appears to fix a bug in `calculate_flops_hf`, but I suspect it introduces a bug into `calculate_flops`.
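For context on why the patching matters: `modeling_clip.py` calls `torch.bmm` functionally rather than through an `nn.Module`, so module-level hooks never see those matmuls and only the patched function can count them. A rough illustration of the idea (a hypothetical wrapper, not calflops' actual code):

```python
import torch

_orig_bmm = torch.bmm
_bmm_macs = 0

def _counting_bmm(a, b):
    global _bmm_macs
    # a: (B, n, m), b: (B, m, p) -> B * n * m * p multiply-accumulates
    _bmm_macs += a.shape[0] * a.shape[1] * a.shape[2] * b.shape[-1]
    return _orig_bmm(a, b)

torch.bmm = _counting_bmm  # patch; restore with torch.bmm = _orig_bmm afterwards
```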
Maybe related to #26