MrYxJ / calculate-flops.pytorch

calflops is designed to calculate FLOPs, MACs, and parameters of various neural networks, such as Linear, CNN, RNN, GCN, and Transformer models (BERT, LLaMA, and other large language models).
https://pypi.org/project/calflops/
MIT License

Support sparse model calculation #4

Closed pingzhili closed 9 months ago

pingzhili commented 9 months ago
  1. Add an is_sparse argument (default False) to the calculate_flops and get_module_flops functions.
  2. If is_sparse is set to True, num_parameters, FLOPs, and MACs will count only the nonzero weights.
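The idea behind the proposed is_sparse flag can be sketched with a tiny stand-in counter (count_parameters is a hypothetical helper, not part of calflops): a dense count includes every weight entry, while a sparse count skips the zeros left by pruning.

```python
def count_parameters(weights, is_sparse=False):
    """Count parameters in a list of 2-D weight matrices (nested lists).

    With is_sparse=True, only nonzero entries are counted, mirroring the
    proposed is_sparse behavior for num_parameters.
    """
    total = 0
    for w in weights:
        for row in w:
            for x in row:
                if not is_sparse or x != 0:
                    total += 1
    return total

# A 2x3 weight matrix with four entries pruned to zero
w = [[[0.5, 0.0, 0.0], [0.0, 0.0, -1.2]]]
print(count_parameters(w))                   # 6 (dense count)
print(count_parameters(w, is_sparse=True))   # 2 (only nonzero weights)
```

FLOPs and MACs follow the same pattern: multiply-accumulates with a zero weight can be skipped, so the sparse counts scale with the number of nonzero entries rather than the tensor shape.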
MrYxJ commented 9 months ago

Pingzhi, thank you for your advice. I have been a little busy recently, so I didn't reply to your message in time. Since I had not considered the is_sparse scenario, this may be a useful feature for some users. Thank you again for helping me standardize the code.

By the way, there are honestly a few little bugs, and I also have some new ideas for improving this tool to help more people calculate model FLOPs more easily. If you're interested in improving it, you're welcome to contact me by email at yxj2017@gmail.com, and we can work on it together.

pingzhili commented 9 months ago

Many thanks for your kind offer! Could you open some issues about these little bugs? We can work on them together in the future :)

MrYxJ commented 9 months ago

At present, the bugs mainly show up in meta-model calculation: only a subset of Hugging Face models can have their FLOPs calculated from a meta model, i.e. without downloading the weights. Some LLMs do not support meta-model FLOPs calculation because they use the latest techniques. Some of these cases can be fixed by installing the required packages (such as upgrading transformers), while others require more complex workarounds.
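For context, a minimal sketch of the mechanism behind "calculating without downloading the model": PyTorch's meta device allocates tensors with shapes but no storage, so layer shapes (and hence FLOPs) can be derived without materializing any weights. This is a generic PyTorch illustration, not the exact code path calflops uses.

```python
import torch
from torch import nn

# Modules created under the "meta" device get shape/dtype metadata only;
# no real memory is allocated and no checkpoint needs to be downloaded.
with torch.device("meta"):
    layer = nn.Linear(4096, 4096)

print(layer.weight.shape)    # torch.Size([4096, 4096])
print(layer.weight.is_meta)  # True
```

The catch is that a forward pass on meta tensors only works if every operation the model uses supports meta dispatch, which is why models relying on very new techniques can fail until their kernels (or an upgraded transformers release) support it.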

In addition, we can continue to iterate on new features. Since we can already count the number of parameters and FLOPs, we could go on to estimate the memory usage of an LLM and give more realistic estimates of the GPUs and training time required.
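As a rough illustration of the memory-estimation idea (estimate_weight_memory_gb is a hypothetical helper, and the 2-bytes-per-parameter figure assumes fp16/bf16 weights): multiplying the parameter count by the bytes per parameter gives a lower bound on the memory needed just to hold the weights.

```python
def estimate_weight_memory_gb(num_params, bytes_per_param=2):
    """Lower-bound weight memory in GiB.

    Assumes fp16/bf16 storage (2 bytes per parameter) by default and
    ignores activations, optimizer state, and KV cache.
    """
    return num_params * bytes_per_param / 1024**3

# A 7B-parameter model in fp16 needs roughly 13 GiB for weights alone
print(round(estimate_weight_memory_gb(7_000_000_000), 2))  # 13.04
```

A fuller estimate for training would also account for gradients and optimizer state (e.g. Adam roughly triples the per-parameter footprint), which is exactly where having exact parameter and FLOP counts helps.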