About benchmark.py - GitHub Issues

kikyou2018 commented 6 months ago

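Does benchmark.py leave out matrix operations when it computes the computational cost? Why does the cost reported by benchmark.py stay the same after I change the window size of the average pooling and the bilinear interpolation?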

caojiaolong commented 6 months ago

Hello, and thank you for your interest in our project!

Sorry for the late reply; I was on a business trip over the past few days. According to the discussion in this Zhihu column:

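Tools currently available on GitHub for computing a model's FLOPs and parameter count include calflops, ptflops, thop, torchstat, fvcore, and others. This article tries and compares all of them. Looking first at the accuracy of the computed FLOPs, it finds that some tools mix up FLOPs and MACs or even swap them (torchstat), while others do not distinguish FLOPs from MACs and actually output MACs (fvcore, thop, ptflops). Only calflops strictly distinguishes FLOPs from MACs in its results.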
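To make the FLOPs/MACs distinction in the quote concrete, here is a minimal sketch in plain Python. It assumes the common convention that one multiply-accumulate (MAC) counts as two floating-point operations and that bias terms are ignored. The helper names `conv2d_macs`, `linear_macs`, and `macs_to_flops` are hypothetical and do not come from any of the libraries mentioned.

```python
def conv2d_macs(c_in, c_out, kernel_h, kernel_w, out_h, out_w, groups=1):
    """Multiply-accumulates of one 2D convolution, bias ignored."""
    # Each output element needs (c_in / groups) * kernel_h * kernel_w MACs.
    macs_per_output = (c_in // groups) * kernel_h * kernel_w
    return c_out * out_h * out_w * macs_per_output


def linear_macs(in_features, out_features, batch=1):
    """Multiply-accumulates of one fully connected layer, bias ignored."""
    return batch * in_features * out_features


def macs_to_flops(macs):
    """Assumed convention: 1 MAC = 1 multiply + 1 add = 2 FLOPs."""
    return 2 * macs


if __name__ == "__main__":
    # Illustrative layer: 3x3 convolution, 3 -> 64 channels, 480x640 output.
    macs = conv2d_macs(3, 64, 3, 3, 480, 640)
    print(f"MACs: {macs:,}  FLOPs: {macs_to_flops(macs):,}")
```

Under this convention, a tool that reports MACs under the name "FLOPs" is off by roughly a factor of two, which is the kind of confusion the quote describes.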

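From this we learned that thop, the library we previously used to measure FLOPs, mixes up MACs and FLOPs and also leaves some matrix operations uncounted. We have therefore updated our FLOPs script to use the calflops library instead (install it with pip install calflops; if it reports that the transformers library is missing, install that with pip install transformers). This gives more accurate FLOPs values, which are updated below: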

Model / FLOPs	CMX(MiT-B2)	CMX(MiT-B4)	CMX(MiT-B5)	DFormer-T	DFormer-S	DFormer-B	DFormer-L
NYU Depth v2 (480*640) GFLOPs	268.5870	537.8280	672.1770	23.5853	51.4670	84.1381	131.7630
SUN RGBD (530*730) GFLOPs	344.8680	695.0630	870.5220	30.3197	66.2621	108.5770	169.6570
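As a rough consistency check on the table, the sketch below compares each model's SUN RGBD / NYU Depth v2 FLOPs ratio with the ratio of input pixel counts. The idea that FLOPs should grow approximately with the number of pixels is an assumption made here for illustration, not something stated in this thread. The GFLOPs values are copied from the table; the function names are hypothetical.

```python
# Input resolutions (height, width) taken from the table rows.
NYU = (480, 640)
SUN = (530, 730)

# GFLOPs copied from the table: (NYU Depth v2, SUN RGBD).
GFLOPS = {
    "CMX(MiT-B2)": (268.5870, 344.8680),
    "CMX(MiT-B4)": (537.8280, 695.0630),
    "CMX(MiT-B5)": (672.1770, 870.5220),
    "DFormer-T": (23.5853, 30.3197),
    "DFormer-S": (51.4670, 66.2621),
    "DFormer-B": (84.1381, 108.5770),
    "DFormer-L": (131.7630, 169.6570),
}


def pixel_ratio(small, large):
    """Ratio of pixel counts between two (height, width) resolutions."""
    return (large[0] * large[1]) / (small[0] * small[1])


def flops_ratios(table):
    """SUN RGBD GFLOPs divided by NYU Depth v2 GFLOPs, per model."""
    return {name: sun / nyu for name, (nyu, sun) in table.items()}


if __name__ == "__main__":
    print(f"pixel ratio: {pixel_ratio(NYU, SUN):.4f}")
    for name, ratio in flops_ratios(GFLOPS).items():
        print(f"{name:12s} FLOPs ratio: {ratio:.4f}")
```

With the numbers above, every model's FLOPs ratio comes out slightly higher than the pixel ratio, so the two rows are at least mutually consistent in how they scale.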

Our script for computing FLOPs is below. Save it as benchmark.py in the project directory and run it directly with python:

from importlib import import_module
from models.builder import EncoderDecoder as segmodel
import torch.nn as nn
from calflops import calculate_flops
import torch

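# Name of the config module under local_configs/NYUDepthv2 to benchmark
test_model = "DFormer_Large"
config = getattr(import_module(f"local_configs.NYUDepthv2.{test_model}"), "C")
# Skip loading pretrained weights; they do not affect the FLOPs count
config.pretrained_model = None
criterion = nn.CrossEntropyLoss(reduction="none", ignore_index=config.background)
BatchNorm2d = nn.SyncBatchNorm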
model = segmodel(
    cfg=config,
    criterion=criterion,
    norm_layer=BatchNorm2d,
    syncbn=True,
).cuda()
batch_size = 1
# NYU Depth v2 resolution (480*640)
input_shape = (batch_size, 3, 480, 640)
# Two random tensors of the same shape, one per model input
input = torch.randn(input_shape).cuda()
input_x = torch.randn(input_shape).cuda()
flops, macs, params = calculate_flops(
    model=model,
    args=[input, input_x],
    print_results=False,
    output_precision=4,
)
print("FLOPs:%s   MACs:%s   Params:%s \n" % (flops, macs, params))
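The script prints the values with `%s`, which suggests that `calculate_flops` returns formatted strings here. For comparing models numerically, a small helper can convert such strings back to plain numbers. This is a sketch under the assumption that the strings look like `"4.2892 GFLOPS"`, `"2.1426 GMACs"`, or `"61.1008 M"`; the exact format may differ between calflops versions, and the name `parse_count` is hypothetical. The sample strings are illustrative only and are not DFormer results.

```python
import math
import re

# Assumed SI prefixes used in the formatted strings.
_PREFIX = {"": 1.0, "K": 1e3, "M": 1e6, "G": 1e9, "T": 1e12}

# number, optional prefix, optional unit name (assumed format, see lead-in)
_PATTERN = re.compile(r"\s*([0-9]+(?:\.[0-9]+)?)\s*([KMGT]?)(FLOPS|MACs)?\s*")


def parse_count(text):
    """Convert a string such as '4.2892 GFLOPS' into a float (4.2892e9)."""
    match = _PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized count string: {text!r}")
    value, prefix, _unit = match.groups()
    return float(value) * _PREFIX[prefix]


if __name__ == "__main__":
    # Illustrative strings, not measured values.
    flops = parse_count("4.2892 GFLOPS")
    macs = parse_count("2.1426 GMACs")
    print(flops, macs, math.isclose(flops, 2 * macs, rel_tol=0.01))
```

The calflops documentation also describes an `output_as_string` option for getting numeric results directly; it is worth checking whether the installed version supports it before relying on string parsing.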

kikyou2018 commented 6 months ago

Thanks for the reply.

VCIP-RGBD / DFormer

About benchmark.py #20