DREAMXFAR / FCL-Net

This is the PyTorch implementation of FCL-Net, accepted by NN'2022.

mistakes #1

Closed ZhouCX117 closed 2 years ago

ZhouCX117 commented 2 years ago

Hi, thanks for your excellent work! There are several mistakes in the code, and I can't run it. Could you please check your code?

DREAMXFAR commented 2 years ago

Thank you for your advice. We verified that the default settings worked normally when we released the code. You can follow the instructions to download the datasets and train the network; be careful to check the path settings. We suggest you try again with the original settings first, and we will also recheck the code very soon.

ZhouCX117 commented 2 years ago

I used the RCF settings, and there were some small mistakes, but I can't remember where they are. Thanks a lot!

DREAMXFAR commented 2 years ago

We have checked the code and fixed the mistakes; you can download the new version. We recommend installing the same package versions (e.g. PyTorch) as our implementation, because API changes between versions may cause errors. And remember to check the path settings.
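For reference, you can quickly print the installed versions to compare against ours (standard attributes only, nothing project-specific):

```python
import torch
import torchvision

print(torch.__version__, torchvision.__version__)
```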

ZhouCX117 commented 2 years ago

@DREAMXFAR Thanks a lot! By the way, could you release the full evaluation code? It seems to differ slightly from the original evaluation code. Thanks again!

ZhouCX117 commented 2 years ago

@DREAMXFAR Hi, I reproduced your code with the BDCN setting and obtained 0.797 ODS, which is lower than the 0.807 you provided. With the FCL setting, I obtained 0.803 ODS, which is also slightly lower. Is this result stable? Looking forward to your answer!

DREAMXFAR commented 2 years ago

Did you use the PASCAL Context dataset for augmentation?

ZhouCX117 commented 2 years ago

No, this is the result without the PASCAL Context dataset, which corresponds to the 0.807 ODS for the FCL and BDCN settings in the table you listed. The results I obtained are 0.797 for BDCN and 0.803 for FCL.

DREAMXFAR commented 2 years ago

Can you provide more details about your reproduced results? I wonder how many epochs you trained, and whether you have tested the models from other epochs. In our experiments, the best model often appears around epoch 12, so even if you completely followed our settings, this could be a possible reason.

ZhouCX117 commented 2 years ago

@DREAMXFAR Thanks for your reply. I fully followed your settings and ran 20 epochs, then tested the dsn6 output of the last epoch. I will test dsn6 from the other epochs as well, thanks a lot!
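A sketch of the loop I may use for this (my own script; the checkpoint paths and test-script flags are hypothetical, not the repository's):

```python
import glob
import os
import subprocess

# evaluate every saved epoch instead of only the last one
for ckpt in sorted(glob.glob("./ckpt/epoch_*.pth")):
    tag = os.path.splitext(os.path.basename(ckpt))[0]
    subprocess.run(["python", "test.py", "--checkpoint", ckpt,
                    "--save-dir", f"./results/{tag}"], check=True)
```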

DREAMXFAR commented 2 years ago

Please note that our final output is dsn7 for FCL, and you can also evaluate the other outputs of FCL as a check. Moreover, our reported BDCN results use the original code released by its author (He et al.), not the re-implementation in this project, which performs a little lower than He's. So we recommend using the original implementation to reproduce BDCN.

ZhouCX117 commented 2 years ago

Thanks a lot for providing useful information! Have a good day!

ZhouCX117 commented 2 years ago

@DREAMXFAR Hi, I have two more questions. 1. In your data loader code, you resize the images by {0.5, 1, 1.5}. Then which images do you use in training: the "aug_data" in HED-BSDS, or "{aug_data, aug_data_scale0.5, aug_data_scale_1.5}"? The data list you showed suggests the latter, but I still want to confirm. 2. Why do you ReflectionPad the image with 21 in the first step? The data augmentation also involves reflection; is there a relationship between them?

DREAMXFAR commented 2 years ago

Firstly, we use {aug_data, aug_data_scale0.5, aug_data_scale_1.5}, and we further resize the images by {0.5, 1, 1.5} for multi-scale training. Secondly, since our project is based on the HED implementation by chongruo, we simply use the same ReflectionPad operation as chongruo's code. The flip augmentation has no relationship with ReflectionPad here; the flip is applied randomly to the input.
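To make the two operations concrete, here is a minimal sketch with standard PyTorch ops (an illustration only, not the exact loader code):

```python
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

def multiscale(img):                         # img: (N, 3, H, W)
    scale = random.choice([0.5, 1.0, 1.5])   # one training scale per sample
    return F.interpolate(img, scale_factor=scale,
                         mode="bilinear", align_corners=False)

pad = nn.ReflectionPad2d(21)                 # mirror 21 pixels on every border
x = pad(multiscale(torch.randn(1, 3, 321, 481)))
```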

ZhouCX117 commented 2 years ago

Thanks for your reply!

ZhouCX117 commented 2 years ago

@DREAMXFAR Hi, it is me again. I have a problem with CUDA memory. When I run the RCF code at https://github.com/balajiselvaraj1601/RCF_Pytorch_Updated, the GPU memory usage is about 4217M. However, the GPU memory usage of the RCF method in FCL-Net is about 8609M, roughly twice as much. Could you please tell me why this happens? Thanks a lot!

DREAMXFAR commented 2 years ago

We haven't run the project you mentioned. Since the model size we reported in our paper conforms with the BDCN paper, we think the cause is probably not the network structure but the input data. A possible reason, we suppose, is that our multi-scale training strategy increases CUDA memory usage. If that is not the reason, we advise you to check the CUDA memory line by line.
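A simple way to check it line by line (a generic workflow around `torch.cuda.memory_allocated`, not project code; it needs a CUDA device):

```python
import torch

def report(tag):
    print(f"{tag}: {torch.cuda.memory_allocated() / 1024**2:.1f} MiB")

x = torch.randn(1, 3, 481, 321, device="cuda")
report("after input")
# then call each stage of the network in turn, e.g.
# y = model.conv1_1(x); report("after conv1_1")
```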

ZhouCX117 commented 2 years ago

Hi, it is me again! I want to simplify your code so that I can add some revisions. However, the output becomes noise after the simplification. Could you do me a favor? I have debugged line by line but cannot find the problem. My simplified model is:

```python
# -*- coding: utf-8 -*-
import math
import torch
import torch.nn as nn
import torchvision.models as models
import torch.utils.model_zoo as model_zoo
import torch.nn.functional as F
import numpy as np
from .NetModules import MSBlock, ClsHead
from .LSTM import ConvLSTMCell, ConvLSTMCell_v2


class FCL(nn.Module):
  def __init__(self, cfg, writer):
    super(FCL, self).__init__()

    self.cfg = cfg
    self.writer = writer

    ############################ Model ###################################
    self.first_padding = nn.ReflectionPad2d(self.cfg.MODEL.first_pad)  # padding left,right,top,bottom
    ### vgg16
    backbone_mode = self.cfg.MODEL.backbone
    pretrained = self.cfg.MODEL.pretrained
    vgg16 = models.vgg16(pretrained=False)
    if backbone_mode == 'vgg16':
        vgg16 = models.vgg16(pretrained=pretrained).cuda()
        if pretrained:
            pre = torch.load('./models/vgg16_bn-6c64b313.pth')
            vgg16.load_state_dict(pre)
    elif self.cfg.MODEL.backbone == 'vgg16_bn':
        vgg16 = models.vgg16_bn(pretrained=False).cuda()
        if pretrained:
            pre = torch.load('/home/zhoucaixia/workspace/FCL-Net/pytorch_net//models/vgg16_bn-6c64b313.pth')
            vgg16.load_state_dict(pre)

    # extract conv layers from vgg
    self.conv1_1 = self.extract_layer(vgg16, backbone_mode, 1)
    self.conv1_2 = self.extract_layer(vgg16, backbone_mode, 2)
    self.conv2_1 = self.extract_layer(vgg16, backbone_mode, 3)
    self.conv2_2 = self.extract_layer(vgg16, backbone_mode, 4)
    self.conv3_1 = self.extract_layer(vgg16, backbone_mode, 5)
    self.conv3_2 = self.extract_layer(vgg16, backbone_mode, 6)
    self.conv3_3 = self.extract_layer(vgg16, backbone_mode, 7)
    self.conv4_1 = self.extract_layer(vgg16, backbone_mode, 8)
    self.conv4_2 = self.extract_layer(vgg16, backbone_mode, 9)
    self.conv4_3 = self.extract_layer(vgg16, backbone_mode, 10)
    self.conv5_1 = self.extract_layer(vgg16, backbone_mode, 11)
    self.conv5_2 = self.extract_layer(vgg16, backbone_mode, 12)
    self.conv5_3 = self.extract_layer(vgg16, backbone_mode, 13)

    # eltwise layers
    self.dsn1_1 = MSBlock(64, rate=4)
    self.dsn1_2 = MSBlock(64, rate=4)
    self.dsn2_1 = MSBlock(128, rate=4)
    self.dsn2_2 = MSBlock(128, rate=4)
    self.dsn3_1 = MSBlock(256, rate=4)
    self.dsn3_2 = MSBlock(256, rate=4)
    self.dsn3_3 = MSBlock(256, rate=4)
    self.dsn4_1 = MSBlock(512, rate=4)
    self.dsn4_2 = MSBlock(512, rate=4)
    self.dsn4_3 = MSBlock(512, rate=4)
    self.dsn5_1 = MSBlock(512, rate=4)
    self.dsn5_2 = MSBlock(512, rate=4)
    self.dsn5_3 = MSBlock(512, rate=4)

    self.dsn5 = nn.Conv2d(21, 1, 1)

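    # make the max-pool in front of conv5 stride-1 so stage 5 is not downsampled further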
    for m in self.conv5_1:
        if isinstance(m, nn.MaxPool2d):
            m.stride = 1
            m.padding = 1

    self.lstmcell_1 = ConvLSTMCell(input_channels=21, hidden_channels=1, kernel_size=3)
    self.lstmcell_2 = ConvLSTMCell(input_channels=21, hidden_channels=1, kernel_size=3)
    self.lstmcell_3 = ConvLSTMCell(input_channels=21, hidden_channels=1, kernel_size=3)
    self.lstmcell_4 = ConvLSTMCell(input_channels=21, hidden_channels=1, kernel_size=3)
    self.lstmcell_5 = ConvLSTMCell(input_channels=21, hidden_channels=1, kernel_size=3)
    # deconv layers
    self.dsn2_up = nn.ConvTranspose2d(21, 21, 4, stride=2)
    self.dsn3_up = nn.ConvTranspose2d(21, 21, 8, stride=4)
    self.dsn4_up = nn.ConvTranspose2d(21, 21, 16, stride=8)
    #self.dsn5_up = nn.ConvTranspose2d(21, 21, 16, stride=8)
    self.dsn5_up = nn.ConvTranspose2d(1, 1, 16, stride=8)
    self.lstm_up_mode = 'deconv'  # 'deconv' or 'bilinear'; changed 2020-08-15

    self.other_layers = [self.dsn1_1, self.dsn1_2,
                         self.dsn2_1, self.dsn2_2,
                         self.dsn3_1, self.dsn3_2, self.dsn3_3,
                         self.dsn4_1, self.dsn4_2, self.dsn4_3,
                         self.dsn5_1, self.dsn5_2, self.dsn5_3,
                         self.dsn5]

    self.other_layers += [self.dsn2_up, self.dsn3_up, self.dsn4_up, self.dsn5_up]

    self.other_layers += [self.conv1_1, self.conv1_2,
                              self.conv2_1, self.conv2_2,
                              self.conv3_1, self.conv3_2, self.conv3_3,
                              self.conv4_1, self.conv4_2, self.conv4_3,
                              self.conv5_1, self.conv5_2, self.conv5_3]
    # added  2020-08-24
    self.dsn1_cls = nn.Conv2d(21, 1, 1)
    self.dsn2_cls = nn.Conv2d(21, 1, 1)
    self.dsn3_cls = nn.Conv2d(21, 1, 1)
    self.dsn4_cls = nn.Conv2d(21, 1, 1)
    self.dsn5_cls = nn.Conv2d(21, 1, 1)
    self.dsn2_up_cls = nn.ConvTranspose2d(1, 1, 4, stride=2)
    self.dsn3_up_cls = nn.ConvTranspose2d(1, 1, 8, stride=4)
    self.dsn4_up_cls = nn.ConvTranspose2d(1, 1, 16, stride=8)
    self.dsn5_up_cls = nn.ConvTranspose2d(1, 1, 16, stride=8)
    self.other_layers += [self.dsn1_cls, self.dsn2_cls, self.dsn3_cls, self.dsn4_cls, self.dsn5_cls,
                            self.dsn2_up_cls, self.dsn3_up_cls, self.dsn4_up_cls, self.dsn5_up_cls]
    self.cls_head = ClsHead(5, maxmode=self.cfg.MODEL.cls_mode)  # max->softmax 1015 # softmax->max 1011
    self.new_score_weighting = nn.Conv2d(6, 1, 1)  # 0924

    ############################ Layer Initialization ###################################
    def weights_init(m):
        if isinstance(m, nn.Conv2d):
            if self.cfg.MODEL.init_mode == 'Gaussian':
                m.weight.data.normal_(0, 0.1)
                m.bias.data.normal_(0, 0.01)
            elif self.cfg.MODEL.init_mode == 'xavier':
                nn.init.xavier_normal_(m.weight.data)
                m.bias.data.fill_(0)
        elif isinstance(m, nn.ConvTranspose2d):
            m.weight.data.normal_(0, 0.2)
            m.bias.data.fill_(0)

    for each_layer in self.other_layers:
        each_layer.apply(weights_init)

    self.new_score_weighting.weight.data.fill_(0.2)
    self.new_score_weighting.bias.data.fill_(0)

  def forward(self, x):
    h, w = x.shape[2:]

    # backbone
    x = self.first_padding(x)  # padding=21: output is (h + 21*2, w + 21*2)

    ############################# pipeline #####################################
    ### conv1 ------------------------------------------------------------------
    self.conv1_1_output = self.conv1_1(x)  # spatial size unchanged, channels change
    self.conv1_2_output = self.conv1_2(self.conv1_1_output)  # spatial size unchanged, channels change

    dsn1_1 = self.dsn1_1(self.conv1_1_output)  # size unchanged, channels mapped to 21
    dsn1_2 = self.dsn1_2(self.conv1_2_output)  # size unchanged, channels mapped to 21

    ### conv2 ------------------------------------------------------------------
    self.conv2_1_output = self.conv2_1(self.conv1_2_output)  # spatial size halved
    self.conv2_2_output = self.conv2_2(self.conv2_1_output)  # same size as the previous layer, half of the original

    dsn2_1 = self.dsn2_1(self.conv2_1_output)  # map conv2_1_output from 128 to 21 channels
    dsn2_2 = self.dsn2_2(self.conv2_2_output)  # map conv2_2_output from 128 to 21 channels

    ### conv3 ------------------------------------------------------------------
    self.conv3_1_output = self.conv3_1(self.conv2_2_output)  # 1/4 of the original size
    self.conv3_2_output = self.conv3_2(self.conv3_1_output)
    self.conv3_3_output = self.conv3_3(self.conv3_2_output)
    dsn3_1 = self.dsn3_1(self.conv3_1_output)
    dsn3_2 = self.dsn3_2(self.conv3_2_output)
    dsn3_3 = self.dsn3_3(self.conv3_3_output)

    ### conv4 ------------------------------------------------------------------
    self.conv4_1_output = self.conv4_1(self.conv3_3_output)
    self.conv4_2_output = self.conv4_2(self.conv4_1_output)
    self.conv4_3_output = self.conv4_3(self.conv4_2_output)
    dsn4_1 = self.dsn4_1(self.conv4_1_output)
    dsn4_2 = self.dsn4_2(self.conv4_2_output)
    dsn4_3 = self.dsn4_3(self.conv4_3_output)

    ### conv5 ------------------------------------------------------------------
    self.conv5_1_output = self.conv5_1(self.conv4_3_output)
    self.conv5_2_output = self.conv5_2(self.conv5_1_output)
    self.conv5_3_output = self.conv5_3(self.conv5_2_output)
    ### dsn5
    dsn5_1 = self.dsn5_1(self.conv5_1_output)
    dsn5_2 = self.dsn5_2(self.conv5_2_output)
    dsn5_3 = self.dsn5_3(self.conv5_3_output)

    dsn5_up = self.dsn5(dsn5_1 + dsn5_2 + dsn5_3)
    dsn5_up = self.dsn5_up(dsn5_up)
    dsn5_final = self.crop_layer(dsn5_up, h, w)

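    # top-down refinement: each ConvLSTM cell fuses the current stage's side feature
    # with the hidden state and prediction passed down from the coarser stage above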
    dsn4_add_up = self.dsn4_up(dsn4_1 + dsn4_2 + dsn4_3)
    dsn5_up = self.crop_layer(dsn5_up, dsn4_add_up.shape[2], dsn4_add_up.shape[3])
    hs5 = None
    hs4, dsn4_up = self.lstmcell_4(dsn4_add_up, hs5, dsn5_up)
    dsn4_final = self.crop_layer(dsn4_up, h, w)

    dsn3_add_up = self.dsn3_up(dsn3_1 + dsn3_2 + dsn3_3)
    dsn4_up = self.crop_layer(dsn4_up, dsn3_add_up.shape[2], dsn3_add_up.shape[3])
    hs4 = self.crop_layer(hs4, dsn3_add_up.shape[2], dsn3_add_up.shape[3])
    hs3, dsn3_up = self.lstmcell_3(dsn3_add_up, hs4, dsn4_up)
    dsn3_final = self.crop_layer(dsn3_up, h, w)

    dsn2_add_up = self.dsn2_up(dsn2_1 + dsn2_2)
    dsn3_up = self.crop_layer(dsn3_up, dsn2_add_up.shape[2], dsn2_add_up.shape[3])
    hs3 = self.crop_layer(hs3, dsn2_add_up.shape[2], dsn2_add_up.shape[3])
    hs2, dsn2_up = self.lstmcell_2(dsn2_add_up, hs3, dsn3_up)
    dsn2_final = self.crop_layer(dsn2_up, h, w)

    dsn1_add_up = dsn1_1 + dsn1_2
    dsn2_up = self.crop_layer(dsn2_up, dsn1_add_up.shape[2], dsn1_add_up.shape[3])
    hs2 = self.crop_layer(hs2, dsn1_add_up.shape[2], dsn1_add_up.shape[3])
    hs1, dsn1_up = self.lstmcell_1(dsn1_add_up, hs2, dsn2_up)
    dsn1_final = self.crop_layer(dsn1_up, h, w)

    p1_1 = dsn1_final
    p2_1 = dsn2_final
    p3_1 = dsn3_final
    p4_1 = dsn4_final
    p5_1 = dsn5_final

    concat = torch.cat((p1_1, p2_1, p3_1, p4_1, p5_1), 1)

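    # side classification scores: each stage's fused 21-channel feature is reduced to
    # 1 channel, upsampled, and cropped to a common size for the attention head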
    dsn1_cls_score_up = self.dsn1_cls(dsn1_1 + dsn1_2)
    dsn2_cls_score = self.dsn2_cls(dsn2_1 + dsn2_2)
    dsn2_cls_score_up = self.dsn2_up_cls(dsn2_cls_score)
    dsn3_cls_score = self.dsn3_cls(dsn3_1 + dsn3_2 + dsn3_3)
    dsn3_cls_score_up = self.dsn3_up_cls(dsn3_cls_score)
    dsn4_cls_score = self.dsn4_cls(dsn4_1 + dsn4_2 + dsn4_3)
    dsn4_cls_score_up = self.dsn4_up_cls(dsn4_cls_score)
    dsn5_cls_score = self.dsn5_cls(dsn5_1 + dsn5_2 + dsn5_3)
    dsn5_cls_score_up = self.dsn5_up_cls(dsn5_cls_score)

    score = [dsn1_cls_score_up, dsn2_cls_score_up, dsn3_cls_score_up, dsn4_cls_score_up, dsn5_cls_score_up]
    min_h = min([i.shape[2] for i in score])
    min_w = min([i.shape[3] for i in score])
    for i in range(len(score)):
        score[i] = self.crop_layer(score[i], min_h, min_w)

    concat_score = torch.cat(score, 1)
    score_final = self.cls_head(concat_score)
    score_final = self.crop_layer(score_final, h, w)
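    # dsn6: fuse the five side outputs with per-pixel weights from the cls head;
    # dsn7: a 1x1 conv over the five sides plus dsn6 (six channels in total)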
    dsn6_final = torch.sum(concat * score_final, axis=1).unsqueeze(0)
    concat = torch.cat((concat, dsn6_final), 1)  # 0924
    dsn7_final = self.new_score_weighting(concat)

    return p1_1, p2_1, p3_1, p4_1, p5_1, dsn6_final, dsn7_final  # 0902 add dsn7_final

  def train(self, mode=True):
    """
    Override the default train() to freeze the BN parameters
    """
    super(FCL, self).train(mode)

    contain_bn_layers = [self.conv1_1, self.conv1_2,
                         self.conv2_1, self.conv2_2,
                         self.conv3_1, self.conv3_2, self.conv3_3,
                         self.conv4_1, self.conv4_2, self.conv4_3,
                         self.conv5_1, self.conv5_2, self.conv5_3]

    if self.cfg.MODEL.freeze_bn:
        # print("----Freezing Mean/Var of BatchNorm2D.")

        for each_block in contain_bn_layers:
            for m in each_block.modules():
                if isinstance(m, nn.BatchNorm2d):
                    # print("---- in bn layer")
                    # print(m)
                    m.eval()

                    if self.cfg.MODEL.freeze_bn_affine:
                        # print("---- Freezing Weight/Bias of BatchNorm2D.")
                        m.weight.requires_grad = False
                        m.bias.requires_grad = False

  def extract_layer(self, model, backbone_mode, ind):
    index_dict = {}
    if backbone_mode == 'vgg16':
        index_dict = {
            1: (0, 4),
            2: (4, 9),
            3: (9, 16),
            4: (16, 23),
            5: (23, 30)}
    elif backbone_mode == 'vgg16_bn':
        index_dict = {
            1: (0, 3),
            2: (3, 6),
            3: (6, 10),
            4: (10, 13),
            5: (13, 17),
            6: (17, 20),
            7: (20, 23),
            8: (23, 27),
            9: (27, 30),
            10: (30, 33),
            11: (33, 37),
            12: (37, 40),
            13: (40, 43)}  # each block ends at the ReLU

    start, end = index_dict[ind]
    modified_model = nn.Sequential(*list(model.features.children())[start:end])

    on = False
    if on:
        for m in modified_model:
            if isinstance(m, nn.MaxPool2d):
                m.ceil_mode = True

    ### dilated conv
    # for m in modified_model:
    # if isinstance(m, nn.Conv2d):
    # m.dilation = (2, 2)
    # m.padding = (2, 2)

    return modified_model

  def make_bilinear_weights(self, size, num_channels):
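    # build a fixed (num_channels, num_channels, size, size) bilinear kernel;
    # only diagonal (i == j) entries are non-zero, so channels do not mix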
    factor = (size + 1) // 2
    if size % 2 == 1:
        center = factor - 1
    else:
        center = factor - 0.5
    og = np.ogrid[:size, :size]
    filt = (1 - abs(og[0] - center) / factor) * (1 - abs(og[1] - center) / factor)
    # print(filt)
    filt = torch.from_numpy(filt)
    w = torch.zeros(num_channels, num_channels, size, size)
    w.requires_grad = False
    for i in range(num_channels):
        for j in range(num_channels):
            if i == j:
                w[i, j] = filt
    return w

  def crop_layer(self, x, h, w):
    input_h, input_w = x.shape[2:]
    ref_h, ref_w = h, w

    assert input_h >= ref_h, "input_h should be no smaller than ref_h"
    assert input_w >= ref_w, "input_w should be no smaller than ref_w"

    # h_start = math.floor( (input_h - ref_h) / 2 )
    # w_start = math.floor( (input_w - ref_w) / 2 )
    h_start = int(round((input_h - ref_h) / 2))
    w_start = int(round((input_w - ref_w) / 2))
    x_new = x[:, :, h_start:h_start + ref_h, w_start:w_start + ref_w]

    return x_new

```

DREAMXFAR commented 2 years ago

I checked the logic of the code and it seems right, so the problem may lie in some detail. When I have spare time, I will revisit the code here. In the meantime, a suggestion: since you have the original code that works correctly, you can compare the value of each variable and check where they diverge. Note that a fixed random seed is needed for repeatable results.
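For the comparison, here is a minimal sketch of a generic debugging pattern with forward hooks (the tiny models below only stand in for the original and simplified networks; this is not code from this project):

```python
import torch
import torch.nn as nn

def record(store, name):
    def hook(module, inp, out):
        store[name] = out.detach()
    return hook

# two copies built from the same seed stand in for the two implementations;
# register hooks on the layers whose activations you want to compare
torch.manual_seed(1)
model_a = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())
torch.manual_seed(1)
model_b = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())

acts_a, acts_b = {}, {}
model_a[0].register_forward_hook(record(acts_a, "conv"))
model_b[0].register_forward_hook(record(acts_b, "conv"))

x = torch.randn(1, 3, 32, 32)
model_a(x); model_b(x)
print(torch.allclose(acts_a["conv"], acts_b["conv"]))  # expect True
```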

ZhouCX117 commented 2 years ago

Thanks for your help. I fixed the random seed, but the random initialization of the layers dsn1_1, dsn2_1, ... is still different, and I can't solve it. The random seed code is as follows:

```python
random_seed = cfg.TRAIN.random_seed
if random_seed > 0:
    os.environ["PYTHONHASHSEED"] = str(random_seed)
    random.seed(random_seed)
    torch.manual_seed(random_seed)
    torch.cuda.manual_seed(random_seed)
    torch.cuda.manual_seed_all(random_seed)
    numpy.random.seed(random_seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```

DREAMXFAR commented 2 years ago

The code you listed does fix the random seed. In general, the initializations of dsn1_1 and dsn2_1 will differ from each other, because the random state advances each time it is consumed; however, the parameters of dsn1_1 will be identical across runs, and so will those of dsn2_1. These fixed initializations of dsn1_1, dsn2_1, etc. are enough to compare the original code against your simplified version.

If you are still confused by the random seed settings, you can simply initialize dsn1_1.weight and dsn1_1.bias as constants.

An example for illustration:

```python
import numpy as np

# different output: the seed is set once, so the state advances on each draw
np.random.seed(1)
for i in range(3):
    print(np.random.random())

# the same output: the seed is reset before every draw
for i in range(3):
    np.random.seed(1)
    print(np.random.random())
```
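And here is a minimal sketch of the constant initialization (using the standard `nn.init` API; the layer below is only a stand-in for dsn1_1, not code from this project):

```python
import torch.nn as nn

conv = nn.Conv2d(64, 21, 1)          # stands in for a layer such as dsn1_1
nn.init.constant_(conv.weight, 0.1)  # identical across runs by construction
nn.init.constant_(conv.bias, 0.0)
```
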
ZhouCX117 commented 2 years ago

Your suggestion is valuable, thanks again! I initialized dsn1_1.weight and dsn1_1.bias as constants and now get the same values and the same loss. However, the result is strange, and I have no idea why! Could you give me some suggestions? I also don't know how to send files to you. The results are as follows. (two result images attached)

DREAMXFAR commented 2 years ago

I wonder why you say the outputs are strange; you'd better provide more details. I don't think the results here are strange, since edges are detected. For convenience, you can email me at 1216624099@qq.com.