Question about ablation studies in paper

qin2294096 commented 3 years ago

Hi, thanks for your work and repo. I'm very interested in VFL, which combines classification scores and location scores in the targets. I have some questions about it.

In Table 3 of the paper, the first row shows the results of the raw VFNet trained with the focal loss. What is the raw VFNet? Is it FCOS+ATSS with the centerness branch removed?
If not, have you compared the performance of applying VFL to FCOS+ATSS with the centerness branch removed against applying FL to FCOS+ATSS (with the centerness branch)?

Thank you very much!

hyz-xmaster commented 3 years ago

Hi, thank you for your interest in our work.

Yes, your understanding is correct. The raw VFNet is FCOS+ATSS without the centerness branch.
You can see from our paper that FCOS+ATSS with FL achieves 39.2 AP (Table 1) and the raw VFNet + VFL achieves 40.1 AP (Table 3).

qin2294096 commented 3 years ago

Oh, that means we don't need to use centerness in FCOS when applying VFL. So it is a very good loss function!

Thanks for your fast reply.

hyz-xmaster commented 3 years ago

My pleasure.

zzzmm1 commented 3 years ago

Hi, thanks for your good work! Could you please provide demo code for the experiments in Table 1, e.g. the AP of 74.7 obtained using gt_cls_iou?

hyz-xmaster commented 3 years ago

Hi, the implementation is tricky and making it public needs a lot of work. I can tell you roughly how I achieved it and share some pieces of the code.

The main idea is to use the training phase to get the ground-truth values.

Download the pre-trained FCOS model and use it as the starting point of training.
Set the learning rate of all model parameters to 0.0 during training, except for a few unimportant parameters whose lr can be set to a very small value, say 0.0000001, so that training still proceeds with some parameters being optimized. (A simpler alternative is to set the overall lr to 0.0.) In effect, training with lr = 0.0 changes neither the model's parameters nor its performance. This step can be done in the train.py of MMDetection.
Use the val2017 split as the training set in the config file.
Do some work in the loss computation function of fcos_head. Here is my example code, which I implemented in MMDetection v1.1.0; it is only meant to aid your understanding.

        # # replace cls_scores with real IoUs to see the performance
        a_ious = bbox_overlaps(pos_decoded_bbox_preds.detach(), pos_decoded_target_preds.detach(),
                              is_aligned=True).clamp(min=1e-6)
        a_ious_logits = torch.log(a_ious/(1.0 - a_ious))
        a_label_targets = flatten_labels[pos_inds]
        flatten_cls_scores_new = flatten_cls_scores.detach().clone()
        flatten_cls_scores_new[pos_inds, a_label_targets-1] = a_ious_logits

        # flatten_bbox_preds_new = flatten_bbox_preds.detach().clone()
        # flatten_bbox_preds_new[pos_inds] = pos_bbox_targets

        # flatten_centerness_new = flatten_centerness.detach().clone()
        # # global replacement
        # flatten_centerness_new[:] = torch.log(flatten_centerness_new.new_tensor(0.000000001))
        # flatten_centerness_new[pos_inds] = torch.log(pos_centerness_targets /
        #                                              (1.0 - pos_centerness_targets + 0.000001))

        # # the number of points per img, per lvl
        num_points = [points.size(0) for points in all_level_points]
        cls_score_list = list(flatten_cls_scores_new.detach().split(num_points, 0))
        bbox_pred_list = list(flatten_bbox_preds.detach().split(num_points, 0))
        centerness_list = list(flatten_centerness.detach().split(num_points, 0))
        img_shape = img_metas[0]['img_shape']
        scale_factor = img_metas[0]['scale_factor']
        cfg.score_thr = 0.05
        cfg.nms = dict(type='nms', iou_thr=0.5)
        cfg.max_per_img = 100
        result_list = []
        det_results = self.get_bboxes_single_analysis(cls_score_list, bbox_pred_list, centerness_list,
                                                     all_level_points, img_shape, scale_factor, cfg)
        result_list.append(det_results)
        bbox_results = [
            bbox2result(det_bboxes, det_labels, self.num_classes)
            for det_bboxes, det_labels in result_list
        ]
        img_name = img_metas[0]['filename'].split('/')[-1][:-4]
        # open() does not expand '~', so expand it explicitly (needs `import os`)
        save_name = os.path.expanduser(
            '~/mmdet/work_dirs/fcos_analysis_gt_cls_iou/' + img_name + '.pt')
        with open(save_name, 'wb') as f:
            pickle.dump(bbox_results, f)

    def get_bboxes_single_analysis(self,
                                   cls_scores,
                                   bbox_preds,
                                   centernesses,
                                   mlvl_points,
                                   img_shape,
                                   scale_factor,
                                   cfg,
                                   rescale=True):
        assert len(cls_scores) == len(bbox_preds) == len(mlvl_points)
        mlvl_bboxes = []
        mlvl_scores = []
        mlvl_centerness = []
        for cls_score, bbox_pred, centerness, points in zip(
                cls_scores, bbox_preds, centernesses, mlvl_points):
            assert cls_score.size(0) == bbox_pred.size(0)
            scores = cls_score.detach().sigmoid()
            centerness = centerness.detach().sigmoid()
            bbox_pred = bbox_pred.detach()

            nms_pre = cfg.get('nms_pre', -1)
            if nms_pre > 0 and scores.shape[0] > nms_pre:
                max_scores, _ = (scores * centerness[:, None]).max(dim=1)
                # max_scores, _ = scores.max(dim=1)
                _, topk_inds = max_scores.topk(nms_pre)
                points = points[topk_inds, :]
                bbox_pred = bbox_pred[topk_inds, :]
                scores = scores[topk_inds, :]
                centerness = centerness[topk_inds]
            bboxes = distance2bbox(points, bbox_pred, max_shape=img_shape)
            mlvl_bboxes.append(bboxes)
            mlvl_scores.append(scores)
            mlvl_centerness.append(centerness)
        mlvl_bboxes = torch.cat(mlvl_bboxes)
        if rescale:
            mlvl_bboxes /= mlvl_bboxes.new_tensor(scale_factor)
        mlvl_scores = torch.cat(mlvl_scores)
        padding = mlvl_scores.new_zeros(mlvl_scores.shape[0], 1)
        mlvl_scores = torch.cat([padding, mlvl_scores], dim=1)
        mlvl_centerness = torch.cat(mlvl_centerness)
        det_bboxes, det_labels, det_probs = multiclass_nms(
            mlvl_bboxes,
            mlvl_scores,
            cfg.score_thr,
            cfg.nms,
            cfg.max_per_img,
            score_factors=mlvl_centerness)
        return det_bboxes, det_labels

Merge those per-image results and convert them into the format that the evaluation code requires.
Finally, evaluate it!

Hope this helps and good luck.
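To see why running the training loop with lr = 0.0 leaves the model untouched, here is a minimal sketch of a momentum SGD update in plain Python. The `sgd_step` helper is hypothetical (it is not MMDetection code) and assumes the usual PyTorch-style update rule, where weight decay and momentum are both applied before the result is scaled by the learning rate.

```python
def sgd_step(param, grad, buf, lr, momentum=0.9, weight_decay=1e-4):
    """One PyTorch-style SGD update for a single scalar parameter."""
    g = grad + weight_decay * param   # weight decay is folded into the gradient
    buf = momentum * buf + g          # the momentum buffer still accumulates
    param = param - lr * buf          # the only place the weight can change
    return param, buf


w, buf = 0.5, 0.0
for grad in (0.3, -1.2, 4.0):
    w, buf = sgd_step(w, grad, buf, lr=0.0)

print(w)  # 0.5: the weight is unchanged although gradients were non-zero
```

Note that this only covers parameters updated by the optimizer. Layers that keep running statistics (for example BatchNorm in training mode) would still change, so the sketch assumes such layers are frozen or absent.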

feiyuhuahuo commented 3 years ago

Hi @hyz-xmaster, for the comparisons in Table 5, does VFL mean only the loss in Equation 2, or the loss in Equation 2 + Star DConv + BBox refinement?

hyz-xmaster commented 3 years ago

Hi @feiyuhuahuo, in Table 5, VFL means only the loss. In the last row, VFNet + VFL = VFL loss + Star DConv + BBox refinement, and similarly VFNet + FL = FL loss + Star DConv + BBox refinement.
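For readers without the paper at hand, the loss itself can be sketched in a few lines of plain Python. This is my reading of Equation 2, not the repository's implementation: `p` is the predicted IoU-aware classification score, `q` is the target (the IoU with the ground-truth box for a positive sample, 0 for a negative one), and the default values of `alpha` and `gamma` are assumptions.

```python
import math


def varifocal_loss(p, q, alpha=0.75, gamma=2.0):
    """Per-sample loss for a predicted score p in (0, 1) and a target q in [0, 1]."""
    if q > 0:
        # positive sample: binary cross-entropy weighted by the target q
        return -q * (q * math.log(p) + (1.0 - q) * math.log(1.0 - p))
    # negative sample: down-weighted by alpha * p**gamma, as in focal loss
    return -alpha * (p ** gamma) * math.log(1.0 - p)


# A high-IoU positive contributes more than a low-IoU positive at the same prediction
print(varifocal_loss(0.5, 0.9) > varifocal_loss(0.5, 0.3))
```

The asymmetry is the point of the design: negatives are suppressed the way focal loss does it, while positives are weighted by their target IoU instead of being down-weighted.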

youngwanLEE commented 3 years ago

@hyz-xmaster Hi,

I'm confused about the comparison in Table 1 and Table 3.

Are the first row of Table 3, raw VFL (39.0 AP), and the first column of Table 1, FCOS + ATSS w/o ctr (38.5 AP), the same model?

hyz-xmaster commented 3 years ago

Hi @youngwanLEE, raw VFL (39.0 AP) in Table 3 has the same structure as FCOS+ATSS without the centerness branch, and it is retrained. FCOS + ATSS w/o ctr (38.5 AP) in Table 1 means that FCOS + ATSS is trained with the centerness branch but the centerness scores are not used in inference. So the difference is whether the centerness branch is used in training.

yingyu13 commented 3 years ago

@hyz-xmaster, hi, I wonder why you use the log function for iou_logits and centerness when replacing the values with ground-truth targets in the example code.

hyz-xmaster commented 3 years ago

Hi @yingyu13 , log(x / (1 - x)) is the inverse function of sigmoid.
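A quick check of that identity in plain Python, as a minimal sketch: the detection head applies sigmoid to its raw outputs, so a ground-truth value has to be written into the logits as log(x / (1 - x)) to come back out unchanged. The `logit` helper and its `eps` clamp are illustrative assumptions; the example code earlier in the thread clamps only the lower bound.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(x, eps=1e-6):
    # clamp away from 0 and 1 so the logarithm stays finite (eps is an assumed value)
    x = min(max(x, eps), 1.0 - eps)
    return math.log(x / (1.0 - x))


for iou in (0.1, 0.5, 0.9):
    # sigmoid(logit(iou)) recovers the original IoU up to floating-point error
    assert abs(sigmoid(logit(iou)) - iou) < 1e-9
```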

yingyu13 commented 3 years ago

Thanks! I got it.

Icecream-blue-sky commented 2 years ago

How do I merge those per-image results and convert them into the format that the evaluation code requires? How do I align each per-image result with the corresponding gt bboxes? I have successfully reimplemented stages 1-4 but got stuck at stage 5. Could you provide the corresponding code for stage 5?

hyz-xmaster commented 2 years ago

Hi, this is the code for my experiment which was done with mmdetection v1.1. You may write your own based on it. Hope this helps.

import os
import pickle

import mmcv
import numpy as np
from mmdet.datasets import build_dataset
from pycocotools.coco import COCO

# Ground-truth annotations, used here only to enumerate the image ids
coco_gt_file = './mmdet/data/coco/annotations/instances_val2017.json'
coco = COCO(coco_gt_file)

# Directory holding the per-image .pt files pickled in the loss function
res_dir = './mmdet/work_dirs/fcos_analysis/fcos_atss_analysis_gt_iou_cls_score/'
img_ids = sorted(coco.getImgIds())
results = []
for img_id in img_ids:
    # strip the 4-character extension (e.g. '.jpg') from the file name
    res_name = os.path.join(res_dir, coco.imgs[img_id]['file_name'][:-4] + '.pt')
    if not os.path.exists(res_name):
        # no saved result for this image: one empty (0, 5) array per COCO class
        results.append([np.empty((0, 5)) for _ in range(80)])
        continue
    with open(res_name, 'rb') as fp:
        res = pickle.load(fp)
    # each file stores a one-element list holding that image's per-class results
    results.append(res[0])

# Build the test dataset from the config and run its evaluation on the merged results
config_file = './mmdet/configs/fcos/fcos_analysis_r50_caffe_fpn_gn_1x_4gpu.py'
cfg = mmcv.Config.fromfile(config_file)
cfg.data.test.test_mode = True
dataset = build_dataset(cfg.data.test)
dataset.evaluate(results)

Icecream-blue-sky commented 2 years ago

Thanks!

Icecream-blue-sky commented 2 years ago

Why is the result arranged by img_id (I mean img_ids = sorted(coco.getImgIds()))? Is it because the annotation info from dataset = build_dataset(cfg.data.test) is arranged by img_id?

hyz-xmaster commented 2 years ago

To make the order of the results consistent with the order of the ground truth used by the evaluation code, I think you also need to add a similar line, img_ids = sorted(coco.getImgIds()), to the COCO API used by the evaluation code.
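The ordering requirement can be illustrated with a small sketch in plain Python: the i-th entry of the results list has to describe the i-th image the evaluator iterates over. The `align_results` helper and the toy data are hypothetical, and strings stand in for the real per-class box arrays.

```python
def align_results(results_by_id, eval_img_ids, num_classes=80):
    """Order per-image results to match the evaluator's image order."""
    aligned = []
    for img_id in eval_img_ids:
        if img_id in results_by_id:
            aligned.append(results_by_id[img_id])
        else:
            # image with no saved detections: one empty list per class
            aligned.append([[] for _ in range(num_classes)])
    return aligned


# Toy data with two classes; results were saved for images 42 and 7 only
saved = {42: [['box_a'], []], 7: [[], ['box_b']]}
ordered = align_results(saved, eval_img_ids=sorted([42, 7, 13]), num_classes=2)
# ordered[0] belongs to image 7, ordered[1] to image 13 (empty), ordered[2] to image 42
```

If the evaluator walks the images in a different order than the one used to build the list, detections are scored against the wrong ground truth, which typically shows up as a very low AP.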

Icecream-blue-sky commented 2 years ago

All previous problems have been solved. However, I don't get the expected results. Actually, I am experimenting on RetinaNet instead of FCOS. I find that, even when I don't make any changes to the outputs, the evaluation results are much worse than the testing results of the pretrained model. I'm wondering whether the transforms are the cause (I mean that the transforms in train_pipeline differ from those in test_pipeline). Did you change the transforms of train_pipeline to match those of test_pipeline? I just use the train_pipeline:
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
There is no MultiScaleFlipAug in the train_pipeline. My results for the resnet_50_fpn_1x RetinaNet are below; I just evaluate the outputs available in the loss() function without making any changes:
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.141
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.242
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.137
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.088
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.138
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.200
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.188
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.298
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.317
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.185
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.306
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.451
Raw AP(0.5:0.95)=0.359

hyz-xmaster commented 2 years ago

No, I didn't change these configurations. Given your results, I guess there might be something wrong in some step of your experiment.

Icecream-blue-sky commented 2 years ago

Maybe, but I only made some small changes, shown below.

In anchor_head.py:

def loss(self,
         cls_scores,
         bbox_preds,
         gt_bboxes,
         gt_labels,
         img_metas,
         gt_bboxes_ignore=None,
         **kwargs):
    self.save_bboxes_batch(cls_scores, bbox_preds, img_metas)
    ...

# RetinaNet upper-bound experiment
def save_bboxes_batch(self, cls_scores, bbox_preds, img_metas):
    import os
    import pickle
    from mmdet.core import bbox2result

    results_list = self.get_bboxes(cls_scores, bbox_preds, img_metas, rescale=True)
    # per-class results for each image
    bbox_results = [
        bbox2result(det_bboxes, det_labels, self.num_classes)
        for det_bboxes, det_labels in results_list
    ]
    # NOTE: the file is named after the first image only, so this assumes one image per batch
    img_name = img_metas[0]['filename'].split('/')[-1][:-4]
    save_name = '/gpfs/home/sist/tqzouustc/code/mmdetection/retinanet_analysis_wo_change_noflip/' + img_name + '.pt'
    # create the output directory if it does not exist yet
    os.makedirs(os.path.dirname(save_name), exist_ok=True)
    with open(save_name, 'wb') as f:
        pickle.dump(bbox_results, f)

I don't change cls_scores or bbox_preds. The self.get_bboxes function is the original one in anchor_head.py, and I haven't changed anything in it. I also haven't changed any other part of the loss() function. So what is wrong with my pipeline? I have been stuck here for several days, and it is really frustrating...

hyz-xmaster commented 2 years ago

I can't figure out exactly which step is wrong in your code, but my wild guess is that you should not use the self.get_bboxes function. Instead, you should use self._get_bboxes and make some changes to it, since in my code I use the self.get_bbox_single (its name has been changed to self._get_bboxes in the current version of mmdet), not self.get_bboxes. I can't remember why I didn't use self.get_bboxes, though.

hyz-xmaster commented 2 years ago

@Icecream-blue-sky, sorry, it suddenly came to my mind that you need to change this line: dict(type='RandomFlip', flip_ratio=0.5) to dict(type='RandomFlip', flip_ratio=0.0) so that the image is not flipped.
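A minimal sketch of that change, assuming an MMDetection-style pipeline expressed as a list of dicts. The `disable_random_flip` helper is hypothetical; editing the config line by hand has the same effect. The likely reason it matters: boxes predicted on a horizontally flipped image live in flipped coordinates, and the saved results are only rescaled, not flipped back, so roughly half of the images would be scored against mismatched ground truth.

```python
import copy


def disable_random_flip(pipeline):
    """Return a copy of an MMDetection-style pipeline with flipping turned off."""
    new_pipeline = copy.deepcopy(pipeline)
    for step in new_pipeline:
        if step.get('type') == 'RandomFlip':
            step['flip_ratio'] = 0.0  # never flip, so boxes stay in image coordinates
    return new_pipeline


train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='RandomFlip', flip_ratio=0.5),
]
eval_pipeline = disable_random_flip(train_pipeline)
```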

Icecream-blue-sky commented 2 years ago

Thanks!

Icecream-blue-sky commented 2 years ago

You are right! It seems that the bug has been fixed! Thanks for your kind help!

hyz-xmaster commented 2 years ago

Great to hear that. My pleasure.

hyz-xmaster / VarifocalNet

Question about ablation studies in paper #3