SysCV / transfiner

Mask Transfiner for High-Quality Instance Segmentation, CVPR 2022
https://www.vis.xyz/pub/transfiner
Apache License 2.0

Difference in the results from HF demo and Colab #22

Closed vanga closed 2 years ago

vanga commented 2 years ago

Hi,

I am trying this model on Colab and I am seeing a difference in the results compared to the Hugging Face demo.

I tried multiple variants (R101-3x-deform, R50, R50-3x, R50-3x-deform). The Hugging Face results seem superior to me w.r.t. mask accuracy. I downloaded the models from the drive links in the readme.

Here is my code:

import cv2
from google.colab.patches import cv2_imshow
from detectron2.config import get_cfg
from detectron2.data import MetadataCatalog
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer

config_file = "/content/transfiner/configs/transfiner/mask_rcnn_R_50_FPN_3x.yaml"

cfg = get_cfg()
cfg.merge_from_file(config_file)
cfg.MODEL.WEIGHTS = "/content/transfiner/pretrained_models/output_3x_transfiner_r50.pth"

# score thresholds used at inference time
cfg.MODEL.RETINANET.SCORE_THRESH_TEST = 0.5
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.MODEL.PANOPTIC_FPN.COMBINE.INSTANCES_CONFIDENCE_THRESH = 0.5
cfg.freeze()

im = cv2.imread("/content/img_000124785.jpg")
predictor = DefaultPredictor(cfg)
outputs = predictor(im)

# draw the predicted instances (BGR -> RGB for the Visualizer)
v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2_imshow(out.get_image()[:, :, ::-1])

While running the prediction, I see this notice in the logs, which looks suspicious:

The checkpoint state_dict contains keys that are not used by the model:
  roi_heads.mask_pooler.conv_norm_relus_semantic.0.{bias, weight}
  roi_heads.mask_pooler.conv_norm_relus_semantic.2.{bias, weight}
  roi_heads.mask_pooler.conv_norm_relus_semantic.4.{bias, weight}
  roi_heads.mask_pooler.conv_norm_relus_semantic.6.{bias, weight}
  roi_heads.mask_head.deconv_bo.{bias, weight}
  roi_heads.mask_head.predictor_bo.{bias, weight}
  roi_heads.mask_head.encoder.layers.0.self_attn.{in_proj_bias, in_proj_weight}
  roi_heads.mask_head.encoder.layers.0.self_attn.out_proj.{bias, weight}
  roi_heads.mask_head.encoder.layers.0.linear1.{bias, weight}
  roi_heads.mask_head.encoder.layers.0.linear2.{bias, weight}
  roi_heads.mask_head.encoder.layers.0.norm1.{bias, weight}
  roi_heads.mask_head.encoder.layers.0.norm2.{bias, weight}
  roi_heads.mask_head.encoder.layers.1.self_attn.{in_proj_bias, in_proj_weight}
  roi_heads.mask_head.encoder.layers.1.self_attn.out_proj.{bias, weight}
  roi_heads.mask_head.encoder.layers.1.linear1.{bias, weight}
  roi_heads.mask_head.encoder.layers.1.linear2.{bias, weight}
  roi_heads.mask_head.encoder.layers.1.norm1.{bias, weight}
  roi_heads.mask_head.encoder.layers.1.norm2.{bias, weight}
  roi_heads.mask_head.encoder.layers.2.self_attn.{in_proj_bias, in_proj_weight}
  roi_heads.mask_head.encoder.layers.2.self_attn.out_proj.{bias, weight}
  roi_heads.mask_head.encoder.layers.2.linear1.{bias, weight}
  roi_heads.mask_head.encoder.layers.2.linear2.{bias, weight}
  roi_heads.mask_head.encoder.layers.2.norm1.{bias, weight}
  roi_heads.mask_head.encoder.layers.2.norm2.{bias, weight}
  roi_heads.mask_head.encoder.conv_fuse.{bias, weight}
  roi_heads.mask_head.encoder.conv_r1.0.{bias, weight}
  roi_heads.mask_head.encoder.conv_r1.2.{bias, weight}
  roi_heads.mask_head.mask_fcn_uncertain1.{bias, weight}
  roi_heads.mask_head.mask_fcn_uncertain2.{bias, weight}
  roi_heads.mask_head.mask_fcn_uncertain3.{bias, weight}
  roi_heads.mask_head.mask_fcn_uncertain4.{bias, weight}
  roi_heads.mask_head.deconv_uncertain.{bias, weight}
  roi_heads.mask_head.predictor_uncertain.{bias, weight}
  roi_heads.mask_head.predictor_semantic_s.{bias, weight}

Could this be the reason for the difference in results? In general, the mask accuracy does not look good compared to what I see in the Hugging Face demo, even with the R101-3x-deform model, which has the highest mAP.

Any idea what might be happening here?

TIA.

lkeab commented 2 years ago

"The checkpoint state_dict contains keys that are not used by the model:..." This suggests that the pretrained model and your mask head file has a dismatch. Can you make sure you install transfiner correctly? I guess you are using the default mask head by mrcnn provided by detectron2.

vanga commented 2 years ago

Thanks. I installed detectron2 using pip first and then installed transfiner with python3 setup.py build develop. Let me follow the instructions in this repo fully and try again (I remember I had some installation issues on Colab, which is why I installed detectron2 separately). Are you aware of any compatibility issues with different detectron2 versions?

vanga commented 2 years ago

It does indeed seem to be a dependency compatibility issue. Earlier I was installing detectron2 first, which pulls in a bunch of dependencies, unlike the instructions in the readme of this repo; that somehow led to the model not being loaded properly. The reason I was doing that is that the package "omegaconf" gets installed by the transfiner installation script at a location that is not visible in Colab.

Things work as expected if I don't install detectron2 first. And the output masks are much, much better :)
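
For reference, roughly the Colab setup that worked (a sketch rather than the exact readme steps; the clone URL is inferred from the repo name and the build command is the one mentioned above):

# Colab cell (sketch): build transfiner's bundled, modified detectron2
# instead of pip-installing the stock detectron2 beforehand.
!git clone https://github.com/SysCV/transfiner.git
%cd transfiner
!python3 setup.py build develop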

I would still like to understand where the problem is exactly, but please feel free to close this issue.

lkeab commented 2 years ago

When you install detectron2 first and use "from detectron2 import xxx", the files from the original detectron2 package are invoked instead of our modified mask head file.
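
A quick way to verify which detectron2 is being picked up (a minimal sketch; it assumes the repo is cloned to /content/transfiner as in the code above):

import detectron2
from detectron2.modeling.roi_heads import mask_head

# Both paths should point into the cloned transfiner repo,
# not into a pip-installed site-packages copy of detectron2.
print(detectron2.__file__)
print(mask_head.__file__)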

vanga commented 2 years ago

Hi @lkeab

While the outputs are much better, I still see these logs when running demo.py in Colab. I suppose this is also not supposed to happen; any idea what might be wrong (note the lines about roi_heads.mask_head*)?

fpn input shapes: {'res2': ShapeSpec(channels=256, height=None, width=None, stride=4), 'res3': ShapeSpec(channels=512, height=None, width=None, stride=8), 'res4': ShapeSpec(channels=1024, height=None, width=None, stride=16), 'res5': ShapeSpec(channels=2048, height=None, width=None, stride=32)}
[06/15 07:28:33 fvcore.common.checkpoint]: [Checkpointer] Loading from ./pretrained_models/output_3x_transfiner_r101_deform.pth ...
WARNING [06/15 07:28:33 fvcore.common.checkpoint]: Some model parameters or buffers are not found in the checkpoint:
roi_heads.mask_head.deconv_bo.{bias, weight}
roi_heads.mask_head.predictor_bo.{bias, weight}
  0% 0/1 [00:00<?, ?it/s]/content/transfiner/detectron2/modeling/roi_heads/fast_rcnn.py:154: UserWarning: This overload of nonzero is deprecated:
    nonzero()
Consider using one of the following signatures instead:
    nonzero(*, bool as_tuple) (Triggered internally at  /pytorch/torch/csrc/utils/python_arg_parser.cpp:882.)
  filter_inds = filter_mask.nonzero()
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:3063: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  "See the documentation of nn.Upsample for details.".format(mode))

configs/transfiner/mask_rcnn_R_101_FPN_3x_deform.yaml is the config being used. I have followed the instructions from the readme, and the Python version is also 3.7. Could this still be because of some installation issue?

lkeab commented 2 years ago

Are you using the correct model? "Some model parameters or buffers are not found in the checkpoint:" indicates a mismatch between the config file and the model weights.
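
One way to check for such a mismatch directly (a sketch, using the config and checkpoint paths from this thread) is to diff the checkpoint keys against the keys of the model built from the config:

import torch
from detectron2.config import get_cfg
from detectron2.modeling import build_model

cfg = get_cfg()
cfg.merge_from_file("configs/transfiner/mask_rcnn_R_101_FPN_3x_deform.yaml")
cfg.MODEL.DEVICE = "cpu"  # no GPU needed just to compare keys
model = build_model(cfg)

# detectron2 checkpoints usually store the weights under a "model" key
ckpt = torch.load("./pretrained_models/output_3x_transfiner_r101_deform.pth", map_location="cpu")
state_dict = ckpt.get("model", ckpt)

model_keys = set(model.state_dict().keys())
ckpt_keys = set(state_dict.keys())
print("in model but missing from checkpoint:", sorted(model_keys - ckpt_keys))
print("in checkpoint but unused by the model:", sorted(ckpt_keys - model_keys))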

vanga commented 2 years ago

I downloaded the latest checkpoint from Drive and I don't see these warnings any longer. Thanks.

I was using a checkpoint that I had downloaded a couple of months back (April). I am not sure if something changed or if I downloaded it incorrectly.