How to use the trained ViTDet model for inference and visualize the inference results?

I used cascade_mask_rcnn_vitdet_l_100ep.py to train on a custom dataset. Training and evaluation both work, but I can't run inference: I couldn't find a corresponding yaml configuration file, and I only have the config.yaml file saved during training.

When I use DefaultPredictor, my config.yaml has no MODEL.WEIGHTS, INPUT.MIN_SIZE_TEST, or DATASETS entries. How should I use the trained ViTDet model for inference, or where is the corresponding ViTDet configuration file for inference?
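The keys named above belong to the yacs-style config, while the config.yaml posted in this issue is a dump of a LazyConfig, where the same information sits under different names. The lookup below is a small sketch of that correspondence, written as plain Python so it runs without detectron2. The mapping is read off the dump in this issue by hand and is an assumption, not an official detectron2 table; `YACS_TO_LAZY` and `lazy_key` are hypothetical names.

```python
# yacs-style key -> dotted path of the equivalent value in the LazyConfig
# dump posted in this issue. Assumption: read off that dump by hand.
YACS_TO_LAZY = {
    "MODEL.WEIGHTS": "train.init_checkpoint",
    "INPUT.MIN_SIZE_TEST": "dataloader.test.mapper.augmentations[0].short_edge_length",
    "DATASETS.TRAIN": "dataloader.train.dataset.names",
    "DATASETS.TEST": "dataloader.test.dataset.names",
}


def lazy_key(yacs_key):
    """Return the LazyConfig path that plays the role of a yacs key."""
    return YACS_TO_LAZY[yacs_key]


for yacs_key, lazy_path in YACS_TO_LAZY.items():
    print(f"{yacs_key:22s} -> {lazy_path}")
```

This is why the evaluation command in this issue passes the weights as train.init_checkpoint=... rather than MODEL.WEIGHTS.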

The command I use for training is:

python ./tools/lazyconfig_train_net_VitDet.py --config-file=./projects/ViTDet/configs/COCO/cascade_mask_rcnn_vitdet_l_100ep.py

The command I use for evaluation is:

python ./tools/lazyconfig_train_net_VitDet.py --config-file=./projects/ViTDet/configs/COCO/cascade_mask_rcnn_vitdet_l_100ep.py --eval-only train.init_checkpoint=./output_L_lr_1e-4/model_final.pth

My environment is as follows:

The yaml file obtained in training is as follows:

dataloader:
  evaluator: {_target_: detectron2.evaluation.COCOEvaluator, dataset_name: '${..test.dataset.names}'}
  test:
    _target_: detectron2.data.build_detection_test_loader
    dataset: {_target_: detectron2.data.get_detection_dataset_dicts, filter_empty: false, names: coco_2017_val_UTDAC}
    mapper:
      _target_: detectron2.data.DatasetMapper
      augmentations:
      - {_target_: detectron2.data.transforms.ResizeShortestEdge, max_size: 1024, short_edge_length: 1024}
      image_format: ${...train.mapper.image_format}
      is_train: false
    num_workers: 1
  train:
    _target_: detectron2.data.build_detection_train_loader
    dataset: {_target_: detectron2.data.get_detection_dataset_dicts, names: coco_2017_train_UTDAC}
    mapper:
      _target_: detectron2.data.DatasetMapper
      augmentations:
      - {_target_: detectron2.data.transforms.RandomFlip, horizontal: true}
      - {_target_: detectron2.data.transforms.ResizeScale, max_scale: 2.0, min_scale: 0.1, target_height: 1024, target_width: 1024}
      - _target_: detectron2.data.transforms.FixedSizeCrop
        crop_size: [1024, 1024]
        pad: false
      image_format: RGB
      is_train: true
      recompute_boxes: true
      use_instance_mask: true
    num_workers: 1
    total_batch_size: 2
lr_multiplier:
  _target_: detectron2.solver.WarmupParamScheduler
  scheduler:
    _target_: fvcore.common.param_scheduler.MultiStepParamScheduler
    milestones: [229689, 248829]
    num_updates: 258400
    values: [1.0, 0.1, 0.01]
  warmup_factor: 0.001
  warmup_length: 0.0009674922600619195
model:
  _target_: detectron2.modeling.GeneralizedRCNN
  backbone:
    _target_: detectron2.modeling.SimpleFeaturePyramid
    in_feature: ${.net.out_feature}
    net:
      _target_: detectron2.modeling.ViT
      depth: 24
      drop_path_rate: 0.4
      embed_dim: 1024
      img_size: 1024
      mlp_ratio: 4
      norm_layer: !!python/object/apply:functools.partial
        args: [&id001 !!python/name:torch.nn.modules.normalization.LayerNorm '']
        state: !!python/tuple
        - *id001
        - !!python/tuple []
        - {eps: 1.0e-06}
        - null
      num_heads: 16
      out_feature: last_feat
      patch_size: 16
      qkv_bias: true
      residual_block_indexes: []
      use_rel_pos: true
      window_block_indexes: [0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 12, 13, 14, 15, 16, 18, 19, 20, 21, 22]
      window_size: 14
    norm: LN
    out_channels: 256
    scale_factors: [4.0, 2.0, 1.0, 0.5]
    square_pad: 1024
    top_block: {_target_: detectron2.modeling.backbone.fpn.LastLevelMaxPool}
  input_format: RGB
  pixel_mean: [123.675, 116.28, 103.53]
  pixel_std: [58.395, 57.12, 57.375]
  proposal_generator:
    _target_: detectron2.modeling.proposal_generator.RPN
    anchor_generator:
      _target_: detectron2.modeling.anchor_generator.DefaultAnchorGenerator
      aspect_ratios: [0.5, 1.0, 2.0]
      offset: 0.0
      sizes:
      - [32]
      - [64]
      - [128]
      - [256]
      - [512]
      strides: [4, 8, 16, 32, 64]
    anchor_matcher:
      _target_: detectron2.modeling.matcher.Matcher
      allow_low_quality_matches: true
      labels: [0, -1, 1]
      thresholds: [0.3, 0.7]
    batch_size_per_image: 256
    box2box_transform:
      _target_: detectron2.modeling.box_regression.Box2BoxTransform
      weights: [1.0, 1.0, 1.0, 1.0]
    head:
      _target_: detectron2.modeling.proposal_generator.StandardRPNHead
      conv_dims: [-1, -1]
      in_channels: 256
      num_anchors: 3
    in_features: [p2, p3, p4, p5, p6]
    nms_thresh: 0.7
    positive_fraction: 0.5
    post_nms_topk: [1000, 1000]
    pre_nms_topk: [2000, 1000]
  roi_heads:
    _target_: detectron2.modeling.roi_heads.CascadeROIHeads
    batch_size_per_image: 512
    box_heads:
    - _target_: detectron2.modeling.roi_heads.FastRCNNConvFCHead
      conv_dims: [256, 256, 256, 256]
      conv_norm: LN
      fc_dims: [1024]
      input_shape: !!python/object:detectron2.layers.shape_spec.ShapeSpec {channels: 256, height: 7, stride: null, width: 7}
    - _target_: detectron2.modeling.roi_heads.FastRCNNConvFCHead
      conv_dims: [256, 256, 256, 256]
      conv_norm: LN
      fc_dims: [1024]
      input_shape: !!python/object:detectron2.layers.shape_spec.ShapeSpec {channels: 256, height: 7, stride: null, width: 7}
    - _target_: detectron2.modeling.roi_heads.FastRCNNConvFCHead
      conv_dims: [256, 256, 256, 256]
      conv_norm: LN
      fc_dims: [1024]
      input_shape: !!python/object:detectron2.layers.shape_spec.ShapeSpec {channels: 256, height: 7, stride: null, width: 7}
    box_in_features: [p2, p3, p4, p5]
    box_pooler:
      _target_: detectron2.modeling.poolers.ROIPooler
      output_size: 7
      pooler_type: ROIAlignV2
      sampling_ratio: 0
      scales: [0.25, 0.125, 0.0625, 0.03125]
    box_predictors:
    - _target_: detectron2.modeling.FastRCNNOutputLayers
      box2box_transform:
        _target_: detectron2.modeling.box_regression.Box2BoxTransform
        weights: [10, 10, 5, 5]
      cls_agnostic_bbox_reg: true
      input_shape: !!python/object:detectron2.layers.shape_spec.ShapeSpec {channels: 1024, height: null, stride: null, width: null}
      num_classes: ${...num_classes}
      test_score_thresh: 0.05
    - _target_: detectron2.modeling.FastRCNNOutputLayers
      box2box_transform:
        _target_: detectron2.modeling.box_regression.Box2BoxTransform
        weights: [20, 20, 10, 10]
      cls_agnostic_bbox_reg: true
      input_shape: !!python/object:detectron2.layers.shape_spec.ShapeSpec {channels: 1024, height: null, stride: null, width: null}
      num_classes: ${...num_classes}
      test_score_thresh: 0.05
    - _target_: detectron2.modeling.FastRCNNOutputLayers
      box2box_transform:
        _target_: detectron2.modeling.box_regression.Box2BoxTransform
        weights: [30, 30, 15, 15]
      cls_agnostic_bbox_reg: true
      input_shape: !!python/object:detectron2.layers.shape_spec.ShapeSpec {channels: 1024, height: null, stride: null, width: null}
      num_classes: ${...num_classes}
      test_score_thresh: 0.05
    mask_head:
      _target_: detectron2.modeling.roi_heads.MaskRCNNConvUpsampleHead
      conv_dims: [256, 256, 256, 256, 256]
      conv_norm: LN
      input_shape: !!python/object:detectron2.layers.shape_spec.ShapeSpec {channels: 256, height: 14, stride: null, width: 14}
      num_classes: ${..num_classes}
    mask_in_features: [p2, p3, p4, p5]
    mask_pooler:
      _target_: detectron2.modeling.poolers.ROIPooler
      output_size: 14
      pooler_type: ROIAlignV2
      sampling_ratio: 0
      scales: [0.25, 0.125, 0.0625, 0.03125]
    num_classes: 80
    positive_fraction: 0.25
    proposal_matchers:
    - _target_: detectron2.modeling.matcher.Matcher
      allow_low_quality_matches: false
      labels: [0, 1]
      thresholds: [0.5]
    - _target_: detectron2.modeling.matcher.Matcher
      allow_low_quality_matches: false
      labels: [0, 1]
      thresholds: [0.6]
    - _target_: detectron2.modeling.matcher.Matcher
      allow_low_quality_matches: false
      labels: [0, 1]
      thresholds: [0.7]
optimizer:
  _target_: torch.optim.AdamW
  betas: [0.9, 0.999]
  lr: 0.0001
  params:
    _target_: detectron2.solver.get_default_optimizer_params
    base_lr: ${..lr}
    lr_factor_func: !!python/object/apply:functools.partial
      args: [&id002 !!python/name:detectron2.modeling.backbone.vit.get_vit_lr_decay_rate '']
      state: !!python/tuple
      - *id002
      - !!python/tuple []
      - {lr_decay_rate: 0.8, num_layers: 24}
      - null
    overrides:
      pos_embed: {weight_decay: 0.0}
    weight_decay_norm: 0.0
  weight_decay: 0.1
train:
  amp: {enabled: true}
  checkpointer: {max_to_keep: 100, period: 20000}
  ddp: {broadcast_buffers: false, find_unused_parameters: false, fp16_compression: true}
  device: cuda
  eval_period: 2584
  init_checkpoint: ./output_L_lr_1e-4/model_final.pth
  log_period: 10
  max_iter: 258400
  output_dir: ./output_L_lr_1e-4

You've chosen to report an unexpected problem or bug. Unless you already know the root cause of it, please include details about it by filling the issue template. The following information is missing: "Instructions To Reproduce the Issue and Full Logs";

Full logs or other relevant observations:

[11/04 17:15:33 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='./output_L_lr_1e-4/config.yaml', input=['./Data/UTDAC2020_enhance/val2017/'], opts=['MODEL.WEIGHTS', './output_L_lr_1e-4/model_final.pth'], output='./UTDAC2020_enhance', video_input=None, webcam=False)
WARNING [11/04 17:15:33 fvcore.common.config]: Loading config ./output_L_lr_1e-4/config.yaml with yaml.unsafe_load. Your machine may be at risk if the file contains malicious content.
Traceback (most recent call last):
  File "./demo/demo.py", line 100, in <module>
    cfg = setup_cfg(args)
  File "./demo/demo.py", line 29, in setup_cfg
    cfg.merge_from_file(args.config_file)
  File "/public/home/wangzheng/detectron2/detectron2/config/config.py", line 47, in merge_from_file
    loaded_cfg = type(self)(loaded_cfg)
  File "/public/home/wangzheng/.conda/envs/detectron2/lib/python3.8/site-packages/yacs/config.py", line 86, in __init__
    init_dict = self._create_config_tree_from_dict(init_dict, key_list)
  File "/public/home/wangzheng/.conda/envs/detectron2/lib/python3.8/site-packages/yacs/config.py", line 126, in _create_config_tree_from_dict
    dic[k] = cls(v, key_list=key_list + [k])
  File "/public/home/wangzheng/.conda/envs/detectron2/lib/python3.8/site-packages/yacs/config.py", line 86, in __init__
    init_dict = self._create_config_tree_from_dict(init_dict, key_list)
  File "/public/home/wangzheng/.conda/envs/detectron2/lib/python3.8/site-packages/yacs/config.py", line 126, in _create_config_tree_from_dict
    dic[k] = cls(v, key_list=key_list + [k])
  File "/public/home/wangzheng/.conda/envs/detectron2/lib/python3.8/site-packages/yacs/config.py", line 86, in __init__
    init_dict = self._create_config_tree_from_dict(init_dict, key_list)
  File "/public/home/wangzheng/.conda/envs/detectron2/lib/python3.8/site-packages/yacs/config.py", line 126, in _create_config_tree_from_dict
    dic[k] = cls(v, key_list=key_list + [k])
  File "/public/home/wangzheng/.conda/envs/detectron2/lib/python3.8/site-packages/yacs/config.py", line 86, in __init__
    init_dict = self._create_config_tree_from_dict(init_dict, key_list)
  File "/public/home/wangzheng/.conda/envs/detectron2/lib/python3.8/site-packages/yacs/config.py", line 129, in _create_config_tree_from_dict
    _assert_with_logging(
  File "/public/home/wangzheng/.conda/envs/detectron2/lib/python3.8/site-packages/yacs/config.py", line 545, in _assert_with_logging
    assert cond, msg
AssertionError: Key model.backbone.net.norm_layer with value <class 'functools.partial'> is not a valid type; valid types: {<class 'float'>, <class 'list'>, <class 'str'>, <class 'bool'>, <class 'NoneType'>, <class 'tuple'>, <class 'int'>}
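
The traceback shows demo/demo.py passing the dumped config.yaml to merge_from_file, which wraps the loaded dictionary in a yacs config node. yacs only accepts the leaf types listed in the error message, and the dump stores norm_layer as a functools.partial object, so the load fails. The sketch below imitates that recursive type check in plain Python to show why this particular key is the one reported. It is a simplified stand-in, not yacs's own code; `first_invalid_key` and the `LayerNorm` placeholder class are hypothetical names introduced for illustration.

```python
from functools import partial

# Leaf value types that the AssertionError above lists as valid.
VALID_TYPES = (float, list, str, bool, type(None), tuple, int)


def first_invalid_key(tree, prefix=""):
    """Return the dotted path of the first leaf with a rejected type, or None.

    Simplified imitation of the recursive check, not yacs's implementation.
    """
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            found = first_invalid_key(value, path)
            if found is not None:
                return found
        elif not isinstance(value, VALID_TYPES):
            return path
    return None


class LayerNorm:
    """Placeholder for torch.nn.LayerNorm so the sketch runs without torch."""


# Fragment shaped like the dumped config.yaml after yaml.unsafe_load.
dumped = {
    "model": {
        "backbone": {
            "net": {
                "depth": 24,
                "norm_layer": partial(LayerNorm, eps=1e-6),
            }
        }
    }
}

print(first_invalid_key(dumped))  # model.backbone.net.norm_layer
```

The practical consequence is that a config.yaml produced by the lazy-config training script is not a drop-in input for tools that read yacs configs; the original .py config is the file to load instead.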

Hey, I have the same question. How did you solve it?

Please help us. We need to run inference on images.

Did anyone find the answer? I'm getting the same error: AssertionError: Key model.backbone.net.norm_layer with value <class 'functools.partial'> is not a valid type

Hey, I have the same problem too. How did you solve it?

Traceback (most recent call last):
  File "demo/demo.py", line 100, in <module>
    cfg = setup_cfg(args)
  File "demo/demo.py", line 29, in setup_cfg
    cfg.merge_from_file(args.config_file)
  File "/mnt/data1/download_new/EVA/EVA-master-project/EVA-02/det/detectron2/config/config.py", line 47, in merge_from_file
    loaded_cfg = type(self)(loaded_cfg)
  File "/usr/local/lib/python3.8/dist-packages/yacs/config.py", line 86, in __init__
    init_dict = self._create_config_tree_from_dict(init_dict, key_list)
  File "/usr/local/lib/python3.8/dist-packages/yacs/config.py", line 126, in _create_config_tree_from_dict
    dic[k] = cls(v, key_list=key_list + [k])
  File "/usr/local/lib/python3.8/dist-packages/yacs/config.py", line 86, in __init__
    init_dict = self._create_config_tree_from_dict(init_dict, key_list)
  File "/usr/local/lib/python3.8/dist-packages/yacs/config.py", line 126, in _create_config_tree_from_dict
    dic[k] = cls(v, key_list=key_list + [k])
  File "/usr/local/lib/python3.8/dist-packages/yacs/config.py", line 86, in __init__
    init_dict = self._create_config_tree_from_dict(init_dict, key_list)
  File "/usr/local/lib/python3.8/dist-packages/yacs/config.py", line 126, in _create_config_tree_from_dict
    dic[k] = cls(v, key_list=key_list + [k])
  File "/usr/local/lib/python3.8/dist-packages/yacs/config.py", line 86, in __init__
    init_dict = self._create_config_tree_from_dict(init_dict, key_list)
  File "/usr/local/lib/python3.8/dist-packages/yacs/config.py", line 129, in _create_config_tree_from_dict
    _assert_with_logging(
  File "/usr/local/lib/python3.8/dist-packages/yacs/config.py", line 545, in _assert_with_logging
    assert cond, msg
AssertionError: Key model.backbone.net.norm_layer with value <class 'functools.partial'> is not a valid type; valid types: {<class 'int'>, <class 'list'>, <class 'NoneType'>, <class 'str'>, <class 'tuple'>, <class 'float'>, <class 'bool'>}

I also trained with lazyconfig_train_net.py and got my model_final.pth, but I don't know how to use it to predict on an image and display the bounding boxes and segmentation masks. I am not sure whether this is because the configuration file is in .py format instead of yaml format. Loading the configuration file with "cfg = LazyConfig.load(config_path)" seems to be problematic.

For example, I can use the following script to make predictions on a test picture:

import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data import MetadataCatalog
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
predictor = DefaultPredictor(cfg)
image = cv2.imread("test.jpg")  # placeholder path; cv2 loads images as BGR
outputs = predictor(image)
# Visualizer expects RGB, so the channel order is flipped going in and coming out.
visualizer = Visualizer(image[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = visualizer.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2.imshow("Detection Results", out.get_image()[:, :, ::-1])

However, I don't know how to load the configuration file with LazyConfig to achieve the same functionality.
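
One possible way to get the same functionality from a lazy config is sketched below. It loads the .py config with LazyConfig.load, builds the model with instantiate, loads the trained weights with DetectionCheckpointer, and feeds the model a resized RGB tensor. This is a minimal sketch under several assumptions, not a confirmed fix from the maintainers: the .py config must carry the same overrides used in training (for example the number of classes), the dataset named in cfg.dataloader.test.dataset.names must already be registered so its metadata is available, the 1024 resize values mirror the test augmentation in the config dump in this issue, and the 0.5 score threshold mirrors the yacs script above. The function names and the file names `test.jpg` and `result.jpg` are hypothetical.

```python
def load_model(config_path, weights_path):
    """Build the model from a LazyConfig .py file and load trained weights."""
    from detectron2.checkpoint import DetectionCheckpointer
    from detectron2.config import LazyConfig, instantiate

    cfg = LazyConfig.load(config_path)
    model = instantiate(cfg.model)
    model.to(cfg.train.device)
    DetectionCheckpointer(model).load(weights_path)
    model.eval()
    return cfg, model


def predict(model, image_path, short_edge=1024, max_size=1024):
    """Run one image through the model; returns the RGB image and raw outputs."""
    import torch
    import detectron2.data.transforms as T
    from detectron2.data.detection_utils import read_image

    image = read_image(image_path, format="RGB")  # HWC uint8, RGB as in the config
    height, width = image.shape[:2]
    aug = T.ResizeShortestEdge(short_edge, max_size)
    resized = aug.get_transform(image).apply_image(image)
    tensor = torch.as_tensor(resized.astype("float32").transpose(2, 0, 1))
    with torch.no_grad():
        # height/width tell the model to rescale predictions to the original size.
        outputs = model([{"image": tensor, "height": height, "width": width}])[0]
    return image, outputs


def draw(image_rgb, outputs, dataset_name, out_path, score_thresh=0.5):
    """Draw boxes and masks above score_thresh and save the result to out_path."""
    from detectron2.data import MetadataCatalog
    from detectron2.utils.visualizer import Visualizer

    instances = outputs["instances"].to("cpu")
    instances = instances[instances.scores > score_thresh]
    visualizer = Visualizer(image_rgb, MetadataCatalog.get(dataset_name), scale=1.2)
    visualizer.draw_instance_predictions(instances).save(out_path)


def main():
    # Config and checkpoint paths are the ones quoted earlier in this issue.
    cfg, model = load_model(
        "./projects/ViTDet/configs/COCO/cascade_mask_rcnn_vitdet_l_100ep.py",
        "./output_L_lr_1e-4/model_final.pth",
    )
    image, outputs = predict(model, "test.jpg")  # hypothetical input image
    draw(image, outputs, cfg.dataloader.test.dataset.names, "result.jpg")
```

Calling main() after adjusting the paths should write an annotated copy of the image. The detectron2 imports sit inside the functions so the file can be imported on a machine without the full GPU stack.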

facebookresearch/detectron2, issue #4640