How to use the trained VitDet model for inference and visualize the inference results? #4640

Open ahxiaofengzheng opened 1 year ago

ahxiaofengzheng commented 1 year ago

I used cascade_mask_rcnn_vitdet_l_100ep.py to train on a custom dataset. Training and evaluation both run normally, but I can't run inference: I couldn't find a corresponding yaml configuration file, and I only have the config.yaml file saved during training.

When I use DefaultPredictor, my config.yaml has no MODEL.WEIGHTS, INPUT.MIN_SIZE_TEST, or DATASETS keys. How should I use the trained ViTDet model for inference, or where is the corresponding ViTDet configuration file for inference?

The command I use for training is:

python ./tools/lazyconfig_train_net_VitDet.py --config-file=./projects/ViTDet/configs/COCO/cascade_mask_rcnn_vitdet_l_100ep.py

The command I use for evaluation is:

python ./tools/lazyconfig_train_net_VitDet.py --config-file=./projects/ViTDet/configs/COCO/cascade_mask_rcnn_vitdet_l_100ep.py --eval-only train.init_checkpoint=./output_L_lr_1e-4/model_final.pth

My environment is as follows: [image] [image] [image]

The yaml file obtained in training is as follows:

dataloader:
  evaluator: {_target_: detectron2.evaluation.COCOEvaluator, dataset_name: '${..test.dataset.names}'}
  test:
    _target_: detectron2.data.build_detection_test_loader
    dataset: {_target_: detectron2.data.get_detection_dataset_dicts, filter_empty: false, names: coco_2017_val_UTDAC}
    mapper:
      _target_: detectron2.data.DatasetMapper
      augmentations:
      - {_target_: detectron2.data.transforms.ResizeShortestEdge, max_size: 1024, short_edge_length: 1024}
      image_format: ${...train.mapper.image_format}
      is_train: false
    num_workers: 1
  train:
    _target_: detectron2.data.build_detection_train_loader
    dataset: {_target_: detectron2.data.get_detection_dataset_dicts, names: coco_2017_train_UTDAC}
    mapper:
      _target_: detectron2.data.DatasetMapper
      augmentations:
      - {_target_: detectron2.data.transforms.RandomFlip, horizontal: true}
      - {_target_: detectron2.data.transforms.ResizeScale, max_scale: 2.0, min_scale: 0.1, target_height: 1024, target_width: 1024}
      - _target_: detectron2.data.transforms.FixedSizeCrop
        crop_size: [1024, 1024]
        pad: false
      image_format: RGB
      is_train: true
      recompute_boxes: true
      use_instance_mask: true
    num_workers: 1
    total_batch_size: 2
lr_multiplier:
  _target_: detectron2.solver.WarmupParamScheduler
  scheduler:
    _target_: fvcore.common.param_scheduler.MultiStepParamScheduler
    milestones: [229689, 248829]
    num_updates: 258400
    values: [1.0, 0.1, 0.01]
  warmup_factor: 0.001
  warmup_length: 0.0009674922600619195
model:
  _target_: detectron2.modeling.GeneralizedRCNN
  backbone:
    _target_: detectron2.modeling.SimpleFeaturePyramid
    in_feature: ${.net.out_feature}
    net:
      _target_: detectron2.modeling.ViT
      depth: 24
      drop_path_rate: 0.4
      embed_dim: 1024
      img_size: 1024
      mlp_ratio: 4
      norm_layer: !!python/object/apply:functools.partial
        args: [&id001 !!python/name:torch.nn.modules.normalization.LayerNorm '']
        state: !!python/tuple
        - *id001
        - !!python/tuple []
        - {eps: 1.0e-06}
        - null
      num_heads: 16
      out_feature: last_feat
      patch_size: 16
      qkv_bias: true
      residual_block_indexes: []
      use_rel_pos: true
      window_block_indexes: [0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 12, 13, 14, 15, 16, 18, 19, 20, 21, 22]
      window_size: 14
    norm: LN
    out_channels: 256
    scale_factors: [4.0, 2.0, 1.0, 0.5]
    square_pad: 1024
    top_block: {_target_: detectron2.modeling.backbone.fpn.LastLevelMaxPool}
  input_format: RGB
  pixel_mean: [123.675, 116.28, 103.53]
  pixel_std: [58.395, 57.12, 57.375]
  proposal_generator:
    _target_: detectron2.modeling.proposal_generator.RPN
    anchor_generator:
      _target_: detectron2.modeling.anchor_generator.DefaultAnchorGenerator
      aspect_ratios: [0.5, 1.0, 2.0]
      offset: 0.0
      sizes:
      - [32]
      - [64]
      - [128]
      - [256]
      - [512]
      strides: [4, 8, 16, 32, 64]
    anchor_matcher:
      _target_: detectron2.modeling.matcher.Matcher
      allow_low_quality_matches: true
      labels: [0, -1, 1]
      thresholds: [0.3, 0.7]
    batch_size_per_image: 256
    box2box_transform:
      _target_: detectron2.modeling.box_regression.Box2BoxTransform
      weights: [1.0, 1.0, 1.0, 1.0]
    head:
      _target_: detectron2.modeling.proposal_generator.StandardRPNHead
      conv_dims: [-1, -1]
      in_channels: 256
      num_anchors: 3
    in_features: [p2, p3, p4, p5, p6]
    nms_thresh: 0.7
    positive_fraction: 0.5
    post_nms_topk: [1000, 1000]
    pre_nms_topk: [2000, 1000]
  roi_heads:
    _target_: detectron2.modeling.roi_heads.CascadeROIHeads
    batch_size_per_image: 512
    box_heads:
    - _target_: detectron2.modeling.roi_heads.FastRCNNConvFCHead
      conv_dims: [256, 256, 256, 256]
      conv_norm: LN
      fc_dims: [1024]
      input_shape: !!python/object:detectron2.layers.shape_spec.ShapeSpec {channels: 256, height: 7, stride: null, width: 7}
    - _target_: detectron2.modeling.roi_heads.FastRCNNConvFCHead
      conv_dims: [256, 256, 256, 256]
      conv_norm: LN
      fc_dims: [1024]
      input_shape: !!python/object:detectron2.layers.shape_spec.ShapeSpec {channels: 256, height: 7, stride: null, width: 7}
    - _target_: detectron2.modeling.roi_heads.FastRCNNConvFCHead
      conv_dims: [256, 256, 256, 256]
      conv_norm: LN
      fc_dims: [1024]
      input_shape: !!python/object:detectron2.layers.shape_spec.ShapeSpec {channels: 256, height: 7, stride: null, width: 7}
    box_in_features: [p2, p3, p4, p5]
    box_pooler:
      _target_: detectron2.modeling.poolers.ROIPooler
      output_size: 7
      pooler_type: ROIAlignV2
      sampling_ratio: 0
      scales: [0.25, 0.125, 0.0625, 0.03125]
    box_predictors:
    - _target_: detectron2.modeling.FastRCNNOutputLayers
      box2box_transform:
        _target_: detectron2.modeling.box_regression.Box2BoxTransform
        weights: [10, 10, 5, 5]
      cls_agnostic_bbox_reg: true
      input_shape: !!python/object:detectron2.layers.shape_spec.ShapeSpec {channels: 1024, height: null, stride: null, width: null}
      num_classes: ${...num_classes}
      test_score_thresh: 0.05
    - _target_: detectron2.modeling.FastRCNNOutputLayers
      box2box_transform:
        _target_: detectron2.modeling.box_regression.Box2BoxTransform
        weights: [20, 20, 10, 10]
      cls_agnostic_bbox_reg: true
      input_shape: !!python/object:detectron2.layers.shape_spec.ShapeSpec {channels: 1024, height: null, stride: null, width: null}
      num_classes: ${...num_classes}
      test_score_thresh: 0.05
    - _target_: detectron2.modeling.FastRCNNOutputLayers
      box2box_transform:
        _target_: detectron2.modeling.box_regression.Box2BoxTransform
        weights: [30, 30, 15, 15]
      cls_agnostic_bbox_reg: true
      input_shape: !!python/object:detectron2.layers.shape_spec.ShapeSpec {channels: 1024, height: null, stride: null, width: null}
      num_classes: ${...num_classes}
      test_score_thresh: 0.05
    mask_head:
      _target_: detectron2.modeling.roi_heads.MaskRCNNConvUpsampleHead
      conv_dims: [256, 256, 256, 256, 256]
      conv_norm: LN
      input_shape: !!python/object:detectron2.layers.shape_spec.ShapeSpec {channels: 256, height: 14, stride: null, width: 14}
      num_classes: ${..num_classes}
    mask_in_features: [p2, p3, p4, p5]
    mask_pooler:
      _target_: detectron2.modeling.poolers.ROIPooler
      output_size: 14
      pooler_type: ROIAlignV2
      sampling_ratio: 0
      scales: [0.25, 0.125, 0.0625, 0.03125]
    num_classes: 80
    positive_fraction: 0.25
    proposal_matchers:
    - _target_: detectron2.modeling.matcher.Matcher
      allow_low_quality_matches: false
      labels: [0, 1]
      thresholds: [0.5]
    - _target_: detectron2.modeling.matcher.Matcher
      allow_low_quality_matches: false
      labels: [0, 1]
      thresholds: [0.6]
    - _target_: detectron2.modeling.matcher.Matcher
      allow_low_quality_matches: false
      labels: [0, 1]
      thresholds: [0.7]
optimizer:
  _target_: torch.optim.AdamW
  betas: [0.9, 0.999]
  lr: 0.0001
  params:
    _target_: detectron2.solver.get_default_optimizer_params
    base_lr: ${..lr}
    lr_factor_func: !!python/object/apply:functools.partial
      args: [&id002 !!python/name:detectron2.modeling.backbone.vit.get_vit_lr_decay_rate '']
      state: !!python/tuple
      - *id002
      - !!python/tuple []
      - {lr_decay_rate: 0.8, num_layers: 24}
      - null
    overrides:
      pos_embed: {weight_decay: 0.0}
    weight_decay_norm: 0.0
  weight_decay: 0.1
train:
  amp: {enabled: true}
  checkpointer: {max_to_keep: 100, period: 20000}
  ddp: {broadcast_buffers: false, find_unused_parameters: false, fp16_compression: true}
  device: cuda
  eval_period: 2584
  init_checkpoint: ./output_L_lr_1e-4/model_final.pth
  log_period: 10
  max_iter: 258400
  output_dir: ./output_L_lr_1e-4
ahxiaofengzheng commented 1 year ago

Full logs or other relevant observations:

[11/04 17:15:33 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='./output_L_lr_1e-4/config.yaml', input=['./Data/UTDAC2020_enhance/val2017/'], opts=['MODEL.WEIGHTS', './output_L_lr_1e-4/model_final.pth'], output='./UTDAC2020_enhance', video_input=None, webcam=False)
WARNING [11/04 17:15:33 fvcore.common.config]: Loading config ./output_L_lr_1e-4/config.yaml with yaml.unsafe_load. Your machine may be at risk if the file contains malicious content.
Traceback (most recent call last):
  File "./demo/demo.py", line 100, in <module>
    cfg = setup_cfg(args)
  File "./demo/demo.py", line 29, in setup_cfg
    cfg.merge_from_file(args.config_file)
  File "/public/home/wangzheng/detectron2/detectron2/config/config.py", line 47, in merge_from_file
    loaded_cfg = type(self)(loaded_cfg)
  File "/public/home/wangzheng/.conda/envs/detectron2/lib/python3.8/site-packages/yacs/config.py", line 86, in __init__
    init_dict = self._create_config_tree_from_dict(init_dict, key_list)
  File "/public/home/wangzheng/.conda/envs/detectron2/lib/python3.8/site-packages/yacs/config.py", line 126, in _create_config_tree_from_dict
    dic[k] = cls(v, key_list=key_list + [k])
  File "/public/home/wangzheng/.conda/envs/detectron2/lib/python3.8/site-packages/yacs/config.py", line 86, in __init__
    init_dict = self._create_config_tree_from_dict(init_dict, key_list)
  File "/public/home/wangzheng/.conda/envs/detectron2/lib/python3.8/site-packages/yacs/config.py", line 126, in _create_config_tree_from_dict
    dic[k] = cls(v, key_list=key_list + [k])
  File "/public/home/wangzheng/.conda/envs/detectron2/lib/python3.8/site-packages/yacs/config.py", line 86, in __init__
    init_dict = self._create_config_tree_from_dict(init_dict, key_list)
  File "/public/home/wangzheng/.conda/envs/detectron2/lib/python3.8/site-packages/yacs/config.py", line 126, in _create_config_tree_from_dict
    dic[k] = cls(v, key_list=key_list + [k])
  File "/public/home/wangzheng/.conda/envs/detectron2/lib/python3.8/site-packages/yacs/config.py", line 86, in __init__
    init_dict = self._create_config_tree_from_dict(init_dict, key_list)
  File "/public/home/wangzheng/.conda/envs/detectron2/lib/python3.8/site-packages/yacs/config.py", line 129, in _create_config_tree_from_dict
    _assert_with_logging(
  File "/public/home/wangzheng/.conda/envs/detectron2/lib/python3.8/site-packages/yacs/config.py", line 545, in _assert_with_logging
    assert cond, msg
AssertionError: Key model.backbone.net.norm_layer with value <class 'functools.partial'> is not a valid type; valid types: {<class 'float'>, <class 'list'>, <class 'str'>, <class 'bool'>, <class 'NoneType'>, <class 'tuple'>, <class 'int'>}
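
The traceback shows demo.py passing the saved config.yaml to a yacs config via merge_from_file. yacs accepts only the leaf types listed in the error message, while the dumped LazyConfig contains a functools.partial object at model.backbone.net.norm_layer. The sketch below is a simplified stand-in for that type check, not yacs's actual code. `find_invalid_leaves` and the `LayerNorm` placeholder class are hypothetical names used only for illustration.

```python
import functools

# Leaf types copied from the "valid types" set in the AssertionError above.
VALID_TYPES = (float, list, str, bool, type(None), tuple, int)


def find_invalid_leaves(tree, prefix=""):
    """Return the dotted keys whose values a yacs-style check would reject."""
    bad = []
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Nested dicts become nested config nodes, so recurse into them.
            bad.extend(find_invalid_leaves(value, path))
        elif not isinstance(value, VALID_TYPES):
            bad.append(path)
    return bad


class LayerNorm:
    """Hypothetical placeholder for torch.nn.LayerNorm (keeps the sketch torch-free)."""


# A tiny excerpt shaped like the dumped config.yaml after yaml.unsafe_load.
loaded = {
    "model": {
        "backbone": {
            "net": {
                "depth": 24,
                "norm_layer": functools.partial(LayerNorm, eps=1e-6),
            }
        }
    }
}

print(find_invalid_leaves(loaded))  # ['model.backbone.net.norm_layer']
```

The same key is reported as in the traceback, which suggests the config.yaml written by the lazy-config training script is not meant to be read through get_cfg() and merge_from_file.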

QuanLNTU commented 1 year ago

Hey, I have the same question. How did you solve it?

mohamedettebayo commented 1 year ago

Hey, I have the same question. How did you solve it?

Please help us; we need to perform inference on images.

CA4GitHub commented 8 months ago

Has anyone found the answer? I'm getting the same error: AssertionError: Key model.backbone.net.norm_layer with value <class 'functools.partial'> is not a valid type

xinlin-xiao commented 7 months ago

Traceback (most recent call last):
  File "demo/demo.py", line 100, in <module>
    cfg = setup_cfg(args)
  File "demo/demo.py", line 29, in setup_cfg
    cfg.merge_from_file(args.config_file)
  File "/mnt/data1/download_new/EVA/EVA-master-project/EVA-02/det/detectron2/config/config.py", line 47, in merge_from_file
    loaded_cfg = type(self)(loaded_cfg)
  File "/usr/local/lib/python3.8/dist-packages/yacs/config.py", line 86, in __init__
    init_dict = self._create_config_tree_from_dict(init_dict, key_list)
  File "/usr/local/lib/python3.8/dist-packages/yacs/config.py", line 126, in _create_config_tree_from_dict
    dic[k] = cls(v, key_list=key_list + [k])
  File "/usr/local/lib/python3.8/dist-packages/yacs/config.py", line 86, in __init__
    init_dict = self._create_config_tree_from_dict(init_dict, key_list)
  File "/usr/local/lib/python3.8/dist-packages/yacs/config.py", line 126, in _create_config_tree_from_dict
    dic[k] = cls(v, key_list=key_list + [k])
  File "/usr/local/lib/python3.8/dist-packages/yacs/config.py", line 86, in __init__
    init_dict = self._create_config_tree_from_dict(init_dict, key_list)
  File "/usr/local/lib/python3.8/dist-packages/yacs/config.py", line 126, in _create_config_tree_from_dict
    dic[k] = cls(v, key_list=key_list + [k])
  File "/usr/local/lib/python3.8/dist-packages/yacs/config.py", line 86, in __init__
    init_dict = self._create_config_tree_from_dict(init_dict, key_list)
  File "/usr/local/lib/python3.8/dist-packages/yacs/config.py", line 129, in _create_config_tree_from_dict
    _assert_with_logging(
  File "/usr/local/lib/python3.8/dist-packages/yacs/config.py", line 545, in _assert_with_logging
    assert cond, msg
AssertionError: Key model.backbone.net.norm_layer with value <class 'functools.partial'> is not a valid type; valid types: {<class 'int'>, <class 'list'>, <class 'NoneType'>, <class 'str'>, <class 'tuple'>, <class 'float'>, <class 'bool'>}

dayong233 commented 2 months ago

I also trained with lazyconfig_train_net.py to get my model_final.pth, but I don't know how to use it to predict on an image and display the bounding boxes and segmentation masks. I am not sure whether this is because the configuration file is in py format instead of yaml format. Loading the configuration file with "cfg = LazyConfig.load(config_path)" does not seem to work with the usual prediction script.

For example, I can use the following script to make predictions on a test image:

import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data import MetadataCatalog
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer

# `image` is a BGR array, e.g. one loaded with cv2.imread
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)
visualizer = Visualizer(image[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = visualizer.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2.imshow("Detection Results", out.get_image()[:, :, ::-1])
cv2.waitKey(0)  # keep the window open until a key is pressed

However, I don't know how to load the configuration file with LazyConfig to achieve the same functionality.
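
One possible way to get the same functionality with a lazy config is to build the model directly, instead of going through get_cfg() and DefaultPredictor. The sketch below has not been tested against the setups in this thread. It uses LazyConfig.load, instantiate, DetectionCheckpointer, ResizeShortestEdge and Visualizer from detectron2, and mirrors the preprocessing that DefaultPredictor performs. The function names `load_lazy_model`, `predict` and `draw` are hypothetical. The 1024 resize values and the RGB input format are assumptions taken from the config.yaml posted above.

```python
def load_lazy_model(config_path, weights_path, device="cuda"):
    """Build the model described by a LazyConfig file and load trained weights."""
    # Imports are deferred so the file can be imported without detectron2 installed.
    from detectron2.checkpoint import DetectionCheckpointer
    from detectron2.config import LazyConfig, instantiate

    cfg = LazyConfig.load(config_path)  # e.g. the .py config used for training
    model = instantiate(cfg.model)      # lazy configs have no MODEL.WEIGHTS key
    model.to(device)
    model.eval()
    DetectionCheckpointer(model).load(weights_path)
    return cfg, model


def predict(model, image_bgr, short_edge=1024, max_size=1024):
    """Run one BGR image (as read by cv2.imread) through the model."""
    import torch
    from detectron2.data import transforms as T

    image_rgb = image_bgr[:, :, ::-1]  # assumption: the model expects RGB input
    height, width = image_rgb.shape[:2]
    # Assumed to match the test-time resize listed under dataloader.test.mapper.
    aug = T.ResizeShortestEdge(short_edge_length=short_edge, max_size=max_size)
    resized = aug.get_transform(image_rgb).apply_image(image_rgb)
    tensor = torch.as_tensor(resized.astype("float32").transpose(2, 0, 1))
    with torch.no_grad():
        outputs = model([{"image": tensor, "height": height, "width": width}])
    return outputs[0]["instances"].to("cpu")


def draw(image_bgr, instances, dataset_name):
    """Return a BGR image with the predicted boxes and masks drawn on it."""
    from detectron2.data import MetadataCatalog
    from detectron2.utils.visualizer import Visualizer

    metadata = MetadataCatalog.get(dataset_name)
    vis = Visualizer(image_bgr[:, :, ::-1], metadata, scale=1.2)
    return vis.draw_instance_predictions(instances).get_image()[:, :, ::-1]
```

With the paths from the first post, usage would look like `cfg, model = load_lazy_model("./projects/ViTDet/configs/COCO/cascade_mask_rcnn_vitdet_l_100ep.py", "./output_L_lr_1e-4/model_final.pth")`, then `instances = predict(model, cv2.imread(path))`, then `cv2.imwrite("out.jpg", draw(image, instances, "coco_2017_val_UTDAC"))`. Any edits made to the config at training time (for example the number of classes) would need to be applied to `cfg` before `instantiate`. Low-scoring detections can be dropped afterwards with `instances[instances.scores > 0.5]`.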