tensorflow / models

Models and examples built with TensorFlow

DeepMAC ValueError: Tensor's shape (3, 3, 256, 256) is not compatible with supplied shape (1, 1, 256, 6) #9947

Closed anshkumar closed 3 years ago

anshkumar commented 3 years ago

I'm training on a custom dataset using DeepMAC. The config is as follows:

# DeepMAC meta architecture from the "The surprising impact of mask-head
# architecture on novel class segmentation" [1] paper with an Hourglass-100[2]
# mask head. This config is trained on all COCO classes and achieves a
# mask mAP of 39.4% on the COCO testdev-2017 set.
# [1]: https://arxiv.org/abs/2104.00613
# [2]: https://arxiv.org/abs/1904.07850

# Train on TPU-128

model {
  center_net {
    num_classes: 6
    feature_extractor {
      type: "hourglass_104"
      bgr_ordering: true
      channel_means: [104.01362025, 114.03422265, 119.9165958 ]
      channel_stds: [73.6027665 , 69.89082075, 70.9150767 ]
    }
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 1024
        max_dimension: 1024
        pad_to_max_dimension: true
      }
    }
    object_detection_task {
      task_loss_weight: 1.0
      offset_loss_weight: 1.0
      scale_loss_weight: 0.1
      localization_loss {
        l1_localization_loss {
        }
      }
    }
    object_center_params {
      object_center_loss_weight: 1.0
      min_box_overlap_iou: 0.7
      max_box_predictions: 2000
      classification_loss {
        penalty_reduced_logistic_focal_loss {
          alpha: 2.0
          beta: 4.0
        }
      }
    }

    deepmac_mask_estimation {
      dim: 32
      task_loss_weight: 5.0
      pixel_embedding_dim: 16
      mask_size: 32
      use_xy: true
      use_instance_embedding: true
      network_type: "hourglass100"
      classification_loss {
        weighted_sigmoid {}
      }
    }
  }
}

train_config: {

  batch_size: 2
  num_steps: 50000

  data_augmentation_options {
    random_horizontal_flip {
    }
  }

  data_augmentation_options {
    random_adjust_hue {
    }
  }

  data_augmentation_options {
    random_adjust_contrast {
    }
  }

  data_augmentation_options {
    random_adjust_saturation {
    }
  }

  data_augmentation_options {
    random_adjust_brightness {
    }
  }

  data_augmentation_options {
    random_square_crop_by_scale {
      scale_min: 0.6
      scale_max: 1.3
    }
  }

  optimizer {
    adam_optimizer: {
      epsilon: 1e-7  # Match tf.keras.optimizers.Adam's default.
      learning_rate: {
        cosine_decay_learning_rate {
          learning_rate_base: 1e-3
          total_steps: 50000
          warmup_learning_rate: 2.5e-4
          warmup_steps: 5000
        }
      }
    }
    use_moving_average: false
  }
  max_number_of_boxes: 2000
  unpad_groundtruth_tensors: false

  #fine_tune_checkpoint_version: V2
  #fine_tune_checkpoint: "/home/deploy/ved/deepmac_1024x1024_coco17/checkpoint/ckpt-0"
  #fine_tune_checkpoint_type: "detection"
}

train_input_reader: {
  load_instance_masks: true
  label_map_path: "/home/deploy/ved/label_map_rice_l1.pbtxt"
  mask_type: PNG_MASKS
  tf_record_input_reader {
    input_path: "/home/deploy/ved/rice_l1.record"
  }
}

eval_config: {
  metrics_set: "coco_detection_metrics"
  metrics_set: "coco_mask_metrics"
  include_metrics_per_category: true
  use_moving_averages: false
  batch_size: 1;
}

eval_input_reader: {
  load_instance_masks: true
  mask_type: PNG_MASKS
  label_map_path: "/home/deploy/ved/label_map_rice_l1.pbtxt"
  shuffle: false
  num_epochs: 1
  tf_record_input_reader {
    input_path: "/home/deploy/ved/rice_l1_val.record"
  }
}

But I'm getting the following error:

INFO:tensorflow:Error reported to Coordinator: in user code:

    /home/deploy/models/research/object_detection/model_lib_v2.py:616 train_step_fn  *
        loss = eager_train_step(
    /home/deploy/models/research/object_detection/model_lib_v2.py:289 eager_train_step  *
        losses_dict, _ = _compute_losses_and_predictions_dicts(
    /home/deploy/models/research/object_detection/model_lib_v2.py:118 _compute_losses_and_predictions_dicts  *
        prediction_dict = model.predict(
    /home/deploy/models/research/object_detection/meta_architectures/center_net_meta_arch.py:3294 predict  *
        predictions[head_name] = [
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:1012 __call__  **
        outputs = call_fn(inputs, *args, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/engine/sequential.py:389 call
        outputs = layer(inputs, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:1008 __call__
        self._maybe_build(inputs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:2710 _maybe_build
        self.build(input_shapes)  # pylint:disable=not-callable
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/layers/convolutional.py:205 build
        dtype=self.dtype)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:639 add_weight
        caching_device=caching_device)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py:810 _add_variable_with_custom_getter
        **kwargs_for_getter)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer_utils.py:142 make_variable
        shape=variable_shape if variable_shape else None)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:260 __call__
        return cls._variable_v1_call(*args, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:221 _variable_v1_call
        shape=shape)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:67 getter
        return captured_getter(captured_previous, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/shared_variable_creator.py:69 create_new_variable
        v = next_creator(**kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:67 getter
        return captured_getter(captured_previous, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/distribute_lib.py:2083 creator_with_resource_vars
        created = self._create_variable(next_creator, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/mirrored_strategy.py:489 _create_variable
        distribute_utils.VARIABLE_POLICY_MAPPING, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/distribute_utils.py:311 create_mirrored_variable
        value_list = real_mirrored_creator(**kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/mirrored_strategy.py:481 _real_mirrored_creator
        v = next_creator(**kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:67 getter
        return captured_getter(captured_previous, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py:714 variable_capturing_scope
        lifted_initializer_graph=lifted_initializer_graph, **kwds)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:264 __call__
        return super(VariableMetaclass, cls).__call__(*args, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py:227 __init__
        initial_value = initial_value()
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py:82 __call__
        self._checkpoint_position, shape, shard_info=shard_info)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py:117 __init__
        self.wrapped_value.set_shape(shape)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py:1217 set_shape
        (self.shape, shape))

    ValueError: Tensor's shape (3, 3, 256, 256) is not compatible with supplied shape (1, 1, 256, 6)
Traceback (most recent call last):
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/coordinator.py", line 297, in stop_on_exception
    yield
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/mirrored_run.py", line 323, in run
    self.main_result = self.main_fn(*self.main_args, **self.main_kwargs)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/autograph/impl/api.py", line 670, in wrapper
    raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:

    /home/deploy/models/research/object_detection/model_lib_v2.py:616 train_step_fn  *
        loss = eager_train_step(
    /home/deploy/models/research/object_detection/model_lib_v2.py:289 eager_train_step  *
        losses_dict, _ = _compute_losses_and_predictions_dicts(
    /home/deploy/models/research/object_detection/model_lib_v2.py:118 _compute_losses_and_predictions_dicts  *
        prediction_dict = model.predict(
    /home/deploy/models/research/object_detection/meta_architectures/center_net_meta_arch.py:3294 predict  *
        predictions[head_name] = [
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:1012 __call__  **
        outputs = call_fn(inputs, *args, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/engine/sequential.py:389 call
        outputs = layer(inputs, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:1008 __call__
        self._maybe_build(inputs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:2710 _maybe_build
        self.build(input_shapes)  # pylint:disable=not-callable
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/layers/convolutional.py:205 build
        dtype=self.dtype)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:639 add_weight
        caching_device=caching_device)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py:810 _add_variable_with_custom_getter
        **kwargs_for_getter)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer_utils.py:142 make_variable
        shape=variable_shape if variable_shape else None)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:260 __call__
        return cls._variable_v1_call(*args, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:221 _variable_v1_call
        shape=shape)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:67 getter
        return captured_getter(captured_previous, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/shared_variable_creator.py:69 create_new_variable
        v = next_creator(**kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:67 getter
        return captured_getter(captured_previous, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/distribute_lib.py:2083 creator_with_resource_vars
        created = self._create_variable(next_creator, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/mirrored_strategy.py:489 _create_variable
        distribute_utils.VARIABLE_POLICY_MAPPING, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/distribute_utils.py:311 create_mirrored_variable
        value_list = real_mirrored_creator(**kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/mirrored_strategy.py:481 _real_mirrored_creator
        v = next_creator(**kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:67 getter
        return captured_getter(captured_previous, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py:714 variable_capturing_scope
        lifted_initializer_graph=lifted_initializer_graph, **kwds)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:264 __call__
        return super(VariableMetaclass, cls).__call__(*args, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py:227 __init__
        initial_value = initial_value()
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py:82 __call__
        self._checkpoint_position, shape, shard_info=shard_info)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py:117 __init__
        self.wrapped_value.set_shape(shape)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py:1217 set_shape
        (self.shape, shape))

    ValueError: Tensor's shape (3, 3, 256, 256) is not compatible with supplied shape (1, 1, 256, 6)

Traceback (most recent call last):
  File "/home/deploy/models/research/object_detection/model_main_tf2.py", line 113, in <module>
    tf.compat.v1.app.run()
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/home/deploy/models/research/object_detection/model_main_tf2.py", line 110, in main
    record_summaries=FLAGS.record_summaries)
  File "/home/deploy/models/research/object_detection/model_lib_v2.py", line 667, in train_loop
    loss = _dist_train_step(train_input_iter)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 828, in __call__
    result = self._call(*args, **kwds)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 871, in _call
    self._initialize(args, kwds, add_initializers_to=initializers)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 726, in _initialize
    *args, **kwds))
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2969, in _get_concrete_function_internal_garbage_collected
    graph_function, _ = self._maybe_define_function(args, kwargs)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 3361, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 3206, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 990, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 634, in wrapped_fn
    out = weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 977, in wrapper
    raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:

    /home/deploy/models/research/object_detection/model_lib_v2.py:653 _dist_train_step  *
        return _sample_and_train(strategy, train_step_fn, data_iterator)
    /home/deploy/models/research/object_detection/model_lib_v2.py:633 _sample_and_train  *
        per_replica_losses = strategy.run(
    /home/deploy/models/research/object_detection/model_lib_v2.py:616 train_step_fn  *
        loss = eager_train_step(
    /home/deploy/models/research/object_detection/model_lib_v2.py:289 eager_train_step  *
        losses_dict, _ = _compute_losses_and_predictions_dicts(
    /home/deploy/models/research/object_detection/model_lib_v2.py:118 _compute_losses_and_predictions_dicts  *
        prediction_dict = model.predict(
    /home/deploy/models/research/object_detection/meta_architectures/center_net_meta_arch.py:3294 predict  *
        predictions[head_name] = [
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:1012 __call__  **
        outputs = call_fn(inputs, *args, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/engine/sequential.py:389 call
        outputs = layer(inputs, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:1008 __call__
        self._maybe_build(inputs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:2710 _maybe_build
        self.build(input_shapes)  # pylint:disable=not-callable
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/layers/convolutional.py:205 build
        dtype=self.dtype)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:639 add_weight
        caching_device=caching_device)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py:810 _add_variable_with_custom_getter
        **kwargs_for_getter)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer_utils.py:142 make_variable
        shape=variable_shape if variable_shape else None)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:260 __call__
        return cls._variable_v1_call(*args, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:221 _variable_v1_call
        shape=shape)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:67 getter
        return captured_getter(captured_previous, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/shared_variable_creator.py:69 create_new_variable
        v = next_creator(**kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:67 getter
        return captured_getter(captured_previous, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/distribute_lib.py:2083 creator_with_resource_vars
        created = self._create_variable(next_creator, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/mirrored_strategy.py:489 _create_variable
        distribute_utils.VARIABLE_POLICY_MAPPING, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/distribute_utils.py:311 create_mirrored_variable
        value_list = real_mirrored_creator(**kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/mirrored_strategy.py:481 _real_mirrored_creator
        v = next_creator(**kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:67 getter
        return captured_getter(captured_previous, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py:714 variable_capturing_scope
        lifted_initializer_graph=lifted_initializer_graph, **kwds)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:264 __call__
        return super(VariableMetaclass, cls).__call__(*args, **kwargs)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py:227 __init__
        initial_value = initial_value()
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py:82 __call__
        self._checkpoint_position, shape, shard_info=shard_info)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py:117 __init__
        self.wrapped_value.set_shape(shape)
    /home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py:1217 set_shape
        (self.shape, shape))

    ValueError: Tensor's shape (3, 3, 256, 256) is not compatible with supplied shape (1, 1, 256, 6)

vighneshbirodkar commented 3 years ago

This is strange. Can you:

  1. Tell us which TF version you are using?
  2. Ensure that model_dir is empty before you start training, and report what happens?

anshkumar commented 3 years ago

1) The TF version is 2.4.1.
2) After clearing model_dir, it is working. But when fine_tune_checkpoint_type is set to "detection" or "fine_tune", it fails.

vighneshbirodkar commented 3 years ago

Can you clean up the directory, run with "fine_tune", and tell us what error you are getting?

anshkumar commented 3 years ago

After clearing the model_dir (leaving only ckpt-0.data-00000-of-00001 and ckpt-0.index), the "fine_tune" option works fine. But when using "detection", I'm getting a long list of errors.

vighneshbirodkar commented 3 years ago

model_dir should be completely empty before starting the training job for the first time. And fine tune checkpoint should be stored in a different directory which is not model_dir.

https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_training_and_evaluation.md#recommended-directory-structure-for-training-and-evaluation

The page above explains this. Note that everything in model_dir is created by the training/evaluation jobs.
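
The two directory rules above (an empty model_dir on the first run, and a fine-tune checkpoint stored outside model_dir) can be checked before launching a job. The following is only a minimal sketch using the Python standard library; `check_training_dirs` is a hypothetical helper and is not part of the Object Detection API.

```python
import os


def check_training_dirs(model_dir, fine_tune_checkpoint):
    """Return a list of layout problems (empty list if the layout looks fine).

    Hypothetical pre-flight check, not part of the Object Detection API.
    `fine_tune_checkpoint` is a checkpoint prefix such as ".../ckpt-0".
    """
    problems = []
    model_dir = os.path.abspath(model_dir)
    ckpt_dir = os.path.dirname(os.path.abspath(fine_tune_checkpoint))

    # model_dir must be empty (or not exist yet) before the first training run.
    if os.path.isdir(model_dir) and os.listdir(model_dir):
        problems.append("model_dir is not empty: " + model_dir)

    # The fine-tune checkpoint must live outside model_dir.
    if os.path.commonpath([model_dir, ckpt_dir]) == model_dir:
        problems.append("fine_tune_checkpoint is inside model_dir")

    return problems
```

An empty return value means both rules hold. A non-empty model_dir that already contains ckpt-0 files, as described earlier in this thread, would be reported on both counts.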

anshkumar commented 3 years ago

Even after doing that, I'm getting the following error for "detection":

Traceback (most recent call last):
  File "/home/deploy/models/research/object_detection/model_main_tf2.py", line 113, in <module>
    tf.compat.v1.app.run()
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/home/deploy/models/research/object_detection/model_main_tf2.py", line 110, in main
    record_summaries=FLAGS.record_summaries)
  File "/home/deploy/models/research/object_detection/model_lib_v2.py", line 597, in train_loop
    train_input, unpad_groundtruth_tensors)
  File "/home/deploy/models/research/object_detection/model_lib_v2.py", line 398, in load_fine_tune_checkpoint
    ckpt.restore(checkpoint_path).assert_existing_objects_matched()
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/tracking/util.py", line 810, in assert_existing_objects_matched
    (list(unused_python_objects),))
AssertionError: Some Python objects were not bound to checkpointed values, likely due to changes in the Python program: [MirroredVariable:{
  0: <tf.Variable 'center_net_hourglass_feature_extractor/hourglass_network/encoder_decoder_block_5/encoder_decoder_block_6/encoder_decoder_block_7/encoder_decoder_block_8/encoder_decoder_block_9/residual_block_58/convolutional_block_60/batchnorm/gamma:0' shape=(512,) dtype=float32, numpy=
...

Here is the complete error.

vighneshbirodkar commented 3 years ago

"detection" is not designed to work with this use case. With CenterNet, "detection" is currently supported only when starting from the ExtremeNet checkpoint in the model zoo.

Since "fine_tune" is working, I am closing this bug because that is the intended behavior.

vighneshbirodkar commented 3 years ago

Please note that this commit consolidates the "fine_tune" and "detection" types into just "detection":

https://github.com/tensorflow/models/commit/aa3e639f80c2967504310b0f578f0f00063a8aff

All TF2 models now support only three types: "detection", "classification" and "full".

anshkumar commented 3 years ago

I tried using "detection" again with the latest pull, but with the pre-trained checkpoints I'm getting the following error:

Traceback (most recent call last):
  File "/home/deploy/models/research/object_detection/model_main_tf2.py", line 113, in <module>
    tf.compat.v1.app.run()
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/home/deploy/models/research/object_detection/model_main_tf2.py", line 110, in main
    record_summaries=FLAGS.record_summaries)
  File "/home/deploy/models/research/object_detection/model_lib_v2.py", line 598, in train_loop
    train_input, unpad_groundtruth_tensors)
  File "/home/deploy/models/research/object_detection/model_lib_v2.py", line 400, in load_fine_tune_checkpoint
    ckpt.restore(checkpoint_path).assert_existing_objects_matched()
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/tracking/util.py", line 1776, in restore
    status = self._saver.restore(save_path=save_path)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/tracking/util.py", line 1339, in restore
    checkpoint=checkpoint, proto_id=0).restore(self._graph_view.root)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py", line 258, in restore
    restore_ops = trackable._restore_from_checkpoint_position(self)  # pylint: disable=protected-access
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py", line 978, in _restore_from_checkpoint_position
    tensor_saveables, python_saveables))
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/tracking/util.py", line 309, in restore_saveables
    validated_saveables).restore(self.save_path_tensor, self.options)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/saving/functional_saver.py", line 339, in restore
    restore_ops = restore_fn()
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/saving/functional_saver.py", line 323, in restore_fn
    restore_ops.update(saver.restore(file_prefix, options))
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/saving/functional_saver.py", line 116, in restore
    restored_tensors, restored_shapes=None)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/values.py", line 1079, in restore
    tensor)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/values_util.py", line 96, in get_on_write_restore_ops
    for v in var.values))
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/values_util.py", line 96, in <genexpr>
    for v in var.values))
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/distribute/values_util.py", line 302, in assign_on_device
    return variable.assign(tensor)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 901, in assign
    (tensor_name, self._shape, value_tensor.shape))
ValueError: Cannot assign to variable center_net_hourglass_feature_extractor/hourglass_network/input_downsample_block/convolutional_block/conv2d/kernel:0 due to variable shape (7, 7, 3, 64) and value shape (7, 7, 3, 128) are incompatible

Will the pre-trained checkpoint work only with 1024x1024 input and the "Hourglass-100" mask head?

vighneshbirodkar commented 3 years ago

Can you tell us which checkpoint you are trying this with? Also, please share the full stack trace of the error and the config file.

@anshkumar Pre-training should work with all mask heads.

anshkumar commented 3 years ago

I'm using the checkpoints provided in the document here. Here is my config:

# DeepMAC meta architecture from the "The surprising impact of mask-head
# architecture on novel class segmentation" [1] paper with an Hourglass-100[2]
# mask head. This config is trained on all COCO classes and achieves a
# mask mAP of 39.4% on the COCO testdev-2017 set.
# [1]: https://arxiv.org/abs/2104.00613
# [2]: https://arxiv.org/abs/1904.07850

# Train on TPU-128

model {
  center_net {
    num_classes: 5
    feature_extractor {
      type: "hourglass_52"
      bgr_ordering: true
      channel_means: [104.01362025, 114.03422265, 119.9165958 ]
      channel_stds: [73.6027665 , 69.89082075, 70.9150767 ]
    }
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 768
        max_dimension: 768
        pad_to_max_dimension: true
      }
    }
    object_detection_task {
      task_loss_weight: 1.0
      offset_loss_weight: 1.0
      scale_loss_weight: 0.1
      localization_loss {
        l1_localization_loss {
        }
      }
    }
    object_center_params {
      object_center_loss_weight: 1.0
      min_box_overlap_iou: 0.7
      max_box_predictions: 2000
      classification_loss {
        penalty_reduced_logistic_focal_loss {
          alpha: 2.0
          beta: 4.0
        }
      }
    }

    deepmac_mask_estimation {
      dim: 32
      task_loss_weight: 5.0
      pixel_embedding_dim: 16
      mask_size: 32
      use_xy: true
      use_instance_embedding: true
      network_type: "hourglass20"
      classification_loss {
        weighted_sigmoid {}
      }
    }
  }
}

train_config: {

  batch_size: 4
  num_steps: 50000

  data_augmentation_options {
    random_horizontal_flip {
    }
  }

  data_augmentation_options {
    random_adjust_hue {
    }
  }

  data_augmentation_options {
    random_adjust_contrast {
    }
  }

  data_augmentation_options {
    random_adjust_saturation {
    }
  }

  data_augmentation_options {
    random_adjust_brightness {
    }
  }

   #data_augmentation_options {
   #  random_square_crop_by_scale {
   #   scale_min: 0.6
   #   scale_max: 1.3
   # }
   # }

  optimizer {
    adam_optimizer: {
      epsilon: 1e-7  # Match tf.keras.optimizers.Adam's default.
      learning_rate: {
        cosine_decay_learning_rate {
          learning_rate_base: 1e-3
          total_steps: 50000
          warmup_learning_rate: 2.5e-4
          warmup_steps: 5000
        }
      }
    }
    use_moving_average: false
  }
  max_number_of_boxes: 100
  unpad_groundtruth_tensors: false

  fine_tune_checkpoint_version: V2
  fine_tune_checkpoint: "/home/deploy/ved/deepmac_1024x1024_coco17/pre-train/ckpt-0"
  fine_tune_checkpoint_type: "detection"
}

train_input_reader: {
  load_instance_masks: true
  label_map_path: "/home/deploy/ved/pfg/l2/sort/label_map_potato_l2.pbtxt"
  mask_type: PNG_MASKS
  tf_record_input_reader {
    input_path: "/home/deploy/ved/pfg/l2/sort/sort_train.record"
  }
}

eval_config: {
  metrics_set: "coco_detection_metrics"
  metrics_set: "coco_mask_metrics"
  include_metrics_per_category: true
  use_moving_averages: false
  batch_size: 1;
}

eval_input_reader: {
  load_instance_masks: true
  mask_type: PNG_MASKS
  label_map_path: "/home/deploy/ved/pfg/l2/sort/label_map_potato_l2.pbtxt"
  shuffle: false
  num_epochs: 1
  tf_record_input_reader {
    input_path: "/home/deploy/ved/pfg/l2/sort/sort_val.record"
  }
}

Here is the full log.

Also, with this config I'm not getting any mask loss in TensorBoard (when training without any pre-trained checkpoint).

vighneshbirodkar commented 3 years ago

This is happening because you are using the hourglass_52 feature extractor.

Hourglass-52 uses 64 channels in its first layers: https://github.com/tensorflow/models/blob/master/research/object_detection/models/center_net_hourglass_feature_extractor.py#L100

Hourglass-104, on the other hand, uses 128 channels, which matches the value shape reported in the error: https://github.com/tensorflow/models/blob/master/research/object_detection/models/center_net_hourglass_feature_extractor.py#L106

We don't support altering the number of channels right now. My suggestion would be to try and use the hourglass104 feature extractor.
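
A mismatch like this one can be spotted before training by comparing the shapes the model expects against the shapes stored in the checkpoint. The sketch below is only illustrative: `find_shape_mismatches` is a hypothetical helper, and the variable key is shortened from the error message above. In practice the checkpoint side could be built from `tf.train.list_variables(checkpoint_path)`, which returns (name, shape) pairs; note that key names in object-based checkpoints generally differ from variable names, so mapping one to the other is left as an assumption here.

```python
def find_shape_mismatches(model_shapes, checkpoint_shapes):
    """Return (name, model_shape, checkpoint_shape) for variables whose shapes differ.

    Both arguments map a variable name to its shape. Names missing from the
    checkpoint are skipped, since only shape conflicts are reported here.
    """
    mismatches = []
    for name, model_shape in model_shapes.items():
        ckpt_shape = checkpoint_shapes.get(name)
        if ckpt_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches.append((name, tuple(model_shape), tuple(ckpt_shape)))
    return mismatches


# Shapes taken from the error above: the hourglass_52 model expects 64 output
# channels in its first convolution, while the checkpoint stores 128.
# The key below is shortened and purely illustrative.
name = "input_downsample_block/convolutional_block/conv2d/kernel"
model_shapes = {name: (7, 7, 3, 64)}
checkpoint_shapes = {name: [7, 7, 3, 128]}

for var_name, expected, stored in find_shape_mismatches(model_shapes, checkpoint_shapes):
    print(var_name, "model expects", expected, "checkpoint has", stored)
```

Any reported entry means the checkpoint cannot be restored into that variable as-is, which is the situation the ValueError above describes.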

anshkumar commented 3 years ago

Thanks for the clarification. But why am I not getting a mask loss? Also, during validation it showed a key error for "detection_masks". Here is a temporary TensorBoard.

vighneshbirodkar commented 3 years ago

Oh, that might be a bug. Let me investigate.

vighneshbirodkar commented 3 years ago

@anshkumar This commit should fix the issue. https://github.com/tensorflow/models/commit/8b45de4ffc7eb8d66f0139ee1f62e699ee401072

anshkumar commented 3 years ago

@vighneshbirodkar it's missing an import:

from object_detection.meta_architectures import deepmac_meta_arch

vighneshbirodkar commented 3 years ago

Fixed via https://github.com/tensorflow/models/commit/441f14a6aac221406aeb98c96df3ef3d0c3752f9

I also added a test.

anshkumar commented 3 years ago

@vighneshbirodkar during validation, I'm getting the following error:

Traceback (most recent call last):  
  File "/home/deploy/models/research/object_detection/model_main_tf2.py", line 113, in <module>
    tf.compat.v1.app.run()
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/home/deploy/miniconda3/envs/tensorflow/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/home/deploy/models/research/object_detection/model_main_tf2.py", line 88, in main
    wait_interval=300, timeout=FLAGS.eval_timeout)
  File "/home/deploy/models/research/object_detection/model_lib_v2.py", line 1139, in eval_continuously
    global_step=global_step,
  File "/home/deploy/models/research/object_detection/model_lib_v2.py", line 984, in eager_eval_loop
    eval_metrics.update(evaluator.evaluate())
  File "/home/deploy/models/research/object_detection/metrics/coco_evaluation.py", line 307, in evaluate
    super_categories=self._super_categories)
  File "/home/deploy/models/research/object_detection/metrics/coco_tools.py", line 305, in ComputeMetrics
    raise ValueError('Category stats do not exist')
ValueError: Category stats do not exist