tensorflow / models

Models and examples built with TensorFlow

TypeError: 'NoneType' object is not iterable while performing model evaluation. #10833

Closed gokulh12 closed 1 year ago

gokulh12 commented 1 year ago

Hi, I am facing an error while trying to perform model evaluation after training: TypeError: 'NoneType' object is not iterable.

I am using the below code to get the model evaluation:

!python /content/gdrive/MyDrive/content/models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path={pipeline_file} \
    --model_dir={model_dir} \
    --checkpoint_dir={model_dir}

These are the paths:

pipeline_file = '/content/gdrive/MyDrive/content/models/research/deploy/pipeline_file.config'
model_dir = '/content/gdrive/MyDrive/content/log_files_barca_bayern'
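
(For context, model_main_tf2.py runs in evaluation-only mode whenever --checkpoint_dir is supplied. Below is an abridged paraphrase of that dispatch, consistent with the traceback frames further down; see the actual script for the full argument list.)

# Abridged sketch of model_main_tf2.py's dispatch: a non-empty
# --checkpoint_dir routes the run into continuous evaluation instead of
# training (compare the traceback frames at model_main_tf2.py:89 and
# model_lib_v2.py:1164 below).
if FLAGS.checkpoint_dir:
    model_lib_v2.eval_continuously(
        pipeline_config_path=FLAGS.pipeline_config_path,
        model_dir=FLAGS.model_dir,
        checkpoint_dir=FLAGS.checkpoint_dir,
        wait_interval=300,
        timeout=FLAGS.eval_timeout)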

Log while running the evaluation code:

2022-11-13 16:36:37.304526: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2022-11-13 16:36:38.053407: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2022-11-13 16:36:38.053514: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2022-11-13 16:36:38.053532: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
WARNING:tensorflow:Forced number of epochs for all eval validations to be 1.
W1113 16:36:40.238175 139810192238464 model_lib_v2.py:1090] Forced number of epochs for all eval validations to be 1.
INFO:tensorflow:Maybe overwriting sample_1_of_n_eval_examples: None
I1113 16:36:40.238409 139810192238464 config_util.py:552] Maybe overwriting sample_1_of_n_eval_examples: None
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I1113 16:36:40.238500 139810192238464 config_util.py:552] Maybe overwriting use_bfloat16: False
INFO:tensorflow:Maybe overwriting eval_num_epochs: 1
I1113 16:36:40.238580 139810192238464 config_util.py:552] Maybe overwriting eval_num_epochs: 1
WARNING:tensorflow:Expected number of evaluation epochs is 1, but instead encountered eval_on_train_input_config.num_epochs = 0. Overwriting num_epochs to 1.
W1113 16:36:40.238686 139810192238464 model_lib_v2.py:1110] Expected number of evaluation epochs is 1, but instead encountered eval_on_train_input_config.num_epochs = 0. Overwriting num_epochs to 1.
2022-11-13 16:36:41.088505: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
I1113 16:36:41.109091 139810192238464 ssd_efficientnet_bifpn_feature_extractor.py:146] EfficientDet EfficientNet backbone version: efficientnet-b0
I1113 16:36:41.109263 139810192238464 ssd_efficientnet_bifpn_feature_extractor.py:147] EfficientDet BiFPN num filters: 64
I1113 16:36:41.109331 139810192238464 ssd_efficientnet_bifpn_feature_extractor.py:149] EfficientDet BiFPN num iterations: 3
I1113 16:36:41.112856 139810192238464 efficientnet_model.py:143] round_filter input=32 output=32
I1113 16:36:41.145924 139810192238464 efficientnet_model.py:143] round_filter input=32 output=32
I1113 16:36:41.146051 139810192238464 efficientnet_model.py:143] round_filter input=16 output=16
I1113 16:36:41.218508 139810192238464 efficientnet_model.py:143] round_filter input=16 output=16
I1113 16:36:41.218694 139810192238464 efficientnet_model.py:143] round_filter input=24 output=24
I1113 16:36:41.404295 139810192238464 efficientnet_model.py:143] round_filter input=24 output=24
I1113 16:36:41.404441 139810192238464 efficientnet_model.py:143] round_filter input=40 output=40
I1113 16:36:41.577770 139810192238464 efficientnet_model.py:143] round_filter input=40 output=40
I1113 16:36:41.577946 139810192238464 efficientnet_model.py:143] round_filter input=80 output=80
I1113 16:36:41.833776 139810192238464 efficientnet_model.py:143] round_filter input=80 output=80
I1113 16:36:41.833942 139810192238464 efficientnet_model.py:143] round_filter input=112 output=112
I1113 16:36:42.104938 139810192238464 efficientnet_model.py:143] round_filter input=112 output=112
I1113 16:36:42.105093 139810192238464 efficientnet_model.py:143] round_filter input=192 output=192
I1113 16:36:42.436462 139810192238464 efficientnet_model.py:143] round_filter input=192 output=192
I1113 16:36:42.436625 139810192238464 efficientnet_model.py:143] round_filter input=320 output=320
I1113 16:36:42.518593 139810192238464 efficientnet_model.py:143] round_filter input=1280 output=1280
I1113 16:36:42.559586 139810192238464 efficientnet_model.py:453] Building model efficientnet with params ModelConfig(width_coefficient=1.0, depth_coefficient=1.0, resolution=224, dropout_rate=0.2, blocks=(BlockConfig(input_filters=32, output_filters=16, kernel_size=3, num_repeat=1, expand_ratio=1, strides=(1, 1), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=16, output_filters=24, kernel_size=3, num_repeat=2, expand_ratio=6, strides=(2, 2), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=24, output_filters=40, kernel_size=5, num_repeat=2, expand_ratio=6, strides=(2, 2), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=40, output_filters=80, kernel_size=3, num_repeat=3, expand_ratio=6, strides=(2, 2), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=80, output_filters=112, kernel_size=5, num_repeat=3, expand_ratio=6, strides=(1, 1), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=112, output_filters=192, kernel_size=5, num_repeat=4, expand_ratio=6, strides=(2, 2), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=192, output_filters=320, kernel_size=3, num_repeat=1, expand_ratio=6, strides=(1, 1), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise')), stem_base_filters=32, top_base_filters=1280, activation='simple_swish', batch_norm='default', bn_momentum=0.99, bn_epsilon=0.001, weight_decay=5e-06, drop_connect_rate=0.2, depth_divisor=8, min_depth=None, use_se=True, input_channels=3, num_classes=1000, model_name='efficientnet', rescale_input=False, data_format='channels_last', dtype='float32')
INFO:tensorflow:Reading unweighted datasets: ['/content/gdrive/MyDrive/content/test/teams.tfrecord']
I1113 16:36:42.612493 139810192238464 dataset_builder.py:162] Reading unweighted datasets: ['/content/gdrive/MyDrive/content/test/teams.tfrecord']
INFO:tensorflow:Reading record datasets for input file: ['/content/gdrive/MyDrive/content/test/teams.tfrecord']
I1113 16:36:42.612948 139810192238464 dataset_builder.py:79] Reading record datasets for input file: ['/content/gdrive/MyDrive/content/test/teams.tfrecord']
INFO:tensorflow:Number of filenames to read: 1
I1113 16:36:42.613093 139810192238464 dataset_builder.py:80] Number of filenames to read: 1
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
W1113 16:36:42.613175 139810192238464 dataset_builder.py:87] num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/object_detection/builders/dataset_builder.py:104: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.AUTOTUNE) instead. If sloppy execution is desired, use tf.data.Options.deterministic.
W1113 16:36:42.616040 139810192238464 deprecation.py:356] From /usr/local/lib/python3.7/dist-packages/object_detection/builders/dataset_builder.py:104: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.AUTOTUNE) instead. If sloppy execution is desired, use tf.data.Options.deterministic.
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/object_detection/builders/dataset_builder.py:236: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.map()
W1113 16:36:42.630358 139810192238464 deprecation.py:356] From /usr/local/lib/python3.7/dist-packages/object_detection/builders/dataset_builder.py:236: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.map()
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:1176: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a tf.sparse.SparseTensor and use tf.sparse.to_dense instead.
W1113 16:36:46.364809 139810192238464 deprecation.py:356] From /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:1176: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a tf.sparse.SparseTensor and use tf.sparse.to_dense instead.
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:1176: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
W1113 16:36:47.720357 139810192238464 deprecation.py:356] From /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:1176: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
INFO:tensorflow:Waiting for new checkpoint at /content/gdrive/MyDrive/content/log_files_barca_bayern
I1113 16:36:50.155207 139810192238464 checkpoint_utils.py:142] Waiting for new checkpoint at /content/gdrive/MyDrive/content/log_files_barca_bayern
INFO:tensorflow:Found new checkpoint at /content/gdrive/MyDrive/content/log_files_barca_bayern/ckpt-11
I1113 16:36:51.766162 139810192238464 checkpoint_utils.py:151] Found new checkpoint at /content/gdrive/MyDrive/content/log_files_barca_bayern/ckpt-11
/usr/local/lib/python3.7/dist-packages/keras/backend.py:452: UserWarning: tf.keras.backend.set_learning_phase is deprecated and will be removed after 2020-10-11. To update it, simply pass a True/False value to the training argument of the __call__ method of your layer or model.
"tf.keras.backend.set_learning_phase is deprecated and "
Traceback (most recent call last):
  File "/content/gdrive/MyDrive/content/models/research/object_detection/model_main_tf2.py", line 114, in <module>
    tf.compat.v1.app.run()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/platform/app.py", line 36, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/content/gdrive/MyDrive/content/models/research/object_detection/model_main_tf2.py", line 89, in main
    wait_interval=300, timeout=FLAGS.eval_timeout)
  File "/usr/local/lib/python3.7/dist-packages/object_detection/model_lib_v2.py", line 1164, in eval_continuously
    global_step=global_step,
  File "/usr/local/lib/python3.7/dist-packages/object_detection/model_lib_v2.py", line 1009, in eager_eval_loop
    for evaluator in evaluators:
TypeError: 'NoneType' object is not iterable
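
(For reference, the failing statement at model_lib_v2.py:1009 is the plain Python pattern sketched below. get_evaluators here is a hypothetical stand-in for whatever produced evaluators inside eager_eval_loop, not the API's actual code.)

# Minimal sketch of the failure mode, assuming some helper handed
# eager_eval_loop a None instead of a list of evaluator objects.
def get_evaluators():
    # Hypothetical stand-in: returns None instead of an empty list,
    # which is the precondition for the TypeError above.
    return None

evaluators = get_evaluators()

# The failing pattern (commented out -- it raises the TypeError):
# for evaluator in evaluators:
#     ...

# A defensive variant iterates zero times instead of raising:
for evaluator in evaluators or []:
    pass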

This issue happens only when I try to evaluate the model and get the performance metrics. There are no issues when I use the trained model to actually identify objects in test images; in fact, I get the final output with bounding boxes and acceptable performance.

Here is my config file:

# SSD with EfficientNet-b0 + BiFPN feature extractor,
# shared box predictor and focal loss (a.k.a. EfficientDet-d0).
# See EfficientDet, Tan et al, https://arxiv.org/abs/1911.09070
# See Lin et al, https://arxiv.org/abs/1708.02002
# Trained on COCO, initialized from an EfficientNet-b0 checkpoint.
# Train on TPU-8

model {
  ssd {
    inplace_batchnorm_update: true
    freeze_batchnorm: false
    num_classes: 5
    add_background_class: false
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
        use_matmul_gather: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    encode_background_as_zeros: true
    anchor_generator {
      multiscale_anchor_generator {
        min_level: 3
        max_level: 7
        anchor_scale: 4.0
        aspect_ratios: [1.0, 2.0, 0.5]
        scales_per_octave: 3
      }
    }
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 512
        max_dimension: 512
        pad_to_max_dimension: true
      }
    }
    box_predictor {
      weight_shared_convolutional_box_predictor {
        depth: 64
        class_prediction_bias_init: -4.6
        conv_hyperparams {
          force_use_bias: true
          activation: SWISH
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            random_normal_initializer {
              stddev: 0.01
              mean: 0.0
            }
          }
          batch_norm {
            scale: true
            decay: 0.99
            epsilon: 0.001
          }
        }
        num_layers_before_predictor: 3
        kernel_size: 3
        use_depthwise: true
      }
    }
    feature_extractor {
      type: 'ssd_efficientnet-b0_bifpn_keras'
      bifpn {
        min_level: 3
        max_level: 7
        num_iterations: 3
        num_filters: 64
      }
      conv_hyperparams {
        force_use_bias: true
        activation: SWISH
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          scale: true,
          decay: 0.99,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid_focal {
          alpha: 0.25
          gamma: 1.5
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    normalize_loc_loss_by_codesize: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.5
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  fine_tune_checkpoint: "/content/gdrive/MyDrive/content/models/research/deploy/efficientdet_d0_coco17_tpu-32/checkpoint/ckpt-0"
  fine_tune_checkpoint_version: V2
  fine_tune_checkpoint_type: "detection"
  batch_size: 16
  sync_replicas: true
  startup_delay_steps: 0
  replicas_to_aggregate: 8
  use_bfloat16: true
  num_steps: 8000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    random_scale_crop_and_pad_to_square {
      output_size: 512
      scale_min: 0.1
      scale_max: 2.0
    }
  }
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        cosine_decay_learning_rate {
          learning_rate_base: 8e-2
          total_steps: 300000
          warmup_learning_rate: .001
          warmup_steps: 2500
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  max_number_of_boxes: 100
  unpad_groundtruth_tensors: false
}

train_input_reader: {
  label_map_path: "/content/gdrive/MyDrive/content/train/teams_label_map.pbtxt"
  tf_record_input_reader {
    input_path: "/content/gdrive/MyDrive/content/train/teams.tfrecord"
  }
}

eval_config: {
  metrics_set: "coco_detection_metrics"
  use_moving_averages: false
  batch_size: 16;
}

eval_input_reader: {
  label_map_path: "/content/gdrive/MyDrive/content/train/teams_label_map.pbtxt"
  shuffle: false
  num_epochs: 1
  tf_record_input_reader {
    input_path: "/content/gdrive/MyDrive/content/test/teams.tfrecord"
  }
}
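
(A quick way to confirm that a pipeline config like the one above parses cleanly is to load it with the API's config_util. A minimal sketch, assuming the pipeline_file.config path used earlier in this issue:)

# Sketch: parse the pipeline config and inspect the eval settings.
# config_util.get_configs_from_pipeline_file is part of the Object
# Detection API and returns a dict of the parsed proto sections.
from object_detection.utils import config_util

configs = config_util.get_configs_from_pipeline_file(
    '/content/gdrive/MyDrive/content/models/research/deploy/pipeline_file.config')
print(configs['eval_config'])        # expect metrics_set: "coco_detection_metrics"
print(configs['eval_input_config'])  # expect the test tfrecord path
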
laxmareddyp commented 1 year ago

Hi @gokulh12,

In order to expedite the troubleshooting process, please provide a code snippet to reproduce the issue reported here. Thanks!

gokulh12 commented 1 year ago

Hi @laxmareddyp Here is the complete code:

#mounting drive 
from google.colab import drive
drive.mount('/content/gdrive')

cd /content/gdrive/MyDrive/content

import os
import pathlib

# Clone the tensorflow models repository if it doesn't already exist
if "models" in pathlib.Path.cwd().parts:
    while "models" in pathlib.Path.cwd().parts:
        os.chdir('..')
elif not pathlib.Path('models').exists():
    !git clone --depth 1 https://github.com/tensorflow/models

# Install the Object Detection API
%%bash
cd /content/gdrive/MyDrive/content/models/research
protoc object_detection/protos/*.proto --python_out=.
cp object_detection/packages/tf2/setup.py .
python -m pip install .

import matplotlib
import matplotlib.pyplot as plt
import os
import random
import io
import imageio
import glob
import scipy.misc
import numpy as np
from six import BytesIO
from PIL import Image, ImageDraw, ImageFont
from IPython.display import display, Javascript
from IPython.display import Image as IPyImage

import tensorflow as tf

from object_detection.utils import label_map_util
from object_detection.utils import config_util
from object_detection.utils import visualization_utils as viz_utils
from object_detection.utils import colab_utils
from object_detection.builders import model_builder

%matplotlib inline

#run model builder test
!python /content/gdrive/MyDrive/content/models/research/object_detection/builders/model_builder_tf2_test.py

#downloading tfrecords
%cd /content/gdrive/MyDrive/content
!curl -L "https://app.roboflow.com/ds/UDGPGXHzX8?key=qCqU0Cwfzl" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip

##change chosen model to deploy different models available in the TF2 object detection zoo
MODELS_CONFIG = {
   'efficientdet-d0': {
       'model_name': 'efficientdet_d0_coco17_tpu-32',
       'base_pipeline_file': 'ssd_efficientdet_d0_512x512_coco17_tpu-8.config',
       'pretrained_checkpoint': 'efficientdet_d0_coco17_tpu-32.tar.gz',
       'batch_size': 16
   },
   'efficientdet-d1': {
       'model_name': 'efficientdet_d1_coco17_tpu-32',
       'base_pipeline_file': 'ssd_efficientdet_d1_640x640_coco17_tpu-8.config',
       'pretrained_checkpoint': 'efficientdet_d1_coco17_tpu-32.tar.gz',
       'batch_size': 16
   },
   'efficientdet-d2': {
       'model_name': 'efficientdet_d2_coco17_tpu-32',
       'base_pipeline_file': 'ssd_efficientdet_d2_768x768_coco17_tpu-8.config',
       'pretrained_checkpoint': 'efficientdet_d2_coco17_tpu-32.tar.gz',
       'batch_size': 16
   },
   'efficientdet-d3': {
       'model_name': 'efficientdet_d3_coco17_tpu-32',
       'base_pipeline_file': 'ssd_efficientdet_d3_896x896_coco17_tpu-32.config',
       'pretrained_checkpoint': 'efficientdet_d3_coco17_tpu-32.tar.gz',
       'batch_size': 16
   }
}

chosen_model = 'efficientdet-d0'
num_steps = 10000 #The more steps, the longer the training. Increase if your loss function is still decreasing and validation metrics are increasing.
num_eval_steps = 500 #Perform evaluation after so many steps
model_name = MODELS_CONFIG[chosen_model]['model_name']
pretrained_checkpoint = MODELS_CONFIG[chosen_model]['pretrained_checkpoint']
base_pipeline_file = MODELS_CONFIG[chosen_model]['base_pipeline_file']
batch_size = MODELS_CONFIG[chosen_model]['batch_size'] #if you can fit a large batch in memory, it may speed up your training

#download pretrained weights
%mkdir /content/gdrive/MyDrive/content/models/research/deploy/
%cd /content/gdrive/MyDrive/content/models/research/deploy/
import tarfile
download_tar = 'http://download.tensorflow.org/models/object_detection/tf2/20200711/' + pretrained_checkpoint
!wget {download_tar}
tar = tarfile.open(pretrained_checkpoint)
tar.extractall()
tar.close()

#download base training configuration file
%cd /content/gdrive/MyDrive/content/models/research/deploy
download_config = 'https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/configs/tf2/' + base_pipeline_file
!wget {download_config}

pipeline_fname = '/content/gdrive/MyDrive/content/models/research/deploy/' + base_pipeline_file
fine_tune_checkpoint = '/content/gdrive/MyDrive/content/models/research/deploy/' + model_name + '/checkpoint/ckpt-0'

#calculate number of classes
def get_num_classes(pbtxt_fname):
    from object_detection.utils import label_map_util
    # use the argument rather than a hardcoded path
    label_map = label_map_util.load_labelmap(pbtxt_fname)
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes=90, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)
    return len(category_index.keys())
num_classes = get_num_classes('/content/gdrive/MyDrive/content/test/teams_label_map.pbtxt')

#read the config file and write file directories 
import re
%cd /content/gdrive/MyDrive/content/models/research/deploy
print('writing custom configuration file')
with open(pipeline_fname) as f:
    s = f.read()
with open('pipeline_file.config', 'w') as f:
    # fine_tune_checkpoint
    s = re.sub('fine_tune_checkpoint: ".*?"',
               'fine_tune_checkpoint: "{}"'.format(fine_tune_checkpoint), s)
    # tfrecord files train and test.
    s = re.sub(
        '(input_path: ".*?)(PATH_TO_BE_CONFIGURED/train)(.*?")',
        'input_path: "{}"'.format('/content/gdrive/MyDrive/content/train/teams.tfrecord'), s)
    s = re.sub(
        '(input_path: ".*?)(PATH_TO_BE_CONFIGURED/val)(.*?")',
        'input_path: "{}"'.format('/content/gdrive/MyDrive/content/test/teams.tfrecord'), s)
    # label_map_path
    s = re.sub(
        'label_map_path: ".*?"',
        'label_map_path: "{}"'.format('/content/gdrive/MyDrive/content/train/teams_label_map.pbtxt'), s)
    # Set training batch_size.
    s = re.sub('batch_size: [0-9]+',
               'batch_size: {}'.format(batch_size), s)
    # Set training steps, num_steps
    s = re.sub('num_steps: [0-9]+',
               'num_steps: {}'.format(num_steps), s)
    # Set number of classes num_classes.
    s = re.sub('num_classes: [0-9]+',
               'num_classes: {}'.format(num_classes), s)
    # fine-tune checkpoint type
    s = re.sub(
        'fine_tune_checkpoint_type: "classification"',
        'fine_tune_checkpoint_type: "{}"'.format('detection'), s)

    f.write(s)

%cat /content/gdrive/MyDrive/content/models/research/deploy/pipeline_file.config

pipeline_file = '/content/gdrive/MyDrive/content/models/research/deploy/pipeline_file.config'
model_dir = '/content/gdrive/MyDrive/content/training/'

#training the model 
!python /content/gdrive/MyDrive/content/models/research/object_detection/model_main_tf2.py \
   --pipeline_config_path={pipeline_file} \
   --model_dir={model_dir} \
   --alsologtostderr \
   --num_train_steps={num_steps} \
   --sample_1_of_n_eval_examples=1 \
   --num_eval_steps={num_eval_steps}

#model evaluation
!python /content/gdrive/MyDrive/content/models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path={pipeline_file} \
    --model_dir={model_dir} \
    --checkpoint_dir={model_dir}
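
(As a quick sanity check before launching the evaluation cell above, the directory passed as --checkpoint_dir can be probed directly. A minimal sketch, using the checkpoint directory from the evaluation log in this thread:)

# Sketch: verify that the evaluation job can see a checkpoint.
# tf.train.latest_checkpoint returns the newest checkpoint prefix
# (e.g. ".../ckpt-11") or None if the directory contains no checkpoint.
import tensorflow as tf

model_dir = '/content/gdrive/MyDrive/content/log_files_barca_bayern'
print(tf.train.latest_checkpoint(model_dir))
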
laxmareddyp commented 1 year ago

Hi

The config file downloaded with the model checkpoint may have had mistakes, so I have changed the pipeline.config file as follows:

model {
  ssd {
    num_classes: 5
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 256
        max_dimension: 256
        pad_to_max_dimension: true
      }
    }
    feature_extractor {
      type: "ssd_efficientnet-b0_bifpn_keras"
      conv_hyperparams {
        regularizer {
          l2_regularizer {
            weight: 3.9999998989515007e-05
          }
        }
        initializer {
          truncated_normal_initializer {
            mean: 0.0
            stddev: 0.029999999329447746
          }
        }
        activation: SWISH
        batch_norm {
          decay: 0.9900000095367432
          scale: true
          epsilon: 0.0010000000474974513
        }
        force_use_bias: true
      }
      bifpn {
        min_level: 3
        max_level: 7
        num_iterations: 3
        num_filters: 64
      }
    }
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 1.0
        x_scale: 1.0
        height_scale: 1.0
        width_scale: 1.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
        use_matmul_gather: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    box_predictor {
      weight_shared_convolutional_box_predictor {
        conv_hyperparams {
          regularizer {
            l2_regularizer {
              weight: 3.9999998989515007e-05
            }
          }
          initializer {
            random_normal_initializer {
              mean: 0.0
              stddev: 0.009999999776482582
            }
          }
          activation: SWISH
          batch_norm {
            decay: 0.9900000095367432
            scale: true
            epsilon: 0.0010000000474974513
          }
          force_use_bias: true
        }
        depth: 64
        num_layers_before_predictor: 3
        kernel_size: 3
        class_prediction_bias_init: -4.599999904632568
        use_depthwise: true
      }
    }
    anchor_generator {
      multiscale_anchor_generator {
        min_level: 3
        max_level: 7
        anchor_scale: 4.0
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        scales_per_octave: 3
      }
    }
    post_processing {
      batch_non_max_suppression {
        score_threshold: 9.99999993922529e-09
        iou_threshold: 0.5
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
    normalize_loss_by_num_matches: true
    loss {
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      classification_loss {
        weighted_sigmoid_focal {
          gamma: 1.5
          alpha: 0.25
        }
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    encode_background_as_zeros: true
    normalize_loc_loss_by_codesize: true
    inplace_batchnorm_update: true
    freeze_batchnorm: false
    add_background_class: false
  }
}
train_config {
  batch_size: 16
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    random_scale_crop_and_pad_to_square {
      output_size: 128
      scale_min: 0.10000000149011612
      scale_max: 2.0
    }
  }
  sync_replicas: true
  optimizer {
    momentum_optimizer {
      learning_rate {
        cosine_decay_learning_rate {
          learning_rate_base: 0.07999999821186066
          total_steps: 500
          warmup_learning_rate: 0.0010000000474974513
          warmup_steps: 100
        }
      }
      momentum_optimizer_value: 0.8999999761581421
    }
    use_moving_average: false
  }
  fine_tune_checkpoint: "/content/efficientdet_d0_coco17_tpu-32/checkpoint/ckpt-0"
  num_steps: 500
  startup_delay_steps: 0.0
  replicas_to_aggregate: 8
  max_number_of_boxes: 100
  unpad_groundtruth_tensors: false
  fine_tune_checkpoint_type: "detection"
  use_bfloat16: true
  fine_tune_checkpoint_version: V2
}
train_input_reader {
  label_map_path: "/content/roboflow/train/teams_label_map.pbtxt"
  tf_record_input_reader {
    input_path: "/content/roboflow/train/teams.tfrecord"
  }
}

eval_config {
  metrics_set: "coco_detection_metrics"
  use_moving_averages: false
  batch_size: 1;
}

eval_input_reader {
  label_map_path: "/content/roboflow/train/teams_label_map.pbtxt"
  shuffle: false
  num_epochs: 1
  tf_record_input_reader {
    input_path: "/content/roboflow/test/teams.tfrecord"
  }
}

Also, please find the gist using the above configuration; I hope this will help you resolve the issue.

Thanks.

google-ml-butler[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.

google-ml-butler[bot] commented 1 year ago

Closing as stale. Please reopen if you'd like to work on this further.

google-ml-butler[bot] commented 1 year ago

Are you satisfied with the resolution of your issue?

yadavdgv commented 1 year ago

Still, there is an issue with this model_main_tf2.py approach for evaluation.