google / automl

Google Brain AutoML

Custom dataset 0 mAP #68

Closed CraigWang1 closed 4 years ago

CraigWang1 commented 4 years ago

Has anyone successfully trained on a custom dataset?

I tried training on a custom dataset with one class and 1700 images in colab:

# train!
!python main.py --training_file_pattern=tfrecord/train* --model_dir=models --hparams="use_bfloat16=false,num_classes=1,learning_rate=0.001" --use_tpu=False \
  --model_name='efficientdet-d0' --train_batch_size 16 \
  --validation_file_pattern=tfrecord/val* --val_json_file='/content/unzipped_dset/bin_COCO/data/COCO/annotations/instances_val2017.json' \
  --mode='train_and_eval'

The loss looked normal after a few epochs: edet_0_loss

However, when I ran eval, I got 0 mAP. Command:

# eval
!python main.py --hparams="use_bfloat16=false,num_classes=1" --use_tpu=False \
  --model_name='efficientdet-d0' --model_dir=models \
  --validation_file_pattern=tfrecord/val* --val_json_file='/content/unzipped_dset/bin_COCO/data/COCO/annotations/instances_val2017.json' \
  --mode='eval' --logtostderr

Output:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000

Has anyone else had this issue?

ancorasir commented 4 years ago

@CraigWang1, have you tried training it for a longer time? The loss still looks large. I also tried training on my custom dataset, but the loss didn't converge to 0. #25

Besides, I raised another issue about image normalization; I'm not sure if this is related. #69

CraigWang1 commented 4 years ago

@ancorasir Thanks for your reply! After letting it train for a few hours and several more epochs, the mAP is still 0.

ancorasir commented 4 years ago

@CraigWang1 How about the loss? Does it keep decreasing or saturate to some value?

CraigWang1 commented 4 years ago

It is decreasing, but only by a tiny bit: loss2

WonTaeYeon commented 4 years ago

Did you change only the parameter values so that only one class is learned?

CraigWang1 commented 4 years ago

@WonTaeYeon Yes, I added this flag: --hparams="use_bfloat16=false,num_classes=1" Were you able to train a model on a custom dataset?

WonTaeYeon commented 4 years ago

@CraigWang1 I am also wrestling with the same problem. A model changed to one class does not work at inference time. Did you also change the TFRecord?

CraigWang1 commented 4 years ago

@WonTaeYeon Yeah, I have! I've also tried inference, but it doesn't work for me either; it fails with:

Invalid argument: Input to reshape is a tensor with 36864 values, but the requested shape requires a multiple of 90
     [[{{node Reshape}}]]
     [[strided_slice_8/_1427]]
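
For what it's worth, the numbers line up with a one-class head being reshaped against the default 90-class config. A rough sanity check, assuming the D0 defaults (512 input, min_level 3, 3 scales x 3 aspect ratios), so this is only a sketch:

# Hypothetical back-of-the-envelope check, not code from this repo.
feat = 512 // 8                                # 64: level-3 feature map size for a 512 input
anchors_per_cell = 3 * 3                       # 3 scales x 3 aspect ratios
values = feat * feat * anchors_per_cell * 1    # class outputs when trained with num_classes=1
print(values)                                  # 36864
print(values % 90)                             # 54: not a multiple of 90, so a reshape to 90 classes fails
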
ALagre commented 4 years ago

Hi @CraigWang1, I am facing the same issue with the COCO dataset. Have you trained on COCO? How is your box loss? I have just set the batch size to 8 and the learning rate to 0.008. Screenshot from 2020-03-27 10-32-21

CraigWang1 commented 4 years ago

Hi @ALagre, no, I have not trained on COCO. This is my box loss on my custom dataset; it's also all 0s: box loss. I think it's very strange that the mAP is 0 even on the COCO dataset, since the author has already trained on it.

ancorasir commented 4 years ago

@CraigWang1 @ALagre I have a similar problem training on my custom dataset of 4 categories and 10k+ images. My training looks OK as shown below, except the loss stops decreasing after 5 epochs. The batch size is 8, and the initial learning rate is about 0.0001, decreasing after 10 epochs, so I don't think the learning rate is too small.

I ran inference on a single image from the training data itself, and the result is totally wrong, which means the model has not learned the training data yet.

Screenshot from 2020-03-29 11-10-01

mingxingtan commented 4 years ago

How about starting from a COCO-pretrained model and fine-tuning on the custom dataset? Does anyone have a small example dataset to share?

b03505036 commented 4 years ago

I can run the test, but how do I use a pretrained model? Could anyone give me an example?

CraigWang1 commented 4 years ago

How about starting from a COCO-pretrained model and fine-tuning on the custom dataset? Does anyone have a small example dataset to share?

Here is my dataset of about 1900 512x380 images that I'm training on with one 'bin' class (586 MB).

By starting from a COCO-pretrained model, do you mean to download it and then start training from that checkpoint, but on the custom dataset?

YangZhangMizzou commented 4 years ago

Hello, could you tell me how to pass a training directory to training_file_pattern? I got "Failed precondition: new_bird/train; Is a directory" when I tried to pass the directory itself. Thank you!

mingxingtan commented 4 years ago

To initialize models with coco-pretrained weights, you can do something like this:

python main.py --training_file_pattern=/coco_tfrecord/train-00000-of-00256.tfrecord \
    --val_json_file=/coco_tfrecord//annotations/instances_val2017.json \
    --model_name=efficientdet-d0 \
    --model_dir=/tmp/test/ \
    --ckpt=/ckpt/efficientdet/efficientdet-d0 \
    --hparams="use_bfloat16=false" --use_tpu=False

Similar to issue https://github.com/google/automl/issues/40

mingxingtan commented 4 years ago

@CraigWang1 Can you try it this way? If you still have issues, could you convert your dataset into tfrecord by following the tutorial, and I will try your dataset. Thanks!

YangZhangMizzou commented 4 years ago

Problem Solved! Thank you!

mingxingtan commented 4 years ago

Also submitted a change that allows you to exclude some variables. For example, I have verified this command line works:

python main.py --training_file_pattern=/coco_tfrecord/train-00000-of-00256.tfrecord \
    --val_json_file=/coco_tfrecord//annotations/instances_val2017.json \
    --model_name=efficientdet-d0 \
    --model_dir=/tmp/test/ \
    --ckpt=/ckpt/efficientdet/efficientdet-d0 \
    --hparams="use_bfloat16=false,num_classes=10,var_exclude_expr=r'.*/class-predict/.*'" \
    --use_tpu=False
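
Roughly speaking, the new hparam filters checkpoint variables by name, so anything matching the regex (here the 90-class prediction head) is skipped during restore and freshly initialized. A minimal sketch of that kind of filter, with illustrative variable names rather than the repo's exact restore code:

import re

def restorable_vars(var_names, exclude_expr=r'.*/class-predict/.*'):
    """Drop variables whose names match exclude_expr from the restore list."""
    pattern = re.compile(exclude_expr)
    return [name for name in var_names if not pattern.match(name)]

names = ['efficientnet-b0/stem/conv2d/kernel',   # illustrative names only
         'class_net/class-predict/bias',
         'box_net/box-predict/kernel']
print(restorable_vars(names))  # the class-predict head is excluded and re-initialized
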
CraigWang1 commented 4 years ago

@mingxingtan Thank you for your suggestions! Using a pretrained backbone, the loss looked much better, but the eval mAP was still 0 and I had trouble using the model to predict.

Train command:

!python main.py \
    --training_file_pattern=tfrecord/train* \
    --val_json_file=/coco_tfrecord//annotations/instances_val2017.json \
    --model_name=efficientdet-d0 \
    --model_dir=models \
    --ckpt=efficientdet-d0 \
    --hparams="use_bfloat16=false,num_classes=1,var_exclude_expr=r'.*/class-predict/.*'" \
    --train_batch_size 4 \
    --use_tpu=False

Loss: better_loss Eval command:

# eval
!python main.py  \
  --validation_file_pattern=tfrecord/val* \
  --val_json_file='/content/unzipped_dset/bin_COCO/data/COCO/annotations/instances_val2017.json' \
  --hparams="use_bfloat16=false,num_classes=1,var_exclude_expr=r'.*/class-predict/.*'" \
  --use_tpu=False \
  --model_name='efficientdet-d0' \
  --model_dir=models \
  --mode='eval'

Eval results:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
INFO:tensorflow:Inference Time : 21.57176s
I0329 20:35:12.562011 140277525276544 evaluation.py:273] Inference Time : 21.57176s

Inference code:

# Inference
import tensorflow.compat.v1 as tf

img_path = '/content/729.png'
img_out_dir = '/content'
image_size = 512

MODEL = 'efficientdet-d0'
ckpt_path = '/content/automl/efficientdet/models'

min_score_thresh = 0.2  #@param
max_boxes_to_draw = 100  #@param
line_thickness = 4  #@param

# call InferenceDriver
import inference
tf.reset_default_graph()
driver = inference.InferenceDriver(MODEL, ckpt_path, image_size=image_size, label_id_mapping={0:'bin'})
driver.inference(img_path,
                 img_out_dir,
                 min_score_thresh=min_score_thresh,
                 max_boxes_to_draw=max_boxes_to_draw,
                 line_thickness=line_thickness)

Inference error: Input to reshape is a tensor with 36864 values, but the requested shape requires a multiple of 90. I tried changing num_classes in hparams_config.py to 1 for inference, but it still gave me this error. This is very puzzling. Here is my tfrecord dataset.

watertianyi commented 4 years ago

I trained on my own dataset with three categories and got the following error: ValueError: Shape of variable class_net/class-predict/bias:0 ((27,)) doesn't match with shape of tensor class_net/class-predict/bias ([810]) from checkpoint reader.

The command is as follows:

python main.py --training_file_pattern=tfrecrod_gstrain/train* \
    --model_name=efficientdet-d0 \
    --model_dir=ckpt/gs \
    --ckpt=efficientdet-d0 \
    --hparams="use_bfloat16=false,num_classes=3" --use_tpu=False
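
For what it's worth, the two shapes are just a 3-class head versus the 90-class head stored in the COCO checkpoint. A rough check, assuming the default 3 scales x 3 aspect ratios (9 anchors per location):

# Hypothetical arithmetic, not code from this repo.
anchors_per_cell = 3 * 3
print(anchors_per_cell * 3)    # 27  -> class-predict bias built for num_classes=3
print(anchors_per_cell * 90)   # 810 -> class-predict bias stored in the COCO (90-class) checkpoint

This is the mismatch that the var_exclude_expr=r'.*/class-predict/.*' hparam shown earlier in this thread is meant to skip during restore.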

ALagre commented 4 years ago

Hi @CraigWang1, @ancorasir

I made a mistake creating the tfrecords. That is why the box_loss was all 0s, and the mAP as well. I will train COCO with the correct tfrecords. Cheers
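
For anyone double-checking their records, here is a quick way to confirm that boxes actually made it into a shard. The feature keys are assumed to follow the create_coco_tfrecord.py convention, and the shard name is hypothetical:

import tensorflow.compat.v1 as tf

def count_boxes(tfrecord_path, num_examples=5):
    """Print how many ground-truth boxes the first few examples contain."""
    for i, record in enumerate(tf.python_io.tf_record_iterator(tfrecord_path)):
        if i >= num_examples:
            break
        example = tf.train.Example.FromString(record)
        xmins = example.features.feature['image/object/bbox/xmin'].float_list.value
        print('example %d: %d boxes' % (i, len(xmins)))

count_boxes('tfrecord/val-00000-of-00032.tfrecord')  # hypothetical shard name

If every example prints 0 boxes, that would explain a box_loss of 0 and an mAP of 0.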

CraigWang1 commented 4 years ago

Hi @ALagre, so now you can get an mAP? I also tried evaluating on my train set (which I could train on, so I knew the tfrecords were valid), but I still got the same results with mAP=0. Have you fixed your issue?

@honghande It seems like you already solved your problem in your recent issue, but if it's still of use to you, my train command is posted above ^.

ancorasir commented 4 years ago

@CraigWang1 Do you mean you can get a valid mAP on your train set?

ALagre commented 4 years ago

Hi @CraigWang1, I am still training. The mAP increases on TensorBoard, but it is far from the expected mAP (33.5).

CraigWang1 commented 4 years ago

@ancorasir No, the mAP is still invalid when evaluating on the train dataset.

@ALagre Are you training d0?

liminghuiv commented 4 years ago

@CraigWang1, just curious: since you're training on a custom dataset, why does your command above need to point to the COCO validation annotations? val_json_file=/coco_tfrecord//annotations/instances_val2017.json

HaoyuanPeng commented 4 years ago

I have trained successfully on my custom dataset with 2 classes on a single P40 GPU using efficientdet-d4. My training command is:

CUDA_VISIBLE_DEVICES=0 python main.py --training_file_pattern=train.tfrecord \
    --model_name=efficientdet-d4 \
    --model_dir=experiments/model_dir/ \
    --hparams="use_bfloat16=false,num_classes=2,var_exclude_expr=r'.*/class-predict/.*'" \
    --use_tpu=False --num_examples_per_epoch=3500 --num_epochs=2 \
    --ckpt=/root/data/checkpoints/efficientdet/efficientdet-d4 \
    --validation_file_pattern=valid.tfrecord --eval_after_training=True --train_batch_size=2

I use tensorflow 1.15.0, and I replaced all of the import tensorflow.compat.v1 as tf statements with import tensorflow as tf.

My data is labeled in VOC format and converted to tfrecord by this script: script, BUT I changed its lines 129~130 to 'image/source_id': dataset_util.bytes_feature("".encode('utf8')), since the efficientdet code will convert the source_id to an integer. (I'm not exactly running this script, but borrowing its dict_to_tf_example function.)
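
A minimal sketch of that tweak in context; the surrounding feature key and the dataset_util import path are assumptions based on the TF object-detection scripts, not verified here:

import tensorflow as tf
from object_detection.utils import dataset_util  # assumed import path of the referenced script

feature_dict = {
    'image/filename': dataset_util.bytes_feature('0001.jpg'.encode('utf8')),
    # The original script stores the VOC filename/id here; EfficientDet parses source_id
    # as an integer, so the change described above writes an empty string instead:
    'image/source_id': dataset_util.bytes_feature(''.encode('utf8')),
}
example = tf.train.Example(features=tf.train.Features(feature=feature_dict))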

After training for 2 epochs, I have got a reasonable mAP:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.783
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.975
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.889
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.832
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.822
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.836
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.840
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.840
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000

watertianyi commented 4 years ago

@HaoyuanPeng I converted my labelme JSON annotations to COCO format, and then used the following commands to convert them to tfrecord:

1. Training data to tfrecord:

PYTHONPATH=".:$PYTHONPATH" python dataset/create_coco_tfrecord.py \
    --image_dir=/data/internet/AI/dataset/label2coco_train/train \
    --object_annotations_file=/data/internet/AI/dataset/label2coco_train/train.json \
    --output_file_prefix=tfrecrod_gstrain/train \
    --num_shards=32

2020-04-01 11:20:46.690534: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-04-01 11:20:46.691404: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
I0401 11:20:47.144516 139998122432320 create_coco_tfrecord.py:288] writing to output path: tfrecrod_gstrain/train
I0401 11:20:48.073305 139998122432320 create_coco_tfrecord.py:218] Building bounding box index.
I0401 11:20:48.089581 139998122432320 create_coco_tfrecord.py:229] 0 images are missing bboxes.
I0401 11:20:48.181215 139998122432320 create_coco_tfrecord.py:326] On image 0 of 3997
...
I0401 11:21:17.593461 139998122432320 create_coco_tfrecord.py:326] On image 3900 of 3997
I0401 11:21:17.634522 139998122432320 create_coco_tfrecord.py:338] Finished writing, skipped 0 annotations.

2. Validation data to tfrecord:

PYTHONPATH=".:$PYTHONPATH" python dataset/create_coco_tfrecord.py \
    --image_dir=/data/internet/AI/dataset/label2coco_val/val \
    --object_annotations_file=/data/internet/AI/dataset/label2coco_val/val.json \
    --output_file_prefix=tfrecrod_gsval/val \
    --num_shards=32

I0401 10:07:09.788161 140295929337664 create_coco_tfrecord.py:218] Building bounding box index.
I0401 10:07:09.791546 140295929337664 create_coco_tfrecord.py:229] 0 images are missing bboxes.
I0401 10:07:09.817166 140295929337664 create_coco_tfrecord.py:326] On image 0 of 1000
...
I0401 10:07:10.196460 140295929337664 create_coco_tfrecord.py:326] On image 900 of 1000
I0401 10:07:10.319193 140295929337664 create_coco_tfrecord.py:338] Finished writing, skipped 0 annotations.

3. Then I trained with this command:

python main.py --training_file_pattern=tfrecrod_gstrain/train* \
    --model_name=efficientdet-d0 \
    --model_dir=ckpt/pzy_v1 \
    --backbone_ckpt=efficientnet-b0 \
    --num_epochs=15 \
    --hparams="use_bfloat16=false,num_classes=3" --use_tpu=False

The results are as follows:

2020-04-01 11:17:04.569717: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-04-01 11:17:04.570608: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
...
WARNING:tensorflow:From main.py:222: The name tf.estimator.tpu.TPUConfig is deprecated. Please use tf.compat.v1.estimator.tpu.TPUConfig instead.
...
I0401 11:17:05.490272 140447485675328 main.py:242] {'name': 'efficientdet-d1', 'image_size': 640, 'input_rand_hflip': True, 'train_scale_min': 0.1, 'train_scale_max': 2.0, 'autoaugment_policy': None, 'num_classes': 90, 'skip_crowd_during_training': True, 'min_level': 3, 'max_level': 7, 'num_scales': 3, 'aspect_ratios': [(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)], 'anchor_scale': 4.0, 'is_training_bn': True, 'momentum': 0.9, 'learning_rate': 0.001, 'lr_warmup_init': 0.0001, 'lr_warmup_epoch': 1.0, 'first_lr_drop_epoch': 200.0, 'second_lr_drop_epoch': 250.0, 'clip_gradients_norm': 10.0, 'num_epochs': 15, 'alpha': 0.25, 'gamma': 1.5, 'delta': 0.1, 'box_loss_weight': 50.0, 'weight_decay': 4e-05, 'use_bfloat16': True, 'box_class_repeats': 3, 'fpn_cell_repeats': 4, 'fpn_num_filters': 88, 'separable_conv': True, 'apply_bn_for_resampling': True, 'conv_after_downsample': False, 'conv_bn_relu_pattern': False, 'use_native_resize_op': False, 'pooling_type': None, 'fpn_name': None, 'fpn_config': None, 'survival_prob': None, 'lr_decay_method': 'cosine', 'moving_average_decay': 0.9998, 'ckpt_var_scope': None, 'var_exclude_expr': None, 'backbone_name': 'efficientnet-b1', 'backbone_config': None, 'resnet_depth': 50, 'model_name': 'efficientdet-d1', 'iterations_per_loop': 100, 'model_dir': None, 'num_shards': 8, 'num_examples_per_epoch': 3997, 'use_tpu': False, 'backbone_ckpt': '', 'ckpt': None, 'val_json_file': None, 'testdev_dir': None, 'mode': 'train'}
WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmprznylgcq
...
2020-04-01 11:17:05.929995: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: pciBusID: 0000:01:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5 coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2020-04-01 11:17:05.933390: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
...
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Running train on CPU
...
I0401 11:17:15.446661 140447485675328 efficientdet_arch.py:553] backbone params/flops = 6.101024M, 75.647456291B
...
Instructions for updating: Use tf.keras.layers.Conv2D instead. WARNING:tensorflow:From /home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_core/python/layers/convolutional.py:424: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version. Instructions for updating: Please use layer.__call__ method instead. W0401 11:17:15.447479 140447485675328 deprecation.py:323] From /home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_core/python/layers/convolutional.py:424: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version. Instructions for updating: Please use layer.__call__ method instead. I0401 11:17:15.488196 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 WARNING:tensorflow:From /data/internet/AI/0329/automl/efficientdet/efficientdet_arch.py:121: max_pooling2d (from tensorflow.python.layers.pooling) is deprecated and will be removed in a future version. Instructions for updating: Use keras.layers.MaxPooling2D instead. W0401 11:17:15.494092 140447485675328 deprecation.py:323] From /data/internet/AI/0329/automl/efficientdet/efficientdet_arch.py:121: max_pooling2d (from tensorflow.python.layers.pooling) is deprecated and will be removed in a future version. Instructions for updating: Use keras.layers.MaxPooling2D instead. I0401 11:17:15.496819 140447485675328 efficientdet_arch.py:382] building cell 0 I0401 11:17:15.497030 140447485675328 efficientdet_arch.py:462] fnode 0 : {'width_ratio': 0.015625, 'inputs_offsets': [3, 4]} WARNING:tensorflow:From /data/internet/AI/0329/automl/efficientdet/efficientdet_arch.py:516: separable_conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras.layers.SeparableConv2D instead. W0401 11:17:15.505959 140447485675328 deprecation.py:323] From /data/internet/AI/0329/automl/efficientdet/efficientdet_arch.py:516: separable_conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras.layers.SeparableConv2D instead. 
I0401 11:17:15.532225 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:15.538249 140447485675328 efficientdet_arch.py:462] fnode 1 : {'width_ratio': 0.03125, 'inputs_offsets': [2, 5]} I0401 11:17:15.558930 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:15.598629 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:15.604592 140447485675328 efficientdet_arch.py:462] fnode 2 : {'width_ratio': 0.0625, 'inputs_offsets': [1, 6]} I0401 11:17:15.628824 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:15.669443 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:15.675552 140447485675328 efficientdet_arch.py:462] fnode 3 : {'width_ratio': 0.125, 'inputs_offsets': [0, 7]} I0401 11:17:15.697623 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:15.737598 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:15.744125 140447485675328 efficientdet_arch.py:462] fnode 4 : {'width_ratio': 0.0625, 'inputs_offsets': [1, 7, 8]} I0401 11:17:15.807749 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:15.850601 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:15.856630 140447485675328 efficientdet_arch.py:462] fnode 5 : {'width_ratio': 0.03125, 'inputs_offsets': [2, 6, 9]} I0401 11:17:15.877015 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:15.920547 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:15.926713 140447485675328 efficientdet_arch.py:462] fnode 6 : {'width_ratio': 0.015625, 'inputs_offsets': [3, 5, 10]} I0401 11:17:15.963577 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:15.969686 140447485675328 efficientdet_arch.py:462] fnode 7 : {'width_ratio': 0.0078125, 'inputs_offsets': [4, 11]} I0401 11:17:16.004859 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.012500 140447485675328 efficientdet_arch.py:382] building cell 1 I0401 11:17:16.012775 140447485675328 efficientdet_arch.py:462] fnode 0 : {'width_ratio': 0.015625, 'inputs_offsets': [3, 4]} I0401 11:17:16.047744 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.053829 140447485675328 efficientdet_arch.py:462] fnode 1 : {'width_ratio': 0.03125, 'inputs_offsets': [2, 5]} I0401 11:17:16.087985 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.094059 140447485675328 efficientdet_arch.py:462] fnode 2 : {'width_ratio': 0.0625, 'inputs_offsets': [1, 6]} I0401 11:17:16.128473 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.134548 140447485675328 efficientdet_arch.py:462] fnode 3 : {'width_ratio': 0.125, 'inputs_offsets': [0, 7]} I0401 11:17:16.168931 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.174973 140447485675328 efficientdet_arch.py:462] fnode 4 : {'width_ratio': 0.0625, 'inputs_offsets': [1, 7, 8]} I0401 11:17:16.211962 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.218474 140447485675328 efficientdet_arch.py:462] fnode 5 : {'width_ratio': 0.03125, 'inputs_offsets': [2, 6, 9]} 
I0401 11:17:16.255624 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.261641 140447485675328 efficientdet_arch.py:462] fnode 6 : {'width_ratio': 0.015625, 'inputs_offsets': [3, 5, 10]} I0401 11:17:16.298952 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.305600 140447485675328 efficientdet_arch.py:462] fnode 7 : {'width_ratio': 0.0078125, 'inputs_offsets': [4, 11]} I0401 11:17:16.339295 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.345551 140447485675328 efficientdet_arch.py:382] building cell 2 I0401 11:17:16.345786 140447485675328 efficientdet_arch.py:462] fnode 0 : {'width_ratio': 0.015625, 'inputs_offsets': [3, 4]} I0401 11:17:16.380450 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.386518 140447485675328 efficientdet_arch.py:462] fnode 1 : {'width_ratio': 0.03125, 'inputs_offsets': [2, 5]} I0401 11:17:16.421286 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.427789 140447485675328 efficientdet_arch.py:462] fnode 2 : {'width_ratio': 0.0625, 'inputs_offsets': [1, 6]} I0401 11:17:16.462440 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.468494 140447485675328 efficientdet_arch.py:462] fnode 3 : {'width_ratio': 0.125, 'inputs_offsets': [0, 7]} I0401 11:17:16.503071 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.509163 140447485675328 efficientdet_arch.py:462] fnode 4 : {'width_ratio': 0.0625, 'inputs_offsets': [1, 7, 8]} I0401 11:17:16.546271 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.552673 140447485675328 efficientdet_arch.py:462] fnode 5 : {'width_ratio': 0.03125, 'inputs_offsets': [2, 6, 9]} I0401 11:17:16.591290 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.597355 140447485675328 efficientdet_arch.py:462] fnode 6 : {'width_ratio': 0.015625, 'inputs_offsets': [3, 5, 10]} I0401 11:17:16.634736 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.641310 140447485675328 efficientdet_arch.py:462] fnode 7 : {'width_ratio': 0.0078125, 'inputs_offsets': [4, 11]} I0401 11:17:16.677104 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.683398 140447485675328 efficientdet_arch.py:382] building cell 3 I0401 11:17:16.683621 140447485675328 efficientdet_arch.py:462] fnode 0 : {'width_ratio': 0.015625, 'inputs_offsets': [3, 4]} I0401 11:17:16.718342 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.724524 140447485675328 efficientdet_arch.py:462] fnode 1 : {'width_ratio': 0.03125, 'inputs_offsets': [2, 5]} I0401 11:17:16.760164 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.766257 140447485675328 efficientdet_arch.py:462] fnode 2 : {'width_ratio': 0.0625, 'inputs_offsets': [1, 6]} I0401 11:17:16.801625 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.807684 140447485675328 efficientdet_arch.py:462] fnode 3 : {'width_ratio': 0.125, 'inputs_offsets': [0, 7]} I0401 11:17:16.843065 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.849432 140447485675328 efficientdet_arch.py:462] fnode 4 : {'width_ratio': 0.0625, 
'inputs_offsets': [1, 7, 8]} I0401 11:17:16.887324 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.893537 140447485675328 efficientdet_arch.py:462] fnode 5 : {'width_ratio': 0.03125, 'inputs_offsets': [2, 6, 9]} I0401 11:17:16.932371 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.938457 140447485675328 efficientdet_arch.py:462] fnode 6 : {'width_ratio': 0.015625, 'inputs_offsets': [3, 5, 10]} I0401 11:17:16.976171 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:16.982384 140447485675328 efficientdet_arch.py:462] fnode 7 : {'width_ratio': 0.0078125, 'inputs_offsets': [4, 11]} I0401 11:17:17.017593 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 Parsing Inputs... I0401 11:17:17.790601 140447485675328 efficientdet_arch.py:558] backbone+fpn params/flops = 6.491996M, 83.359466117B I0401 11:17:17.816360 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:17.850417 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:17.883690 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:17.922672 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:17.948365 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:17.974542 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.006680 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.032862 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.058590 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.090218 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.116093 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.142098 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.173929 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.199928 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.225402 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.266666 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.299158 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.332480 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.371457 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.397468 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.423716 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.455471 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.482109 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.508380 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.540969 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.566870 140447485675328 
utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.592897 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.624942 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.651163 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 I0401 11:17:18.677570 140447485675328 utils.py:194] TpuBatchNormalization with num_shards_per_group 1 Parsing Inputs... I0401 11:17:19.618858 140447485675328 efficientdet_arch.py:563] backbone+fpn+box params/flops = 6.625898M, 101.026141253B I0401 11:17:19.621832 140447485675328 det_model_fn.py:95] LR schedule method: cosine I0401 11:17:19.774527 140447485675328 utils.py:319] Adding summary ('lrn_rate', <tf.Tensor 'Select:0' shape=() dtype=float32>) I0401 11:17:19.775084 140447485675328 utils.py:319] Adding summary ('trainloss/cls_loss', <tf.Tensor 'AddN:0' shape=() dtype=float32>) I0401 11:17:19.775584 140447485675328 utils.py:319] Adding summary ('trainloss/box_loss', <tf.Tensor 'AddN_1:0' shape=() dtype=float32>) I0401 11:17:19.776080 140447485675328 utils.py:319] Adding summary ('trainloss/det_loss', <tf.Tensor 'add_3:0' shape=() dtype=float32>) I0401 11:17:19.776625 140447485675328 utils.py:319] Adding summary ('trainloss/l2_loss', <tf.Tensor 'mul_14:0' shape=() dtype=float32>) I0401 11:17:19.777127 140447485675328 utils.py:319] Adding summary ('trainloss/loss', <tf.Tensor 'add_4:0' shape=() dtype=float32>) I0401 11:17:19.778305 140447485675328 det_model_fn.py:463] clip gradients norm by 10.000000 I0401 11:17:26.399553 140447485675328 utils.py:319] Adding summary ('gnorm', <tf.Tensor 'clip/global_norm/global_norm:0' shape=() dtype=float32>) WARNING:tensorflow:From /data/internet/AI/0329/automl/efficientdet/det_model_fn.py:581: The name tf.estimator.tpu.TPUEstimatorSpec is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimatorSpec instead.

W0401 11:17:36.114968 140447485675328 module_wrapper.py:138] From /data/internet/AI/0329/automl/efficientdet/det_model_fn.py:581: The name tf.estimator.tpu.TPUEstimatorSpec is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimatorSpec instead.

I0401 11:17:36.115184 140447485675328 utils.py:325] get summaries [('lrn_rate', <tf.Tensor 'Mean:0' shape=() dtype=float32>), ('trainloss/cls_loss', <tf.Tensor 'Mean_1:0' shape=() dtype=float32>), ('trainloss/box_loss', <tf.Tensor 'Mean_2:0' shape=() dtype=float32>), ('trainloss/det_loss', <tf.Tensor 'Mean_3:0' shape=() dtype=float32>), ('trainloss/l2_loss', <tf.Tensor 'Mean_4:0' shape=() dtype=float32>), ('trainloss/loss', <tf.Tensor 'Mean_5:0' shape=() dtype=float32>), ('gnorm', <tf.Tensor 'clip/Mean:0' shape=() dtype=float32>)] INFO:tensorflow:training_loop marked as finished I0401 11:17:36.120511 140447485675328 error_handling.py:108] training_loop marked as finished WARNING:tensorflow:Reraising captured error W0401 11:17:36.120587 140447485675328 error_handling.py:142] Reraising captured error Traceback (most recent call last): File "/home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3032, in train rendezvous.record_error('training_loop', sys.exc_info()) File "/home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/error_handling.py", line 81, in record_error if value and value.op and value.op.type == _CHECK_NUMERIC_OP_NAME: AttributeError: 'ValueError' object has no attribute 'op'

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "main.py", line 390, in tf.app.run(main) File "/home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_core/python/platform/app.py", line 40, in run _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef) File "/home/hjq/.local/lib/python3.6/site-packages/absl/app.py", line 299, in run _run_main(main, args) File "/home/hjq/.local/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main sys.exit(main(argv)) File "main.py", line 255, in main FLAGS.train_batch_size)) File "/home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3035, in train rendezvous.raise_errors() File "/home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/error_handling.py", line 143, in raise_errors six.reraise(typ, value, traceback) File "/home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/six.py", line 703, in reraise raise value File "/home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3030, in train saving_listeners=saving_listeners) File "/home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 374, in train loss = self._train_model(input_fn, hooks, saving_listeners) File "/home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1164, in _train_model return self._train_model_default(input_fn, hooks, saving_listeners) File "/home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1194, in _train_model_default features, labels, ModeKeys.TRAIN, self.config) File "/home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 2857, in _call_model_fn config) File "/home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1152, in _call_model_fn model_fn_results = self._model_fn(features=features, *kwargs) File "/home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3126, in _model_fn features, labels, is_export_mode=is_export_mode) File "/home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 1663, in call_without_tpu return self._call_model_fn(features, labels, is_export_mode=is_export_mode) File "/home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 2005, in _call_model_fn return estimator_spec.as_estimator_spec() File "/home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 379, in as_estimator_spec host_call_ret = _OutfeedHostCall.create_cpu_hostcall(host_calls) File "/home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 2076, in create_cpu_hostcall ret[name] = host_fn(tensors) File "/data/internet/AI/0329/automl/efficientdet/utils.py", line 343, in host_call_fn model_dir, max_queue=iterations_per_loop).as_default(): File 
"/home/hjq/anaconda3/envs/efficientDet/lib/python3.6/site-packages/tensorflow_core/python/ops/summary_ops_v2.py", line 371, in create_file_writer_v2 raise ValueError("logdir cannot be None") ValueError: logdir cannot be None

I set my own number of classes and got this error. How can I correct it? My command:

python main.py --training_file_pattern=tfrecrod_gstrain/train* --model_name=efficientdet-d0 --model_dir=ckpt/gs --ckpt=efficientdet-d0 --num_epochs=15 --hparams="use_bfloat16=false,num_classes=3" --use_tpu=False

ValueError: Shape of variable class_net/class-predict/bias:0 ((27,)) doesn't match with shape of tensor class_net/class-predict/bias ([810]) from checkpoint reader.

liminghuiv commented 4 years ago

I have trained successfully on my custom dataset with 2 classes on a single P40 GPU using efficientdet-d4. My training command is:

CUDA_VISIBLE_DEVICES=0 python main.py --training_file_pattern=train.tfrecord --model_name=efficientdet-d4 --model_dir=experiments/model_dir/ --hparams="use_bfloat16=false,num_classes=2,var_exclude_expr=r'.*/class-predict/.*'" --use_tpu=False --num_examples_per_epoch=3500 --num_epochs=2 --ckpt=/root/data/checkpoints/efficientdet/efficientdet-d4 --validation_file_pattern=valid.tfrecord --eval_after_training=True --train_batch_size=2

I use tensorflow=1.15.0, and I replaced all the import tensorflow.compat.v1 as tf with import tensorflow as tf.

My data is labeled in VOC format and converted to tfrecord by this script, but note that I changed its lines 129-130 to 'image/source_id': dataset_util.bytes_feature("".encode('utf8')), since the efficientdet code converts the source_id to an integer.
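
For reference, that change amounts to something like the following inside the feature dict built by the script's dict_to_tf_example (a sketch only; the surrounding lines and exact line numbers vary between versions of the script):

      # original (presumably): write the raw file name as source_id, which
      # efficientdet later tries to cast to an integer
      # 'image/source_id': dataset_util.bytes_feature(
      #     data['filename'].encode('utf8')),
      # changed: write a string that parses cleanly as an integer (empty here;
      # CraigWang1's script further below writes str(image_id) instead)
      'image/source_id': dataset_util.bytes_feature("".encode('utf8')),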

After training for 2 epochs, I got a reasonable mAP:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.783
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.975
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.889
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.832
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.822
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.836
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.840
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.840
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000

@HaoyuanPeng, I also tried @CraigWang1's data ("Here is my dataset of about 1900 512x380 images that I'm training on with one 'bin' class (586 MB)") and got the exact same issue. Can you share your training/val tfrecords so we can have a try?

HaoyuanPeng commented 4 years ago

@honghande You should add var_exclude_expr=r'.*/class-predict/.*' to the hparams in your training command. The pre-trained model was trained with 90 classes while your custom data has only 3, so the shapes of the class-prediction layers differ and those variables should be excluded when loading the pre-trained checkpoint.
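
For example, with 3 classes the flag would look like this (same form as the training command earlier in this thread):

--hparams="use_bfloat16=false,num_classes=3,var_exclude_expr=r'.*/class-predict/.*'"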

cyx6666 commented 4 years ago

@honghande You should add var_exclude_expr=r'.*/class-predict/.* in the hparams in your running command. Because the pre-trained model is trained with 90 classes, and your custom data has only 3 classes. The shapes of the layers for predicting the class id are different, and they should be excluded during loading the pre-trained model.

Have you tried running this command in tutorial.ipynb to predict new images?

import inference
tf.reset_default_graph()
driver = inference.InferenceDriver(MODEL, ckpt_path, image_size=image_size)
driver.inference(img_path, img_out_dir, min_score_thresh=min_score_thresh, max_boxes_to_draw=max_boxes_to_draw, line_thickness=line_thickness)

Everything works fine when I run

!python main.py --training_file_pattern=pascal_train.record --model_name=efficientdet-d4 --model_dir=trained_model --hparams="use_bfloat16=false,num_classes=2,var_exclude_expr=r'.*/class-predict/.*'" --use_tpu=False --num_examples_per_epoch=300 --num_epochs=1 --ckpt=efficientdet-d4 --validation_file_pattern=pascal_val.record --eval_after_training=True --train_batch_size=1 --logtostderr

but when I use the trained model to predict a picture, I get:

InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Input to reshape is a tensor with 373248 values, but the requested shape requires a multiple of 90
[[node Reshape (defined at /content/drive/My Drive/automl/efficientdet/det_model_fn.py:293) ]]
[[strided_slice_8/_2717]]
(1) Invalid argument: Input to reshape is a tensor with 373248 values, but the requested shape requires a multiple of 90
[[node Reshape (defined at /content/drive/My Drive/automl/efficientdet/det_model_fn.py:293) ]]
0 successful operations. 0 derived errors ignored.

HaoyuanPeng commented 4 years ago

@honghande You should add var_exclude_expr=r'.*/class-predict/.* in the hparams in your running command. Because the pre-trained model is trained with 90 classes, and your custom data has only 3 classes. The shapes of the layers for predicting the class id are different, and they should be excluded during loading the pre-trained model.

Have you tried running this command in tutorial.ipynb to predict new images? import inference tf.reset_default_graph() driver = inference.InferenceDriver(MODEL, ckpt_path, image_size=image_size) driver.inference(img_path, img_out_dir, min_score_thresh=min_score_thresh, max_boxes_to_draw=max_boxes_to_draw, line_thickness=line_thickness)

Everything works fine when i run !python main.py --training_file_pattern=pascal_train.record --model_name=efficientdet-d4 --model_dir=trained_model --hparams="use_bfloat16=false,num_classes=2,var_exclude_expr=r'.*/class-predict/.*'" --use_tpu=False --num_examples_per_epoch=300 --num_epochs=1 --ckpt=efficientdet-d4 --validation_file_pattern=pascal_val.record --eval_after_training=True --train_batch_size=1 --logtostderr but when I used the trained model to predict the picture,it appeared InvalidArgumentError: 2 root error(s) found. (0) Invalid argument: Input to reshape is a tensor with 373248 values, but the requested shape requires a multiple of 90 [[node Reshape (defined at /content/drive/My Drive/automl/efficientdet/det_model_fn.py:293) ]] [[strided_slice_8/_2717]] (1) Invalid argument: Input to reshape is a tensor with 373248 values, but the requested shape requires a multiple of 90 [[node Reshape (defined at /content/drive/My Drive/automl/efficientdet/det_model_fn.py:293) ]] 0 successful operations. 0 derived errors ignored.

You cannot use inference.InferenceDriver directly, because its num_classes is fixed. You should modify the inference driver to put your actual num_classes into self.params, or there will be an error when loading the model.

watertianyi commented 4 years ago

@honghande You should add var_exclude_expr=r'.*/class-predict/.* in the hparams in your running command. Because the pre-trained model is trained with 90 classes, and your custom data has only 3 classes. The shapes of the layers for predicting the class id are different, and they should be excluded during loading the pre-trained model.

Have you tried running this command in tutorial.ipynb to predict new images? import inference tf.reset_default_graph() driver = inference.InferenceDriver(MODEL, ckpt_path, image_size=image_size) driver.inference(img_path, img_out_dir, min_score_thresh=min_score_thresh, max_boxes_to_draw=max_boxes_to_draw, line_thickness=line_thickness)

Everything works fine when i run !python main.py --training_file_pattern=pascal_train.record --model_name=efficientdet-d4 --model_dir=trained_model --hparams="use_bfloat16=false,num_classes=2,var_exclude_expr=r'.*/class-predict/.*'" --use_tpu=False --num_examples_per_epoch=300 --num_epochs=1 --ckpt=efficientdet-d4 --validation_file_pattern=pascal_val.record --eval_after_training=True --train_batch_size=1 --logtostderr but when I used the trained model to predict the picture,it appeared InvalidArgumentError: 2 root error(s) found. (0) Invalid argument: Input to reshape is a tensor with 373248 values, but the requested shape requires a multiple of 90 [[node Reshape (defined at /content/drive/My Drive/automl/efficientdet/det_model_fn.py:293) ]] [[strided_slice_8/_2717]] (1) Invalid argument: Input to reshape is a tensor with 373248 values, but the requested shape requires a multiple of 90 [[node Reshape (defined at /content/drive/My Drive/automl/efficientdet/det_model_fn.py:293) ]] 0 successful operations. 0 derived errors ignored.

Yes, I ran inference on my own image data and the result is the same as yours. I don't know how to solve it yet.

cyx6666 commented 4 years ago

@honghande You should add var_exclude_expr=r'.*/class-predict/.* in the hparams in your running command. Because the pre-trained model is trained with 90 classes, and your custom data has only 3 classes. The shapes of the layers for predicting the class id are different, and they should be excluded during loading the pre-trained model.

Have you tried running this command in tutorial.ipynb to predict new images? import inference tf.reset_default_graph() driver = inference.InferenceDriver(MODEL, ckpt_path, image_size=image_size) driver.inference(img_path, img_out_dir, min_score_thresh=min_score_thresh, max_boxes_to_draw=max_boxes_to_draw, line_thickness=line_thickness) Everything works fine when i run !python main.py --training_file_pattern=pascal_train.record --model_name=efficientdet-d4 --model_dir=trained_model --hparams="use_bfloat16=false,num_classes=2,var_exclude_expr=r'.*/class-predict/.*'" --use_tpu=False --num_examples_per_epoch=300 --num_epochs=1 --ckpt=efficientdet-d4 --validation_file_pattern=pascal_val.record --eval_after_training=True --train_batch_size=1 --logtostderr but when I used the trained model to predict the picture,it appeared InvalidArgumentError: 2 root error(s) found. (0) Invalid argument: Input to reshape is a tensor with 373248 values, but the requested shape requires a multiple of 90 [[node Reshape (defined at /content/drive/My Drive/automl/efficientdet/det_model_fn.py:293) ]] [[strided_slice_8/_2717]] (1) Invalid argument: Input to reshape is a tensor with 373248 values, but the requested shape requires a multiple of 90 [[node Reshape (defined at /content/drive/My Drive/automl/efficientdet/det_model_fn.py:293) ]] 0 successful operations. 0 derived errors ignored.

You cannot directly use the inferece.InferenceDriver,because the num_classes is fixed. You should modify the code in the inference driver to add the actual num_classes in self.params, or there will be an error during loading the model.

Hmm, how do I add num_classes to self.params? I tried modifying num_classes in hparams_config.py, but it didn't work.

HaoyuanPeng commented 4 years ago

@honghande You should add var_exclude_expr=r'.*/class-predict/.* in the hparams in your running command. Because the pre-trained model is trained with 90 classes, and your custom data has only 3 classes. The shapes of the layers for predicting the class id are different, and they should be excluded during loading the pre-trained model.

Have you tried running this command in tutorial.ipynb to predict new images? import inference tf.reset_default_graph() driver = inference.InferenceDriver(MODEL, ckpt_path, image_size=image_size) driver.inference(img_path, img_out_dir, min_score_thresh=min_score_thresh, max_boxes_to_draw=max_boxes_to_draw, line_thickness=line_thickness) Everything works fine when i run !python main.py --training_file_pattern=pascal_train.record --model_name=efficientdet-d4 --model_dir=trained_model --hparams="use_bfloat16=false,num_classes=2,var_exclude_expr=r'.*/class-predict/.*'" --use_tpu=False --num_examples_per_epoch=300 --num_epochs=1 --ckpt=efficientdet-d4 --validation_file_pattern=pascal_val.record --eval_after_training=True --train_batch_size=1 --logtostderr but when I used the trained model to predict the picture,it appeared InvalidArgumentError: 2 root error(s) found. (0) Invalid argument: Input to reshape is a tensor with 373248 values, but the requested shape requires a multiple of 90 [[node Reshape (defined at /content/drive/My Drive/automl/efficientdet/det_model_fn.py:293) ]] [[strided_slice_8/_2717]] (1) Invalid argument: Input to reshape is a tensor with 373248 values, but the requested shape requires a multiple of 90 [[node Reshape (defined at /content/drive/My Drive/automl/efficientdet/det_model_fn.py:293) ]] 0 successful operations. 0 derived errors ignored.

You cannot directly use the inferece.InferenceDriver,because the num_classes is fixed. You should modify the code in the inference driver to add the actual num_classes in self.params, or there will be an error during loading the model.

Hmm,how to add the num_classes in self.params? I tried to modify the num_classes in hparams_config.py,but it didn't work.

Modify the __init__ function in inference.py to add a num_classes parameter, then call self.params.update(dict(num_classes=num_classes)).
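
A rough sketch of that change (the surrounding constructor code is only approximate and may not match the current inference.py, so treat it as illustrative, not the exact diff):

import hparams_config

class InferenceDriver(object):
  def __init__(self, model_name, ckpt_path, image_size=None, num_classes=None):
    self.model_name = model_name
    self.ckpt_path = ckpt_path
    # Start from the default detection config for this model...
    self.params = hparams_config.get_detection_config(model_name).as_dict()
    self.params.update(dict(is_training_bn=False, use_bfloat16=False))
    if image_size:
      self.params.update(dict(image_size=image_size))
    if num_classes:
      # ...then override the COCO default so the class-predict head is built
      # with the shape your fine-tuned checkpoint actually has.
      self.params.update(dict(num_classes=num_classes))

# e.g. driver = InferenceDriver('efficientdet-d4', 'trained_model/', num_classes=2)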

cyx6666 commented 4 years ago

@honghande You should add var_exclude_expr=r'.*/class-predict/.* in the hparams in your running command. Because the pre-trained model is trained with 90 classes, and your custom data has only 3 classes. The shapes of the layers for predicting the class id are different, and they should be excluded during loading the pre-trained model.

Have you tried running this command in tutorial.ipynb to predict new images? import inference tf.reset_default_graph() driver = inference.InferenceDriver(MODEL, ckpt_path, image_size=image_size) driver.inference(img_path, img_out_dir, min_score_thresh=min_score_thresh, max_boxes_to_draw=max_boxes_to_draw, line_thickness=line_thickness) Everything works fine when i run !python main.py --training_file_pattern=pascal_train.record --model_name=efficientdet-d4 --model_dir=trained_model --hparams="use_bfloat16=false,num_classes=2,var_exclude_expr=r'.*/class-predict/.*'" --use_tpu=False --num_examples_per_epoch=300 --num_epochs=1 --ckpt=efficientdet-d4 --validation_file_pattern=pascal_val.record --eval_after_training=True --train_batch_size=1 --logtostderr but when I used the trained model to predict the picture,it appeared InvalidArgumentError: 2 root error(s) found. (0) Invalid argument: Input to reshape is a tensor with 373248 values, but the requested shape requires a multiple of 90 [[node Reshape (defined at /content/drive/My Drive/automl/efficientdet/det_model_fn.py:293) ]] [[strided_slice_8/_2717]] (1) Invalid argument: Input to reshape is a tensor with 373248 values, but the requested shape requires a multiple of 90 [[node Reshape (defined at /content/drive/My Drive/automl/efficientdet/det_model_fn.py:293) ]] 0 successful operations. 0 derived errors ignored.

You cannot directly use the inferece.InferenceDriver,because the num_classes is fixed. You should modify the code in the inference driver to add the actual num_classes in self.params, or there will be an error during loading the model.

Hmm,how to add the num_classes in self.params? I tried to modify the num_classes in hparams_config.py,but it didn't work.

modify the __init__ function in inference.py, add a parameter num_classes, then self.params.update(dict(num_classes=num_classes))

OK, it finally works! Thanks for your suggestion!

qtw1998 commented 4 years ago

I have trained successfully on my custom dataset with 2 classes on a sinigle P40 GPU using efficientdet-d4. My training command is:

CUDA_VISIBLE_DEVICES=0 python main.py --training_file_pattern=train.tfrecord --model_name=efficientdet-d4 --model_dir=experiments/model_dir/ --hparams="use_bfloat16=false,num_classes=2,var_exclude_expr=r'.*/class-predict/.*'" --use_tpu=False --num_examples_per_epoch=3500 --num_epochs=2 --ckpt=/root/data/checkpoints/efficientdet/efficientdet-d4 --validation_file_pattern=valid.tfrecord --eval_after_training=True --train_batch_size=2

I use tensorflow=1.15.0, and I replaced all the import tensorflow.compat.v1 as tf with import tensorflow as tf.

My data is labeled as VOC format, and converted to tfrecord by this script: script, BUT I changed its line 129~130 to: 'image/source_id': dataset_util.bytes_feature("".encode('utf8')),, since the efficientdet code will convert the source_id to integer. (Not exactly running this script, but borrowing its dict_to_tf_example function.)

After training for 2 epochs, I have got a reasonable mAP. Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.783 Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.975 Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.889 Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.832 Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000 Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.822 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.836 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.840 Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.840 Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000 Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000

Hi! Could you help me figure out why I always get ZERO for all the metrics, even when I just eval the pretrained ckpt? Here is my command; I converted my XML annotations to COCO tfrecords and json with create_pascal_tfrecord.py:

CUDA_VISIBLE_DEVICES=2 python main.py --mode=eval --model_name=efficientdet-d1 --model_dir=efficientdet-d1/ --validation_file_pattern=/cluster/home/qiaotianwei/data/bdd100k2coco/val_all/*.tfrecord --val_json_file=/cluster/home/qiaotianwei/data/bdd100k2coco/val_all/json_val.json --hparams="use_bfloat16=false, num_classes=90" --use_tpu=False

Hoping for your help!

mingxingtan commented 4 years ago

After a few recent changes (especially https://github.com/google/automl/commit/6048346978639ed982ca1193f4b36366cd5ea2a8 and https://github.com/google/automl/commit/39bbf8d273eef1b5ec24e5ecaae907f1435313fc), there should be no mAP=0 issue.

I am going to close this issue, but if there are still issues, feel free to reopen it.

ancorasir commented 4 years ago

@CraigWang1 Hi, just wondering: have you successfully trained on your dataset and gotten reasonable inference results?

CraigWang1 commented 4 years ago

@ancorasir Hi! No, I have switched over to a different implementation that is easier for me to work with.

fitoule commented 4 years ago

With the new "no json mode" that reads values directly from the tfrecord, I still hit this bug: AP is still 1.

CraigWang1 commented 4 years ago

So I know it's been a while, but I finally came back and fixed the issue! Thanks to everyone who helped and discussed this issue including @HaoyuanPeng and @mingxingtan; this post is for future users who meet the same problem.

It turns out I was generating tfrecords incorrectly before; below is the way that worked for me.

My raw data is in Pascal VOC format, labelled with labelImg.

I changed directory into automl/efficientdet, and I used this code to generate tf records (slightly edited from this repository's create_pascal_tfrecord.py).

# TF RECORD

r"""Convert PASCAL dataset to TFRecord.
Example usage:
    python create_pascal_tfrecord.py  --data_dir=/tmp/VOCdevkit  \
        --year=VOC2012  --output_path=/tmp/pascal
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import hashlib
import io
import json
import logging
import os

from lxml import etree
import PIL.Image
import tensorflow.compat.v1 as tf

from dataset import tfrecord_util

# flags = tf.app.flags
# flags.DEFINE_string('data_dir', '', 'Root directory to raw PASCAL VOC dataset.')
# flags.DEFINE_string('set', 'train', 'Convert training set, validation set or '
#                     'merged set.')
# flags.DEFINE_string('annotations_dir', 'Annotations',
#                     '(Relative) path to annotations directory.')
# flags.DEFINE_string('year', 'VOC2007', 'Desired challenge year.')
# flags.DEFINE_string('output_path', '', 'Path to output TFRecord and json.')
# flags.DEFINE_string('label_map_json_path', None,
#                     'Path to label map json file with a dictionary.')
# flags.DEFINE_boolean('ignore_difficult_instances', False, 'Whether to ignore '
#                      'difficult instances')
# flags.DEFINE_integer('num_shards', 100, 'Number of shards for output file.')
# flags.DEFINE_integer('num_images', None, 'Max number of imags to process.')
# FLAGS = flags.FLAGS

GLOBAL_IMG_ID = 0  # global image id.
GLOBAL_ANN_ID = 0  # global annotation id.

def get_image_id(filename):
  """Convert a string to a integer."""
  # Warning: this function is highly specific to pascal filename!!
  # Given filename like '2008_000002', we cannot use id 2008000002 because our
  # code internally will convert the int value to float32 and back to int, which
  # would cause value mismatch int(float32(2008000002)) != int(2008000002).
  # COCO needs int values, here we just use a incremental global_id, but
  # users should customize their own ways to generate filename.
  img_id = int(os.path.splitext(os.path.basename(filename))[0])   #eg. '/home/craig/0.png' -> int(0)
  return img_id

def get_ann_id():
  """Return unique annotation id across images."""
  global GLOBAL_ANN_ID
  GLOBAL_ANN_ID += 1
  return GLOBAL_ANN_ID

def dict_to_tf_example(data,
                       dataset_directory,
                       label_map_dict,
                       ignore_difficult_instances=False,
                       image_subdirectory='JPEGImages',
                       ann_json_dict=None):
  """Convert XML derived dict to tf.Example proto.
  Notice that this function normalizes the bounding box coordinates provided
  by the raw data.
  Args:
    data: dict holding PASCAL XML fields for a single image (obtained by
      running tfrecord_util.recursive_parse_xml_to_dict)
    dataset_directory: Path to root directory holding PASCAL dataset
    label_map_dict: A map from string label names to integers ids.
    ignore_difficult_instances: Whether to skip difficult instances in the
      dataset  (default: False).
    image_subdirectory: String specifying subdirectory within the
      PASCAL dataset directory holding the actual image data.
    ann_json_dict: annotation json dictionary.
  Returns:
    example: The converted tf.Example.
  Raises:
    ValueError: if the image pointed to by data['filename'] is not a valid JPEG
  """
  #img_path = os.path.join(data['folder'], image_subdirectory, data['filename'])
  img_path = data['path']
  full_path = os.path.join(dataset_directory, img_path)
  with tf.gfile.GFile(full_path, 'rb') as fid:
    encoded_jpg = fid.read()
  encoded_jpg_io = io.BytesIO(encoded_jpg)
  image = PIL.Image.open(encoded_jpg_io)
  # if image.format != 'JPEG':
  #   raise ValueError('Image format not JPEG')
  key = hashlib.sha256(encoded_jpg).hexdigest()

  width = int(data['size']['width'])
  height = int(data['size']['height'])
  image_id = get_image_id(data['filename'])
  if ann_json_dict:
    image = {
        'file_name': data['filename'],
        'height': height,
        'width': width,
        'id': image_id,
    }
    ann_json_dict['images'].append(image)

  xmin = []
  ymin = []
  xmax = []
  ymax = []
  classes = []
  classes_text = []
  truncated = []
  poses = []
  difficult_obj = []
  if 'object' in data:
    for obj in data['object']:
      difficult = bool(int(obj['difficult']))
      if ignore_difficult_instances and difficult:
        continue

      difficult_obj.append(int(difficult))

      xmin.append(float(obj['bndbox']['xmin']) / width)
      ymin.append(float(obj['bndbox']['ymin']) / height)
      xmax.append(float(obj['bndbox']['xmax']) / width)
      ymax.append(float(obj['bndbox']['ymax']) / height)
      classes_text.append(obj['name'].encode('utf8'))
      classes.append(label_map_dict[obj['name']])
      truncated.append(int(obj['truncated']))
      poses.append(obj['pose'].encode('utf8'))

      if ann_json_dict:
        abs_xmin = int(obj['bndbox']['xmin'])
        abs_ymin = int(obj['bndbox']['ymin'])
        abs_xmax = int(obj['bndbox']['xmax'])
        abs_ymax = int(obj['bndbox']['ymax'])
        abs_width = abs_xmax - abs_xmin
        abs_height = abs_ymax - abs_ymin
        ann = {
            'area': abs_width * abs_height,
            'iscrowd': 0,
            'image_id': image_id,
            'bbox': [abs_xmin, abs_ymin, abs_width, abs_height],
            'category_id': label_map_dict[obj['name']],
            'id': get_ann_id(),
            'ignore': 0,
            'segmentation': [],
        }
        ann_json_dict['annotations'].append(ann)

  example = tf.train.Example(features=tf.train.Features(feature={
      'image/height': tfrecord_util.int64_feature(height),
      'image/width': tfrecord_util.int64_feature(width),
      'image/filename': tfrecord_util.bytes_feature(
          data['filename'].encode('utf8')),
      'image/source_id': tfrecord_util.bytes_feature(
          str(image_id).encode('utf8')),
      'image/key/sha256': tfrecord_util.bytes_feature(key.encode('utf8')),
      'image/encoded': tfrecord_util.bytes_feature(encoded_jpg),
      'image/format': tfrecord_util.bytes_feature('jpeg'.encode('utf8')),
      'image/object/bbox/xmin': tfrecord_util.float_list_feature(xmin),
      'image/object/bbox/xmax': tfrecord_util.float_list_feature(xmax),
      'image/object/bbox/ymin': tfrecord_util.float_list_feature(ymin),
      'image/object/bbox/ymax': tfrecord_util.float_list_feature(ymax),
      'image/object/class/text': tfrecord_util.bytes_list_feature(classes_text),
      'image/object/class/label': tfrecord_util.int64_list_feature(classes),
      'image/object/difficult': tfrecord_util.int64_list_feature(difficult_obj),
      'image/object/truncated': tfrecord_util.int64_list_feature(truncated),
      'image/object/view': tfrecord_util.bytes_list_feature(poses),
  }))
  return example

def pascal_tfrecord(FLAGS_data_dir, FLAGS_set, FLAGS_annotations_dir, FLAGS_year, FLAGS_output_path,
                    FLAGS_label_map_json_path, FLAGS_ignore_difficult_instances = False, FLAGS_num_shards=1,
                    FLAGS_num_images = None):
  if FLAGS_set not in SETS:
    raise ValueError('set must be in : {}'.format(SETS))
  if FLAGS_year not in YEARS:
    raise ValueError('year must be in : {}'.format(YEARS))
  if not FLAGS_output_path:
    raise ValueError('output_path cannot be empty.')
  if not os.path.exists(FLAGS_output_path):
    os.makedirs(FLAGS_output_path)

  data_dir = FLAGS_data_dir
  years = ['VOC2007', 'VOC2012']
  if FLAGS_year != 'merged':
    years = [FLAGS_year]

  logging.info('writing to output path: %s', FLAGS_output_path)
  writers = [
      tf.python_io.TFRecordWriter(
          FLAGS_output_path + '-%05d-of-%05d.tfrecord' % (i, FLAGS_num_shards))
      for i in range(FLAGS_num_shards)
  ]

  if FLAGS_label_map_json_path:
    with tf.io.gfile.GFile(FLAGS_label_map_json_path, 'rb') as f:
      label_map_dict = json.load(f)
  else:
    label_map_dict = pascal_label_map_dict

  ann_json_dict = {
      'images': [],
      'type': 'instances',
      'annotations': [],
      'categories': []
  }
  for year in years:
    for class_name, class_id in label_map_dict.items():
      cls = {'supercategory': 'none', 'id': class_id, 'name': class_name}
      ann_json_dict['categories'].append(cls)

    logging.info('Reading from PASCAL %s dataset.', year)
    examples_path = os.path.join(data_dir, year, 'ImageSets', 'Main', FLAGS_set + '.txt')      #I CHANGED THIS
    annotations_dir = os.path.join(data_dir, year, FLAGS_annotations_dir)
    examples_list = tfrecord_util.read_examples_list(examples_path)
    for idx, example in enumerate(examples_list):
      if FLAGS_num_images and idx >= FLAGS_num_images:
        break
      if idx % 100 == 0:
        logging.info('On image %d of %d', idx, len(examples_list))
      path = os.path.join(annotations_dir, example + '.xml')
      with tf.gfile.GFile(path, 'r') as fid:
        xml_str = fid.read()
      xml = etree.fromstring(xml_str)
      data = tfrecord_util.recursive_parse_xml_to_dict(xml)['annotation']

      tf_example = dict_to_tf_example(data, FLAGS_data_dir, label_map_dict,
                                      FLAGS_ignore_difficult_instances,
                                      ann_json_dict=ann_json_dict)
      writers[idx % FLAGS_num_shards].write(tf_example.SerializeToString())

  for writer in writers:
    writer.close()

  json_file_path = os.path.join(
      os.path.dirname(FLAGS_output_path),
      'json_' + os.path.basename(FLAGS_output_path) + '.json')
  with tf.io.gfile.GFile(json_file_path, 'w') as f:
    json.dump(ann_json_dict, f)

SETS = ['train', 'val', 'trainval', 'test']
YEARS = ['VOC2007', 'VOC2012', 'merged']

pascal_label_map_dict = {
    'background': 0, 'gate': 1
}

Execution:

# create tfrecords (classes set up above in 'pascal_label_map_dict')

# trainval set
pascal_tfrecord(
    FLAGS_data_dir='/content/formatted_dset/data/VOCdevkit',
    FLAGS_set='trainval',
    FLAGS_annotations_dir='Annotations',
    FLAGS_year='VOC2007',
    FLAGS_output_path='tfrecord/train',
    FLAGS_label_map_json_path=None,
    FLAGS_num_shards=1,
)

# val set
pascal_tfrecord(
    FLAGS_data_dir='/content/formatted_dset/data/VOCdevkit',
    FLAGS_set='val',
    FLAGS_annotations_dir='Annotations',
    FLAGS_year='VOC2007',
    FLAGS_output_path='tfrecord/val',
    FLAGS_label_map_json_path=None,
    FLAGS_num_shards=1,
)
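
Both calls above pass FLAGS_label_map_json_path=None and fall back to the hard-coded pascal_label_map_dict. If you prefer to keep the class mapping in a file, the JSON that FLAGS_label_map_json_path expects is just the same name-to-id dictionary; a minimal sketch (the file name gate_label_map.json is only an example):

# Write the label map to a JSON file so it can be passed via FLAGS_label_map_json_path.
import json

label_map = {'background': 0, 'gate': 1}
with open('gate_label_map.json', 'w') as f:
    json.dump(label_map, f)

# then: pascal_tfrecord(..., FLAGS_label_map_json_path='gate_label_map.json', ...)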

Then, I downloaded the COCO pretrained checkpoint to fine-tune on my custom dataset.

# get pretrained weights
MODEL = 'efficientdet-d0'  #@param
# Download checkpoint.
!wget https://storage.googleapis.com/cloud-tpu-checkpoints/efficientdet/coco/{MODEL}.tar.gz
!tar zxf {MODEL}.tar.gz
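
The tarball should extract to a local efficientdet-d0/ directory holding the checkpoint files; that directory name is what --ckpt points to in the training command below. An optional quick check:

!ls efficientdet-d0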

Finally, I trained using the following command:

# train using efficientdet pretrained weights
!python main.py \
  --mode=train_and_eval \
  --training_file_pattern=tfrecord/train-* \
  --validation_file_pattern=tfrecord/val-* \
  --val_json_file=tfrecord/json_val.json \
  --model_name=efficientdet-d0 \
  --model_dir=models \
  --ckpt=efficientdet-d0 \
  --train_batch_size=8 \
  --eval_batch_size=8 --eval_samples=1024 \
  --num_examples_per_epoch=5717 --num_epochs=3 \
  --hparams="num_classes=1,moving_average_decay=0" \
  --use_tpu=False
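
Note this is a short fine-tuning run: assuming main.py derives the total number of training steps as num_epochs * num_examples_per_epoch / train_batch_size, these flags give roughly 3 * 5717 / 8 ≈ 2,144 steps.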

Eval mAP:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.743
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.958
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.932
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.673
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.775
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.776
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.794
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.796
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.750
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.819

Thanks again to everyone in this issue, and I hope that this will help future users with the same issue.

HyunjiEllenPak commented 4 years ago

@CraigWang1 I am struggling with poor training results when fine-tuning the COCO pretrained model on my custom dataset.

I wonder if you changed any hyperparameter settings, such as learning_rate, train_scale_min, or train_scale_max, to get your good results.

Also, did you only train on your custom dataset for a few epochs (3 epochs)? Is that enough?

CraigWang1 commented 4 years ago

@HyunjiEllenPak

No, I did not change any hyperparameters.

I later trained on my custom dataset for 7 epochs, but found that the mAP had not increased beyond the 3rd epoch.

What is your mAP? Have you tried finetuning from pretrained weights?

Also, I found that xuannianz's efficientdet (a third-party implementation) works well, though you may have to use an earlier version of their repo (from around March 2020; I have a fork if you need it).

HyunjiEllenPak commented 4 years ago

@CraigWang1

Yes, I tried fine-tuning from the pretrained weights, but I failed to get good results.

In my case:

After training for 6 epochs from the COCO pretrained weights, the validation AP was 0.01. The parameters I used were learning_rate=0.001 and lr_warmup_init=0.0001, with default values for everything else. The reason I used a small learning rate is that when I tried larger values like 0.04 and 0.08, the validation AP didn't increase even though the training loss decreased.

CraigWang1 commented 4 years ago

How many images are you training on?

How are you generating tf records?

Can you post your train command, with your tensorboard graphs?

kunaljain0 commented 3 years ago

Hi CraigWang1, I had my data in VOC2012 format and converted the CSV to TFRecords using the https://github.com/datitran/raccoon_dataset/blob/master/generate_tfrecord.py script. What does json_val.json look like? Is it necessary to add val_json_file if the COCO checkpoints are used? I don't have it for now.

kartik4949 commented 3 years ago

@kunaljain0 @CraigWang1 Visualize your tfrecords together with your config using https://github.com/google/automl/blob/master/efficientdet/dataset/inspect_tfrecords.py with the following command:

python dataset/inspect_tfrecords.py --file_pattern dataset/sample.record \
  --model_name "efficientdet-d0" --samples 10 \
  --save_samples_dir train_samples/ --hparams="label_map={1:'label1'}, autoaugmentation_policy=v3"
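
If that script is missing or incompatible with your checkout, a quick alternative sanity check is to parse a few records directly and confirm that the class labels and normalized box coordinates match what the converter wrote; a minimal sketch assuming TF2 eager mode (the shard name below is a placeholder for your own file):

import tensorflow as tf

# Print labels and normalized boxes from the first few records of a shard.
for raw in tf.data.TFRecordDataset('tfrecord/train-00000-of-00001.tfrecord').take(3):
    example = tf.train.Example()
    example.ParseFromString(raw.numpy())
    feat = example.features.feature
    print('labels:', list(feat['image/object/class/label'].int64_list.value))
    print('xmin:  ', list(feat['image/object/bbox/xmin'].float_list.value))
    print('ymin:  ', list(feat['image/object/bbox/ymin'].float_list.value))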