google / automl

Google Brain AutoML
Apache License 2.0

Per-class AP for specific classes and new datasets #837

Open JaydenKing32 opened 3 years ago

JaydenKing32 commented 3 years ago

Hello, I've been trying to get per-class AP results for evaluation, but I can't figure out how to make it work when evaluating a new dataset or setting specific classes with the COCO 2017 dataset.

I found out via e03aaee that I could change h.label_map = None to h.label_map = 'coco' in hparams_config.py in order to produce per-class AP results for all 90 COCO classes when evaluating the COCO 2017 dataset. However, when I tried to specify classes using this method, I ended up with an "Incompatible shapes" error (the full log is included below). This attempt was made by setting h.num_classes = 3 and h.label_map = {1: 'person', 2: 'dog'} with the COCO 2017 dataset, but I would like to be able to specify classes for other datasets as well.
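To make the three kinds of label_map value mentioned above concrete (None, the string 'coco', and an explicit dict), here is a minimal sketch. The helper name resolve_label_map and the COCO_EXCERPT table are hypothetical and are not taken from the repository; the sketch only illustrates the idea of normalising the setting into an id-to-name dict. One detail that may matter: in the COCO 2017 annotation file, id 2 is "bicycle" and "dog" is id 18, so a map like {1: 'person', 2: 'dog'} attaches the name "dog" to id 2. Whether the evaluator keys on ids or on positions has not been checked here.

```python
# Excerpt only; these three ids match the COCO 2017 annotation file.
COCO_EXCERPT = {1: "person", 2: "bicycle", 18: "dog"}


def resolve_label_map(label_map):
    """Hypothetical helper: normalise a label_map setting to {id: name}."""
    if label_map is None:
        # No label map: no per-class names are available.
        return {}
    if label_map == "coco":
        # Named preset (only an excerpt of the full table is included here).
        return dict(COCO_EXCERPT)
    if isinstance(label_map, dict):
        # Explicit mapping supplied by the user.
        return {int(k): str(v) for k, v in label_map.items()}
    raise ValueError(f"unsupported label_map: {label_map!r}")


print(resolve_label_map(None))
print(resolve_label_map("coco"))
print(resolve_label_map({1: "person", 2: "dog"}))
```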

Is there some other method of specifying classes that I am not aware of, or is this currently not possible? I've also seen label_map being set via the hparams option, but only for training. I've attempted to use hparams when evaluating with a pre-trained model, but this also resulted in an error.

Additionally, I believe this may be related to f701a0b and #762 which also seem to involve label_map.

Log ``` > python main.py --mode=eval --model_name=efficientdet-d0 --model_dir=models/efficientdet-d0/ --validation_file_pattern=tfrecord/val* --val_json_file=data/annotations/instances_val2017.json 2020-10-27 04:14:09.954175: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found 2020-10-27 04:14:09.954425: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine. I1027 04:14:15.212893 297028 main.py:228] {'name': 'efficientdet-d0', 'act_type': 'swish', 'image_size': (512, 512), 'target_size': None, 'input_rand_hflip': True, 'jitter_min': 0.1, 'jitter_max': 2.0, 'autoaugment_policy': None, 'use_augmix': False, 'grid_mask': False, 'augmix_params': [3, -1, 1], 'sample_image': None, 'num_classes': 3, 'seg_num_classes': 3, 'heads': ['object_detection'], 'skip_crowd_during_training': True, 'label_map': {1: 'person', 2: 'dog'}, 'max_instances_per_image': 100, 'regenerate_source_id': False, 'min_level': 3, 'max_level': 7, 'num_scales': 3, 'aspect_ratios': [1.0, 2.0, 0.5], 'anchor_scale': 4.0, 'is_training_bn': True, 'momentum': 0.9, 'optimizer': 'sgd', 'learning_rate': 0.08, 'lr_warmup_init': 0.008, 'lr_warmup_epoch': 1.0, 'first_lr_drop_epoch': 200.0, 'second_lr_drop_epoch': 250.0, 'poly_lr_power': 0.9, 'clip_gradients_norm': 10.0, 'num_epochs': 300, 'data_format': 'channels_last', 'label_smoothing': 0.0, 'alpha': 0.25, 'gamma': 1.5, 'delta': 0.1, 'box_loss_weight': 50.0, 'iou_loss_type': None, 'iou_loss_weight': 1.0, 'weight_decay': 4e-05, 'strategy': None, 'mixed_precision': False, 'box_class_repeats': 3, 'fpn_cell_repeats': 3, 'fpn_num_filters': 64, 'separable_conv': True, 'apply_bn_for_resampling': True, 'conv_after_downsample': False, 'conv_bn_act_pattern': False, 'drop_remainder': True, 'nms_configs': {'method': 'gaussian', 'iou_thresh': None, 'score_thresh': None, 'sigma': None, 'max_nms_inputs': 0, 
'max_output_size': 100}, 'fpn_name': None, 'fpn_weight_method': None, 'fpn_config': None, 'survival_prob': None, 'img_summary_steps': None, 'lr_decay_method': 'cosine', 'moving_average_decay': 0.9998, 'ckpt_var_scope': None, 'skip_mismatch': True, 'backbone_name': 'efficientnet-b0', 'backbone_config': None, 'var_freeze_expr': None, 'use_keras_model': True, 'dataset_type': None, 'positives_momentum': None, 'device': {'grad_ckpting': False, 'grad_ckpting_list': ['Add_', 'AddN'], 'nvgpu_logging': False}, 'model_name': 'efficientdet-d0', 'iterations_per_loop': 100, 'model_dir': 'models/efficientdet-d0/', 'num_shards': 8, 'num_examples_per_epoch': 120000, 'backbone_ckpt': '', 'ckpt': None, 'val_json_file': 'data/annotations/instances_val2017.json', 'testdev_dir': None, 'profile': False, 'mode': 'eval'} INFO:tensorflow:Using config: {'_model_dir': 'models/efficientdet-d0/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 100, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1} INFO:tensorflow:Waiting for new checkpoint at models/efficientdet-d0/ I1027 04:14:15.285697 297028 checkpoint_utils.py:125] Waiting for new checkpoint at models/efficientdet-d0/ INFO:tensorflow:Found new checkpoint at models/efficientdet-d0/model I1027 04:14:15.291682 297028 checkpoint_utils.py:134] Found new checkpoint at models/efficientdet-d0/model I1027 04:14:15.293684 297028 main.py:308] Starting to 
evaluate. 2020-10-27 04:14:15.446721: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll 2020-10-27 04:14:15.477384: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: pciBusID: 0000:01:00.0 name: GeForce GTX 970 computeCapability: 5.2 coreClock: 1.342GHz coreCount: 13 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 211.48GiB/s 2020-10-27 04:14:15.480269: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found 2020-10-27 04:14:15.483283: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cublas64_10.dll'; dlerror: cublas64_10.dll not found 2020-10-27 04:14:15.491425: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found 2020-10-27 04:14:15.494261: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found 2020-10-27 04:14:15.496957: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found 2020-10-27 04:14:15.499897: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cusparse64_10.dll'; dlerror: cusparse64_10.dll not found 2020-10-27 04:14:15.507223: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudnn64_7.dll'; dlerror: cudnn64_7.dll not found 2020-10-27 04:14:15.507373: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. 
Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform. Skipping registering GPU devices... INFO:tensorflow:Calling model_fn. I1027 04:14:15.834230 297028 estimator.py:1162] Calling model_fn. I1027 04:14:15.843206 297028 efficientnet_builder.py:215] global_params= GlobalParams(batch_norm_momentum=0.99, batch_norm_epsilon=0.001, dropout_rate=0.2, data_format='channels_last', num_classes=1000, width_coefficient=1.0, depth_coefficient=1.0, depth_divisor=8, min_depth=None, survival_prob=0.0, relu_fn=functools.partial(, act_type='swish'), batch_norm=, use_se=True, local_pooling=None, condconv_num_experts=None, clip_projection_output=False, blocks_args=['r1_k3_s11_e1_i32_o16_se0.25', 'r2_k3_s22_e6_i16_o24_se0.25', 'r2_k5_s22_e6_i24_o40_se0.25', 'r3_k3_s22_e6_i40_o80_se0.25', 'r3_k5_s11_e6_i80_o112_se0.25', 'r4_k5_s22_e6_i112_o192_se0.25', 'r1_k3_s11_e6_i192_o320_se0.25'], fix_head_stem=None) I1027 04:14:16.045665 297028 efficientdet_keras.py:682] fnode 0 : {'feat_level': 6, 'inputs_offsets': [3, 4]} I1027 04:14:16.046664 297028 efficientdet_keras.py:682] fnode 1 : {'feat_level': 5, 'inputs_offsets': [2, 5]} I1027 04:14:16.051657 297028 efficientdet_keras.py:682] fnode 2 : {'feat_level': 4, 'inputs_offsets': [1, 6]} I1027 04:14:16.052662 297028 efficientdet_keras.py:682] fnode 3 : {'feat_level': 3, 'inputs_offsets': [0, 7]} I1027 04:14:16.053644 297028 efficientdet_keras.py:682] fnode 4 : {'feat_level': 4, 'inputs_offsets': [1, 7, 8]} I1027 04:14:16.054655 297028 efficientdet_keras.py:682] fnode 5 : {'feat_level': 5, 'inputs_offsets': [2, 6, 9]} I1027 04:14:16.055639 297028 efficientdet_keras.py:682] fnode 6 : {'feat_level': 6, 'inputs_offsets': [3, 5, 10]} I1027 04:14:16.056636 297028 efficientdet_keras.py:682] fnode 7 : {'feat_level': 7, 'inputs_offsets': [4, 11]} I1027 04:14:16.057633 297028 efficientdet_keras.py:682] fnode 0 : {'feat_level': 6, 'inputs_offsets': [3, 4]} I1027 04:14:16.058638 
297028 efficientdet_keras.py:682] fnode 1 : {'feat_level': 5, 'inputs_offsets': [2, 5]} I1027 04:14:16.058638 297028 efficientdet_keras.py:682] fnode 2 : {'feat_level': 4, 'inputs_offsets': [1, 6]} I1027 04:14:16.059628 297028 efficientdet_keras.py:682] fnode 3 : {'feat_level': 3, 'inputs_offsets': [0, 7]} I1027 04:14:16.060626 297028 efficientdet_keras.py:682] fnode 4 : {'feat_level': 4, 'inputs_offsets': [1, 7, 8]} I1027 04:14:16.069602 297028 efficientdet_keras.py:682] fnode 5 : {'feat_level': 5, 'inputs_offsets': [2, 6, 9]} I1027 04:14:16.070600 297028 efficientdet_keras.py:682] fnode 6 : {'feat_level': 6, 'inputs_offsets': [3, 5, 10]} I1027 04:14:16.071597 297028 efficientdet_keras.py:682] fnode 7 : {'feat_level': 7, 'inputs_offsets': [4, 11]} I1027 04:14:16.072605 297028 efficientdet_keras.py:682] fnode 0 : {'feat_level': 6, 'inputs_offsets': [3, 4]} I1027 04:14:16.073591 297028 efficientdet_keras.py:682] fnode 1 : {'feat_level': 5, 'inputs_offsets': [2, 5]} I1027 04:14:16.074589 297028 efficientdet_keras.py:682] fnode 2 : {'feat_level': 4, 'inputs_offsets': [1, 6]} I1027 04:14:16.075586 297028 efficientdet_keras.py:682] fnode 3 : {'feat_level': 3, 'inputs_offsets': [0, 7]} I1027 04:14:16.076583 297028 efficientdet_keras.py:682] fnode 4 : {'feat_level': 4, 'inputs_offsets': [1, 7, 8]} I1027 04:14:16.077580 297028 efficientdet_keras.py:682] fnode 5 : {'feat_level': 5, 'inputs_offsets': [2, 6, 9]} I1027 04:14:16.078589 297028 efficientdet_keras.py:682] fnode 6 : {'feat_level': 6, 'inputs_offsets': [3, 5, 10]} I1027 04:14:16.079575 297028 efficientdet_keras.py:682] fnode 7 : {'feat_level': 7, 'inputs_offsets': [4, 11]} I1027 04:14:16.146396 297028 efficientnet_model.py:717] Built stem stem : (1, 256, 256, 32) I1027 04:14:16.146396 297028 efficientnet_model.py:372] Block blocks_0 input shape: (1, 256, 256, 32) I1027 04:14:16.167340 297028 efficientnet_model.py:391] DWConv shape: (1, 256, 256, 32) I1027 04:14:16.185292 297028 efficientnet_model.py:195] Built SE se 
: (1, 1, 1, 32) I1027 04:14:16.202273 297028 efficientnet_model.py:412] Project shape: (1, 256, 256, 16) I1027 04:14:16.203259 297028 efficientnet_model.py:372] Block blocks_1 input shape: (1, 256, 256, 16) I1027 04:14:16.220200 297028 efficientnet_model.py:388] Expand shape: (1, 256, 256, 96) I1027 04:14:16.239148 297028 efficientnet_model.py:391] DWConv shape: (1, 128, 128, 96) I1027 04:14:16.258098 297028 efficientnet_model.py:195] Built SE se : (1, 1, 1, 96) I1027 04:14:16.275053 297028 efficientnet_model.py:412] Project shape: (1, 128, 128, 24) I1027 04:14:16.276050 297028 efficientnet_model.py:372] Block blocks_2 input shape: (1, 128, 128, 24) I1027 04:14:16.294999 297028 efficientnet_model.py:388] Expand shape: (1, 128, 128, 144) I1027 04:14:16.312950 297028 efficientnet_model.py:391] DWConv shape: (1, 128, 128, 144) I1027 04:14:16.331900 297028 efficientnet_model.py:195] Built SE se : (1, 1, 1, 144) I1027 04:14:16.350851 297028 efficientnet_model.py:412] Project shape: (1, 128, 128, 24) I1027 04:14:16.351847 297028 efficientnet_model.py:372] Block blocks_3 input shape: (1, 128, 128, 24) I1027 04:14:16.369799 297028 efficientnet_model.py:388] Expand shape: (1, 128, 128, 144) I1027 04:14:16.388749 297028 efficientnet_model.py:391] DWConv shape: (1, 64, 64, 144) I1027 04:14:16.408722 297028 efficientnet_model.py:195] Built SE se : (1, 1, 1, 144) I1027 04:14:16.460557 297028 efficientnet_model.py:412] Project shape: (1, 64, 64, 40) I1027 04:14:16.461556 297028 efficientnet_model.py:372] Block blocks_4 input shape: (1, 64, 64, 40) I1027 04:14:16.482501 297028 efficientnet_model.py:388] Expand shape: (1, 64, 64, 240) I1027 04:14:16.526380 297028 efficientnet_model.py:391] DWConv shape: (1, 64, 64, 240) I1027 04:14:16.550316 297028 efficientnet_model.py:195] Built SE se : (1, 1, 1, 240) I1027 04:14:16.567303 297028 efficientnet_model.py:412] Project shape: (1, 64, 64, 40) I1027 04:14:16.567303 297028 efficientnet_model.py:372] Block blocks_5 input shape: (1, 64, 
64, 40) I1027 04:14:16.585255 297028 efficientnet_model.py:388] Expand shape: (1, 64, 64, 240) I1027 04:14:16.604172 297028 efficientnet_model.py:391] DWConv shape: (1, 32, 32, 240) I1027 04:14:16.622126 297028 efficientnet_model.py:195] Built SE se : (1, 1, 1, 240) I1027 04:14:16.640078 297028 efficientnet_model.py:412] Project shape: (1, 32, 32, 80) I1027 04:14:16.640078 297028 efficientnet_model.py:372] Block blocks_6 input shape: (1, 32, 32, 80) I1027 04:14:16.659026 297028 efficientnet_model.py:388] Expand shape: (1, 32, 32, 480) I1027 04:14:16.678999 297028 efficientnet_model.py:391] DWConv shape: (1, 32, 32, 480) I1027 04:14:16.697923 297028 efficientnet_model.py:195] Built SE se : (1, 1, 1, 480) I1027 04:14:16.714877 297028 efficientnet_model.py:412] Project shape: (1, 32, 32, 80) I1027 04:14:16.715881 297028 efficientnet_model.py:372] Block blocks_7 input shape: (1, 32, 32, 80) I1027 04:14:16.734824 297028 efficientnet_model.py:388] Expand shape: (1, 32, 32, 480) I1027 04:14:16.753796 297028 efficientnet_model.py:391] DWConv shape: (1, 32, 32, 480) I1027 04:14:16.771724 297028 efficientnet_model.py:195] Built SE se : (1, 1, 1, 480) I1027 04:14:16.789704 297028 efficientnet_model.py:412] Project shape: (1, 32, 32, 80) I1027 04:14:16.790675 297028 efficientnet_model.py:372] Block blocks_8 input shape: (1, 32, 32, 80) I1027 04:14:16.808653 297028 efficientnet_model.py:388] Expand shape: (1, 32, 32, 480) I1027 04:14:16.827607 297028 efficientnet_model.py:391] DWConv shape: (1, 32, 32, 480) I1027 04:14:16.848520 297028 efficientnet_model.py:195] Built SE se : (1, 1, 1, 480) I1027 04:14:16.865475 297028 efficientnet_model.py:412] Project shape: (1, 32, 32, 112) I1027 04:14:16.866473 297028 efficientnet_model.py:372] Block blocks_9 input shape: (1, 32, 32, 112) I1027 04:14:16.884424 297028 efficientnet_model.py:388] Expand shape: (1, 32, 32, 672) I1027 04:14:16.903407 297028 efficientnet_model.py:391] DWConv shape: (1, 32, 32, 672) I1027 04:14:16.921351 297028 
efficientnet_model.py:195] Built SE se : (1, 1, 1, 672) I1027 04:14:16.938279 297028 efficientnet_model.py:412] Project shape: (1, 32, 32, 112) I1027 04:14:16.939277 297028 efficientnet_model.py:372] Block blocks_10 input shape: (1, 32, 32, 112) I1027 04:14:16.959223 297028 efficientnet_model.py:388] Expand shape: (1, 32, 32, 672) I1027 04:14:16.981192 297028 efficientnet_model.py:391] DWConv shape: (1, 32, 32, 672) I1027 04:14:17.000142 297028 efficientnet_model.py:195] Built SE se : (1, 1, 1, 672) I1027 04:14:17.019090 297028 efficientnet_model.py:412] Project shape: (1, 32, 32, 112) I1027 04:14:17.020061 297028 efficientnet_model.py:372] Block blocks_11 input shape: (1, 32, 32, 112) I1027 04:14:17.039030 297028 efficientnet_model.py:388] Expand shape: (1, 32, 32, 672) I1027 04:14:17.057961 297028 efficientnet_model.py:391] DWConv shape: (1, 16, 16, 672) I1027 04:14:17.075913 297028 efficientnet_model.py:195] Built SE se : (1, 1, 1, 672) I1027 04:14:17.092867 297028 efficientnet_model.py:412] Project shape: (1, 16, 16, 192) I1027 04:14:17.093865 297028 efficientnet_model.py:372] Block blocks_12 input shape: (1, 16, 16, 192) I1027 04:14:17.117801 297028 efficientnet_model.py:388] Expand shape: (1, 16, 16, 1152) I1027 04:14:17.138745 297028 efficientnet_model.py:391] DWConv shape: (1, 16, 16, 1152) I1027 04:14:17.159688 297028 efficientnet_model.py:195] Built SE se : (1, 1, 1, 1152) I1027 04:14:17.176643 297028 efficientnet_model.py:412] Project shape: (1, 16, 16, 192) I1027 04:14:17.177640 297028 efficientnet_model.py:372] Block blocks_13 input shape: (1, 16, 16, 192) I1027 04:14:17.199582 297028 efficientnet_model.py:388] Expand shape: (1, 16, 16, 1152) I1027 04:14:17.221523 297028 efficientnet_model.py:391] DWConv shape: (1, 16, 16, 1152) I1027 04:14:17.240498 297028 efficientnet_model.py:195] Built SE se : (1, 1, 1, 1152) I1027 04:14:17.258444 297028 efficientnet_model.py:412] Project shape: (1, 16, 16, 192) I1027 04:14:17.259428 297028 
efficientnet_model.py:372] Block blocks_14 input shape: (1, 16, 16, 192) I1027 04:14:17.280385 297028 efficientnet_model.py:388] Expand shape: (1, 16, 16, 1152) I1027 04:14:17.302334 297028 efficientnet_model.py:391] DWConv shape: (1, 16, 16, 1152) I1027 04:14:17.327241 297028 efficientnet_model.py:195] Built SE se : (1, 1, 1, 1152) I1027 04:14:17.344195 297028 efficientnet_model.py:412] Project shape: (1, 16, 16, 192) I1027 04:14:17.345194 297028 efficientnet_model.py:372] Block blocks_15 input shape: (1, 16, 16, 192) I1027 04:14:17.367160 297028 efficientnet_model.py:388] Expand shape: (1, 16, 16, 1152) I1027 04:14:17.389108 297028 efficientnet_model.py:391] DWConv shape: (1, 16, 16, 1152) I1027 04:14:17.409054 297028 efficientnet_model.py:195] Built SE se : (1, 1, 1, 1152) I1027 04:14:17.426975 297028 efficientnet_model.py:412] Project shape: (1, 16, 16, 320) I1027 04:14:19.243146 297028 det_model_fn.py:76] LR schedule method: cosine I1027 04:14:19.479487 297028 postprocess.py:85] use max_nms_inputs for pre-nms topk. I1027 04:14:19.560272 297028 det_model_fn.py:520] Eval val with groudtruths data/annotations/instances_val2017.json. I1027 04:14:19.592214 297028 det_model_fn.py:597] Load EMA vars with ema_decay=0.999800 INFO:tensorflow:Done calling model_fn. I1027 04:14:20.001094 297028 estimator.py:1164] Done calling model_fn. INFO:tensorflow:Starting evaluation at 2020-10-27T04:14:20Z I1027 04:14:20.022037 297028 evaluation.py:255] Starting evaluation at 2020-10-27T04:14:20Z INFO:tensorflow:Graph was finalized. I1027 04:14:20.349184 297028 monitored_session.py:246] Graph was finalized. 2020-10-27 04:14:20.354594: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags. 
2020-10-27 04:14:20.363610: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1cb45701160 initialized for platform Host (this does not guarantee that XLA will be used). Devices: 2020-10-27 04:14:20.363752: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version 2020-10-27 04:14:20.364676: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: pciBusID: 0000:01:00.0 name: GeForce GTX 970 computeCapability: 5.2 coreClock: 1.342GHz coreCount: 13 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 211.48GiB/s 2020-10-27 04:14:20.369303: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found 2020-10-27 04:14:20.372036: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cublas64_10.dll'; dlerror: cublas64_10.dll not found 2020-10-27 04:14:20.374767: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found 2020-10-27 04:14:20.377572: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found 2020-10-27 04:14:20.387685: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found 2020-10-27 04:14:20.390543: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cusparse64_10.dll'; dlerror: cusparse64_10.dll not found 2020-10-27 04:14:20.393392: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudnn64_7.dll'; dlerror: cudnn64_7.dll not found 2020-10-27 04:14:20.393568: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. 
Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform. Skipping registering GPU devices... 2020-10-27 04:14:20.450548: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix: 2020-10-27 04:14:20.450736: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0 2020-10-27 04:14:20.456690: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N 2020-10-27 04:14:20.460772: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1cb476cfa80 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices: 2020-10-27 04:14:20.460954: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce GTX 970, Compute Capability 5.2 INFO:tensorflow:Restoring parameters from models/efficientdet-d0/model I1027 04:14:20.465852 297028 saver.py:1293] Restoring parameters from models/efficientdet-d0/model INFO:tensorflow:Running local_init_op. I1027 04:14:21.500088 297028 session_manager.py:505] Running local_init_op. INFO:tensorflow:Done running local_init_op. I1027 04:14:21.577889 297028 session_manager.py:508] Done running local_init_op. 
2020-10-27 04:14:23.831451: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at gather_nd_op.cc:47 : Invalid argument: indices[0,4999] = [0, 467040] does not index into param shape [1,49104,4] Traceback (most recent call last): File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\client\session.py", line 1365, in _do_call return fn(*args) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _run_fn target_list, run_metadata) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\client\session.py", line 1443, in _call_tf_sessionrun run_metadata) tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [1,8,8,27] vs. [1,8,8,810] [[{{node focal_loss_3/mul}}]] During handling of the above exception, another exception occurred: Traceback (most recent call last): File "main.py", line 365, in app.run(main) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\absl\app.py", line 300, in run _run_main(main, args) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\absl\app.py", line 251, in _run_main sys.exit(main(argv)) File "main.py", line 310, in main eval_results = eval_est.evaluate(eval_input_fn, steps=eval_steps) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 467, in evaluate name=name) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 510, in _actual_eval return _evaluate() File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 499, in _evaluate output_dir=self.eval_dir(name)) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1644, in _evaluate_run config=self._session_config) 
File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\training\evaluation.py", line 272, in _evaluate_once session.run(eval_ops, feed_dict) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\training\monitored_session.py", line 778, in run run_metadata=run_metadata) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1283, in run run_metadata=run_metadata) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1384, in run raise six.reraise(*original_exc_info) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\six.py", line 703, in reraise raise value File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1369, in run return self._sess.run(*args, **kwargs) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1442, in run run_metadata=run_metadata) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1200, in run return self._sess.run(*args, **kwargs) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\client\session.py", line 958, in run run_metadata_ptr) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\client\session.py", line 1181, in _run feed_dict_tensor, options, run_metadata) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\client\session.py", line 1359, in _do_run run_metadata) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\client\session.py", line 1384, in _do_call raise type(e)(node_def, op, message) tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [1,8,8,27] vs. 
[1,8,8,810] [[node focal_loss_3/mul (defined at E:\User\Documents\automl\efficientdet\det_model_fn.py:152) ]] Errors may have originated from an input operation. Input Source operations connected to node focal_loss_3/mul: Reshape_6 (defined at E:\User\Documents\automl\efficientdet\det_model_fn.py:249) focal_loss_3/Sigmoid (defined at E:\User\Documents\automl\efficientdet\det_model_fn.py:151) Original stack trace for 'focal_loss_3/mul': File "main.py", line 365, in app.run(main) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\absl\app.py", line 300, in run _run_main(main, args) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\absl\app.py", line 251, in _run_main sys.exit(main(argv)) File "main.py", line 310, in main eval_results = eval_est.evaluate(eval_input_fn, steps=eval_steps) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 467, in evaluate name=name) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 510, in _actual_eval return _evaluate() File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 492, in _evaluate self._evaluate_build_graph(input_fn, hooks, checkpoint_path)) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1528, in _evaluate_build_graph self._call_model_fn_eval(input_fn, self.config)) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1564, in _call_model_fn_eval config) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1163, in _call_model_fn model_fn_results = self._model_fn(features=features, **kwargs) File "E:\User\Documents\automl\efficientdet\det_model_fn.py", line 
676, in efficientdet_model_fn variable_filter_fn=variable_filter_fn) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\autograph\impl\api.py", line 302, in wrapper return func(*args, **kwargs) File "E:\User\Documents\automl\efficientdet\det_model_fn.py", line 383, in _model_fn cls_outputs, box_outputs, labels, params) File "E:\User\Documents\automl\efficientdet\det_model_fn.py", line 258, in detection_loss label_smoothing=params['label_smoothing']) File "E:\User\Documents\automl\efficientdet\det_model_fn.py", line 152, in focal_loss p_t = (y_true * pred_prob) + ((1 - y_true) * (1 - pred_prob)) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1124, in binary_op_wrapper return func(x, y, name=name) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1456, in _mul_dispatch return multiply(x, y, name=name) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\util\dispatch.py", line 201, in wrapper return target(*args, **kwargs) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\ops\math_ops.py", line 508, in multiply return gen_math_ops.mul(x, y, name) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 6176, in mul "Mul", x=x, y=y, name=name) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 744, in _apply_op_helper attrs=attr_protos, op_def=op_def) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\framework\ops.py", line 3485, in _create_op_internal op_def=op_def) File "E:\User\Documents\automl\efficientdet\venv\lib\site-packages\tensorflow\python\framework\ops.py", line 1949, in __init__ self._traceback = tf_stack.extract_stack() ```
JaydenKing32 commented 3 years ago

Ah, it seems that changing h.num_classes was the problem. If I set h.label_map = {1: 'person', 2: 'dog'} and leave h.num_classes alone, it works fine.
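The two shapes in the error, [1,8,8,27] and [1,8,8,810], are consistent with this explanation. The logged config has num_scales = 3 and three aspect_ratios, so there are 9 anchors per location. Assuming the class tensor carries one channel per anchor per class (an assumption about the model layout, not something confirmed in this thread), 3 classes give 27 channels and 90 classes give 810. A small sketch of that arithmetic, with a hypothetical helper name:

```python
def class_channels(num_scales, aspect_ratios, num_classes):
    """Channels in a class tensor, assuming one channel per anchor per class."""
    anchors_per_location = num_scales * len(aspect_ratios)
    return anchors_per_location * num_classes


# Values taken from the config printed in the log.
aspect_ratios = [1.0, 2.0, 0.5]

print(class_channels(3, aspect_ratios, 3))   # num_classes = 3
print(class_channels(3, aspect_ratios, 90))  # num_classes = 90
```

Under that assumption, one side of the multiplication was built for 3 classes and the other for 90, which is why leaving h.num_classes at its default avoids the mismatch.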

I suppose I fixed my issue, though it doesn't feel like my method is the intended solution.

matankley commented 3 years ago

@JaydenKing32 Can you please describe the method that worked for you? Did you keep h.num_classes at the COCO default and only change h.label_map to match your custom dataset? Thanks

JaydenKing32 commented 3 years ago

@matankley Yes, I didn't change the number of classes; it was left as h.num_classes = 90.

I only changed the label map to h.label_map = {1: 'person', 2: 'dog'}, and the per-class AP scores for "person" and "dog" were produced alongside the other results.
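For readers wondering how per-class scores can come out only for the names in the label map, here is a minimal sketch of the idea. It is not the repository's implementation: the function name per_class_ap and the input layout (a dict of precision values per class id) are hypothetical. It follows the pycocotools convention of marking missing precision entries with -1 and averages the remaining values for each class in the label map.

```python
def per_class_ap(precision_by_class, label_map):
    """Average the valid precision values for each class named in label_map.

    precision_by_class: {class_id: [precision, ...]}, where -1 means "no data".
    label_map: {class_id: class_name}.
    Returns {class_name: AP}, with -1.0 for classes that have no valid values.
    """
    results = {}
    for class_id, name in label_map.items():
        values = [p for p in precision_by_class.get(class_id, []) if p > -1]
        results[name] = sum(values) / len(values) if values else -1.0
    return results


# Made-up numbers purely for illustration; class 3 is not in the label map,
# so it does not appear in the output.
precision = {1: [0.5, 1.0, -1], 2: [-1, -1], 3: [0.9]}
print(per_class_ap(precision, {1: "person", 2: "dog"}))
```

Classes outside the label map are simply not reported, which matches the behaviour described above, where only "person" and "dog" appear alongside the overall results.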

alexlorenzo commented 3 years ago

Hello @JaydenKing32, did you find a cleaner solution for h.num_classes = 90? We have the same issue.

Thanks