tensorflow / models

Models and examples built with TensorFlow

TFODAPI error on training #7021

Closed dloperab closed 5 years ago

dloperab commented 5 years ago

System information

Describe the problem

I want to train a "faster_rcnn_inception_v2" model on my custom dataset. When I run the training command, I get the following messages:

WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see:

WARNING:tensorflow:Forced number of epochs for all eval validations to be 1.
INFO:tensorflow:Maybe overwriting train_steps: 10000
INFO:tensorflow:Maybe overwriting sample_1_of_n_eval_examples: 1
INFO:tensorflow:Maybe overwriting use_bfloat16: False
INFO:tensorflow:Maybe overwriting eval_num_epochs: 1
INFO:tensorflow:Maybe overwriting load_pretrained: True
INFO:tensorflow:Ignoring config override key: load_pretrained
WARNING:tensorflow:Expected number of evaluation epochs is 1, but instead encountered eval_on_train_input_config.num_epochs = 0. Overwriting num_epochs to 1.
INFO:tensorflow:create_estimator_and_inputs: use_tpu False, export_to_tpu False
INFO:tensorflow:Using config: {'_model_dir': 'training/output/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': gpu_options { allow_growth: true } , '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x00000266AC90B0B8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
WARNING:tensorflow:Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x00000266AC9301E0>) includes params argument, but params are not passed to Estimator.
INFO:tensorflow:Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
WARNING:tensorflow:From C:\Users\dloperab\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
Traceback (most recent call last):
  File "model_main.py", line 117, in <module>
    tf.app.run()
  File "C:\Users\dloperab\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
    _sys.exit(main(argv))
  File "model_main.py", line 113, in main
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
  File "C:\Users\dloperab\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow_estimator\python\estimator\training.py", line 471, in train_and_evaluate
    return executor.run()
  File "C:\Users\dloperab\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow_estimator\python\estimator\training.py", line 611, in run
    return self.run_local()
  File "C:\Users\dloperab\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow_estimator\python\estimator\training.py", line 712, in run_local
    saving_listeners=saving_listeners)
  File "C:\Users\dloperab\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 358, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "C:\Users\dloperab\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1124, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "C:\Users\dloperab\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1151, in _train_model_default
    input_fn, model_fn_lib.ModeKeys.TRAIN))
  File "C:\Users\dloperab\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 992, in _get_features_and_labels_from_input_fn
    self._call_input_fn(input_fn, mode))
  File "C:\Users\dloperab\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1079, in _call_input_fn
    return input_fn(**kwargs)
  File "C:\tensorflow\models\research\object_detection\inputs.py", line 446, in _train_input_fn
    params=params)
  File "C:\tensorflow\models\research\object_detection\inputs.py", line 512, in train_input
    model_config, is_training=True).preprocess
  File "C:\tensorflow\models\research\object_detection\builders\model_builder.py", line 135, in build
    add_summaries)
  File "C:\tensorflow\models\research\object_detection\builders\model_builder.py", line 518, in _build_faster_rcnn_model
    ) = post_processing_builder.build(frcnn_config.second_stage_post_processing)
  File "C:\tensorflow\models\research\object_detection\builders\post_processing_builder.py", line 59, in build
    post_processing_config.batch_non_max_suppression)
  File "C:\tensorflow\models\research\object_detection\builders\post_processing_builder.py", line 95, in _build_non_max_suppressor
    use_class_agnostic_nms=nms_config.use_class_agnostic_nms,
AttributeError: use_class_agnostic_nms

Note: I was able to train a few days ago, and I don't know what changed. I can still run detections with both pre-trained models and custom-trained models.

Thank you!

oldshuren commented 5 years ago

I think the protobuf definition changed. You need to run protobuf compilation again.
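For anyone landing here with the same error, recompiling usually means running `protoc` over `object_detection/protos/*.proto` from the `models/research` directory with `--python_out=.`. Below is a minimal sketch of that step in Python. It assumes `protoc` is on your PATH and that your checkout has the usual layout; the helper names `build_protoc_command` and `recompile_protos` are made up for this example and are not part of the Object Detection API. Expanding the wildcard in Python may also help on Windows shells that do not expand `*.proto` themselves.

```python
import glob
import os
import subprocess


def build_protoc_command(research_dir):
    """Return the protoc argv that regenerates the *_pb2.py files.

    research_dir is assumed to be the models/research directory of the checkout.
    """
    pattern = os.path.join(research_dir, "object_detection", "protos", "*.proto")
    proto_files = sorted(glob.glob(pattern))
    if not proto_files:
        raise FileNotFoundError("no .proto files match " + pattern)
    # protoc is run from research_dir, so pass paths relative to it,
    # using forward slashes on every platform.
    relative = [
        os.path.relpath(path, research_dir).replace(os.sep, "/")
        for path in proto_files
    ]
    return ["protoc", "--python_out=.", *relative]


def recompile_protos(research_dir):
    """Run protoc from research_dir; raises CalledProcessError if protoc fails."""
    subprocess.run(build_protoc_command(research_dir), cwd=research_dir, check=True)
```

With the paths from the traceback above, the call would be `recompile_protos(r"C:\tensorflow\models\research")`; adjust the path to your own checkout.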

dloperab commented 5 years ago

@oldshuren thank you so much, I recompiled the protobuf files and I am able to train again. How did you spot that?
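One rough way to spot this kind of mismatch (not necessarily how it was noticed here): an `AttributeError` on a config field such as `use_class_agnostic_nms` suggests the generated `*_pb2.py` modules are older than the `.proto` files they were built from, for example after pulling a newer version of the repository. The sketch below compares file modification times; `find_stale_pb2` is a hypothetical helper name, and modification times are only a heuristic, so treat the result as a hint rather than proof.

```python
import os


def find_stale_pb2(protos_dir):
    """List .proto files whose generated *_pb2.py is missing or older.

    protos_dir is assumed to be object_detection/protos inside the checkout.
    """
    stale = []
    for name in sorted(os.listdir(protos_dir)):
        if not name.endswith(".proto"):
            continue
        proto_path = os.path.join(protos_dir, name)
        pb2_name = name[: -len(".proto")] + "_pb2.py"
        pb2_path = os.path.join(protos_dir, pb2_name)
        # A missing generated file, or one older than its source, needs a rebuild.
        if not os.path.exists(pb2_path):
            stale.append(name)
        elif os.path.getmtime(pb2_path) < os.path.getmtime(proto_path):
            stale.append(name)
    return stale
```

If the returned list is non-empty, recompiling the protobuf files is a reasonable first thing to try.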