tensorflow / models

Models and examples built with TensorFlow

Name conflict: tensorflow.contrib.slim vs tf_slim #8594

Open nobutoba opened 4 years ago

nobutoba commented 4 years ago

Prerequisites

1. The entire URL of the file you are using

https://github.com/tensorflow/models/blob/master/research/slim/README.md https://github.com/tensorflow/models/blob/master/research/slim/slim_walkthrough.ipynb

2. Describe the bug

3. Steps to reproduce

4. Expected behavior

5. Additional context

Full log for the 6th cell in the notebook, run with TensorFlow 1.15.3

```python
WARNING:tensorflow:From /home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow_core/python/ops/losses/losses_impl.py:121: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From :16: get_total_loss (from tf_slim.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.get_total_loss instead.
WARNING:tensorflow:From /home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tf_slim/losses/loss_ops.py:236: get_losses (from tf_slim.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.get_losses instead.
WARNING:tensorflow:From /home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tf_slim/losses/loss_ops.py:238: get_regularization_losses (from tf_slim.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.get_regularization_losses instead.
WARNING:tensorflow:From /home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tf_slim/learning.py:734: Supervisor.__init__ (from tensorflow.python.training.supervisor) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.MonitoredTrainingSession
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Starting Session.
INFO:tensorflow:Saving checkpoint to path /tmp/regression_model/model.ckpt
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Starting Queues.
INFO:tensorflow:global step 499: loss = 0.4413 (0.001 sec/step)
INFO:tensorflow:global step 999: loss = 0.2760 (0.001 sec/step)
INFO:tensorflow:global step 1499: loss = 0.2333 (0.001 sec/step)
INFO:tensorflow:global step 1999: loss = 0.2453 (0.001 sec/step)
INFO:tensorflow:global step 2499: loss = 0.1999 (0.001 sec/step)
INFO:tensorflow:global_step/sec: 547.203
INFO:tensorflow:global step 2999: loss = 0.1675 (0.001 sec/step)
INFO:tensorflow:global step 3499: loss = 0.1778 (0.001 sec/step)
INFO:tensorflow:global step 3999: loss = 0.2127 (0.001 sec/step)
INFO:tensorflow:global step 4499: loss = 0.1784 (0.001 sec/step)
INFO:tensorflow:global step 4999: loss = 0.1660 (0.001 sec/step)
INFO:tensorflow:Stopping Training.
INFO:tensorflow:Finished training! Saving model to disk.
Finished training. Last batch loss: 0.16604608
Checkpoint saved in /tmp/regression_model/
/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow_core/python/summary/writer/writer.py:386: UserWarning: Attempting to use a closed FileWriter. The operation will be a noop unless the FileWriter is explicitly reopened.
  warnings.warn("Attempting to use a closed FileWriter. "
```

Full log for the 9th cell in the notebook, run with TensorFlow 1.15.3

```python
WARNING:tensorflow:From :7: streaming_mean_squared_error (from tf_slim.metrics.metric_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.metrics.mean_squared_error. Note that the order of the labels and predictions arguments has been switched.
WARNING:tensorflow:From :8: streaming_mean_absolute_error (from tf_slim.metrics.metric_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.metrics.mean_absolute_error. Note that the order of the labels and predictions arguments has been switched.
INFO:tensorflow:Restoring parameters from /tmp/regression_model/model.ckpt
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Starting standard services.
INFO:tensorflow:Saving checkpoint to path /tmp/regression_model/model.ckpt
INFO:tensorflow:Starting queue runners.
INFO:tensorflow:Error reported to Coordinator: , 'module' object is not callable
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
in
     16     num_evals=1, # Single pass over data
     17     eval_op=names_to_update_nodes.values(),
---> 18     final_op=names_to_value_nodes.values())
     19
     20 names_to_values = dict(zip(names_to_value_nodes.keys(), metric_values))

TypeError: 'module' object is not callable
```

Full log for the 6th cell in the notebook, run with TensorFlow 2.2.0

```python
WARNING:tensorflow:From :16: get_total_loss (from tf_slim.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.get_total_loss instead.
WARNING:tensorflow:From /home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tf_slim/losses/loss_ops.py:236: get_losses (from tf_slim.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.get_losses instead.
WARNING:tensorflow:From /home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tf_slim/losses/loss_ops.py:238: get_regularization_losses (from tf_slim.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.get_regularization_losses instead.
WARNING:tensorflow:From /home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tf_slim/learning.py:734: Supervisor.__init__ (from tensorflow.python.training.supervisor) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.MonitoredTrainingSession
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Starting Session.
INFO:tensorflow:Saving checkpoint to path /tmp/regression_model/model.ckpt
INFO:tensorflow:Error reported to Coordinator: An op outside of the function building code is being passed a "Graph" tensor. It is possible to have Graph tensors leak out of the function building context by including a tf.init_scope in your function building code.
For example, the following function will fail:
  @tf.function
  def has_init_scope():
    my_constant = tf.constant(1.)
    with tf.init_scope():
      added = my_constant * 2
The graph tensor has name: global_step:0
Traceback (most recent call last):
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/gen_resource_variable_ops.py", line 470, in read_variable_op
    tld.op_callbacks, resource, "dtype", dtype)
tensorflow.python.eager.core._FallbackException: This function does not handle the case of the path where all inputs are not already EagerTensors.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/coordinator.py", line 297, in stop_on_exception
    yield
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/coordinator.py", line 485, in run
    self.start_loop()
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/supervisor.py", line 1077, in start_loop
    self._last_step = training_util.global_step(self._sess, self._step_counter)
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/training_util.py", line 67, in global_step
    return int(global_step_tensor.numpy())
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 603, in numpy
    return self.read_value().numpy()
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 666, in read_value
    value = self._read_variable_op()
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 645, in _read_variable_op
    result = read_and_set_handle()
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 636, in read_and_set_handle
    self._dtype)
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/gen_resource_variable_ops.py", line 475, in read_variable_op
    resource, dtype=dtype, name=name, ctx=_ctx)
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/gen_resource_variable_ops.py", line 502, in read_variable_op_eager_fallback
    attrs=_attrs, ctx=ctx, name=name)
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 75, in quick_execute
    raise e
  File "/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
TypeError: An op outside of the function building code is being passed a "Graph" tensor. It is possible to have Graph tensors leak out of the function building context by including a tf.init_scope in your function building code.
For example, the following function will fail:
  @tf.function
  def has_init_scope():
    my_constant = tf.constant(1.)
    with tf.init_scope():
      added = my_constant * 2
The graph tensor has name: global_step:0
INFO:tensorflow:Starting Queues.
INFO:tensorflow:Finished training! Saving model to disk.
/home/username/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/summary/writer/writer.py:388: UserWarning: Attempting to use a closed FileWriter. The operation will be a noop unless the FileWriter is explicitly reopened.
  warnings.warn("Attempting to use a closed FileWriter. "
---------------------------------------------------------------------------
_FallbackException                        Traceback (most recent call last)
~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/gen_resource_variable_ops.py in read_variable_op(resource, dtype, name)
    469         _ctx._context_handle, tld.device_name, "ReadVariableOp", name,
--> 470         tld.op_callbacks, resource, "dtype", dtype)
    471       return _result

_FallbackException: This function does not handle the case of the path where all inputs are not already EagerTensors.

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
in
     26     number_of_steps=5000,
     27     save_summaries_secs=5,
---> 28     log_every_n_steps=500)
     29
     30 print("Finished training. Last batch loss:", final_loss)

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tf_slim/learning.py in train(train_op, logdir, train_step_fn, train_step_kwargs, log_every_n_steps, graph, master, is_chief, global_step, number_of_steps, init_op, init_feed_dict, local_init_op, init_fn, ready_op, summary_op, save_summaries_secs, summary_writer, startup_delay_steps, saver, save_interval_secs, sync_optimizer, session_config, session_wrapper, trace_every_n_steps, ignore_live_threads)
    780             threads,
    781             close_summary_writer=True,
--> 782             ignore_live_threads=ignore_live_threads)
    783
    784     except errors.AbortedError:

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/supervisor.py in stop(self, threads, close_summary_writer, ignore_live_threads)
    837           threads,
    838           stop_grace_period_secs=self._stop_grace_secs,
--> 839           ignore_live_threads=ignore_live_threads)
    840     finally:
    841       # Close the writer last, in case one of the running threads was using it.

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/coordinator.py in join(self, threads, stop_grace_period_secs, ignore_live_threads)
    387       self._registered_threads = set()
    388       if self._exc_info_to_raise:
--> 389         six.reraise(*self._exc_info_to_raise)
    390       elif stragglers:
    391         if ignore_live_threads:

~/tensorflow-slim/.venv/lib/python3.7/site-packages/six.py in reraise(tp, value, tb)
    701             if value.__traceback__ is not tb:
    702                 raise value.with_traceback(tb)
--> 703             raise value
    704         finally:
    705             value = None

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/coordinator.py in stop_on_exception(self)
    295     """
    296     try:
--> 297       yield
    298     except:  # pylint: disable=bare-except
    299       self.request_stop(ex=sys.exc_info())

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/coordinator.py in run(self)
    483   def run(self):
    484     with self._coord.stop_on_exception():
--> 485       self.start_loop()
    486       if self._timer_interval_secs is None:
    487         # Call back-to-back.

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/supervisor.py in start_loop(self)
   1075   def start_loop(self):
   1076     self._last_time = time.time()
-> 1077     self._last_step = training_util.global_step(self._sess, self._step_counter)
   1078
   1079   def run_loop(self):

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/training/training_util.py in global_step(sess, global_step_tensor)
     65   """
     66   if context.executing_eagerly():
---> 67     return int(global_step_tensor.numpy())
     68   return int(sess.run(global_step_tensor))
     69

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py in numpy(self)
    601   def numpy(self):
    602     if context.executing_eagerly():
--> 603       return self.read_value().numpy()
    604     raise NotImplementedError(
    605         "numpy() is only available when eager execution is enabled.")

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py in read_value(self)
    664     """
    665     with ops.name_scope("Read"):
--> 666       value = self._read_variable_op()
    667     # Return an identity so it can get placed on whatever device the context
    668     # specifies instead of the device where the variable is.

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py in _read_variable_op(self)
    643         result = read_and_set_handle()
    644     else:
--> 645       result = read_and_set_handle()
    646
    647     if not context.executing_eagerly():

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py in read_and_set_handle()
    634     def read_and_set_handle():
    635       result = gen_resource_variable_ops.read_variable_op(self._handle,
--> 636                                                           self._dtype)
    637       _maybe_set_handle_data(self._dtype, self._handle, result)
    638       return result

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/gen_resource_variable_ops.py in read_variable_op(resource, dtype, name)
    473     try:
    474       return read_variable_op_eager_fallback(
--> 475           resource, dtype=dtype, name=name, ctx=_ctx)
    476     except _core._SymbolicException:
    477       pass  # Add nodes to the TensorFlow graph.

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/ops/gen_resource_variable_ops.py in read_variable_op_eager_fallback(resource, dtype, name, ctx)
    500   _attrs = ("dtype", dtype)
    501   _result = _execute.execute(b"ReadVariableOp", 1, inputs=_inputs_flat,
--> 502                              attrs=_attrs, ctx=ctx, name=name)
    503   if _execute.must_record_gradient():
    504     _execute.record_gradient(

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     73           "Inputs to eager execution function cannot be Keras symbolic "
     74           "tensors, but found {}".format(keras_symbolic_tensors))
---> 75     raise e
     76   # pylint: enable=protected-access
     77   return tensors

~/tensorflow-slim/.venv/lib/python3.7/site-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     58     ctx.ensure_initialized()
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60                                         inputs, attrs, num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:

TypeError: An op outside of the function building code is being passed a "Graph" tensor. It is possible to have Graph tensors leak out of the function building context by including a tf.init_scope in your function building code.
For example, the following function will fail:
  @tf.function
  def has_init_scope():
    my_constant = tf.constant(1.)
    with tf.init_scope():
      added = my_constant * 2
The graph tensor has name: global_step:0
```

6. System information

marksandler2 commented 4 years ago

Thanks for the report. The notebook does need to be fixed. Slim will probably never work in tensorflow 2.0 eager mode, only in graph mode. Thus, the command-line examples won't work as is. We should restore that header in the README.md with some caveats.
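
To illustrate the graph-mode caveat: TensorFlow 2.x runs eagerly by default, and eager execution can be switched off through the `tf.compat.v1` compatibility API before any graph or session is created. The helper below is only a sketch. `ensure_graph_mode` is a hypothetical name (not part of TensorFlow or TF-Slim), it takes the already imported TensorFlow module as its argument, and disabling eager execution may not be enough on its own to make the notebook run.

```python
def ensure_graph_mode(tf_module):
    """Put the given TensorFlow module into graph mode if it is running eagerly.

    Hypothetical helper, not part of TensorFlow or TF-Slim. Call it at program
    start, before any ops, graphs, or sessions have been created.
    Returns True when the module reports graph mode afterwards.
    """
    if tf_module.executing_eagerly():
        # TF 2.x defaults to eager execution; TF 1.x is already in graph mode.
        tf_module.compat.v1.disable_eager_execution()
    return not tf_module.executing_eagerly()


# Possible use at the top of a script or notebook (requires TensorFlow):
#   import tensorflow as tf
#   ensure_graph_mode(tf)
#   import tf_slim as slim
```

Passing the module in, rather than importing it inside the helper, keeps the sketch independent of which TensorFlow version happens to be installed.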

kyscg commented 4 years ago

What is the best way to fix this? Should we mention that [TensorFlow 2 might not be supported] in the README, or change all instances of `from tensorflow.contrib import slim` to `import tf_slim as slim`? Or maybe both?
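
One possible way to keep code working during such a migration is to try the standalone `tf_slim` package first and fall back to `tensorflow.contrib.slim` on older TensorFlow 1.x installs. This is a sketch only; `import_first_available` and `load_slim` are hypothetical helper names, not part of either library.

```python
import importlib


def import_first_available(candidates):
    """Return the first module in `candidates` that imports successfully.

    Hypothetical helper for illustration. Raises ImportError if none of the
    candidate module names can be imported.
    """
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # also covers ModuleNotFoundError
            errors.append("{}: {}".format(name, exc))
    raise ImportError("no candidate could be imported: " + "; ".join(errors))


def load_slim():
    """Prefer the standalone tf_slim package, else fall back to contrib."""
    return import_first_available(["tf_slim", "tensorflow.contrib.slim"])


# Possible use (requires one of the two packages to be installed):
#   slim = load_slim()
```

The fallback order is a design choice: putting `tf_slim` first means the same code path is used wherever the standalone package is installed.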

> Slim will probably never work in tensorflow 2.0 eager mode, only in graph mode

Is this being deprecated or is there any other reason it doesn't work?

marksandler2 commented 4 years ago

We will update the README.md and notebook shortly. Slim is mostly in maintenance mode, but full-on TF2 support basically requires a very thorough rewrite, and not all concepts of Slim map nicely to TF2.

