WARNING:tensorflow:From <ipython-input-19-652953e0a3d5>:22: shuffle_and_repeat (from tensorflow.contrib.data.python.ops.shuffle_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.experimental.shuffle_and_repeat(...)`.
WARNING:tensorflow:From C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_core\contrib\data\python\ops\shuffle_ops.py:54: shuffle_and_repeat (from tensorflow.python.data.experimental.ops.shuffle_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.shuffle(buffer_size, seed)` followed by `tf.data.Dataset.repeat(count)`. Static tf.data optimizations will take care of using the fused implementation.
WARNING:tensorflow:From <ipython-input-19-652953e0a3d5>:31: map_and_batch (from tensorflow.contrib.data.python.ops.batching) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.experimental.map_and_batch(...)`.
WARNING:tensorflow:From C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_core\contrib\data\python\ops\batching.py:276: map_and_batch (from tensorflow.python.data.experimental.ops.batching) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map(map_func, num_parallel_calls)` followed by `tf.data.Dataset.batch(batch_size, drop_remainder)`. Static tf.data optimizations will take care of using the fused implementation.
WARNING:tensorflow:From C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_core\python\autograph\converters\directives.py:119: The name tf.read_file is deprecated. Please use tf.io.read_file instead.
WARNING:tensorflow:From C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_core\python\autograph\converters\directives.py:119: The name tf.image.resize_images is deprecated. Please use tf.image.resize instead.
WARNING:tensorflow:From <ipython-input-19-652953e0a3d5>:13: calling string_split (from tensorflow.python.ops.ragged.ragged_string_ops) with delimiter is deprecated and will be removed in a future version.
Instructions for updating:
delimiter is deprecated, please use sep instead.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:batch_all_reduce: 8 all-reduces with algorithm = hierarchical_copy, num_packs = 1, agg_small_grads_max_bytes = 0 and agg_small_grads_max_group = 10
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Warm-starting with WarmStartSettings: WarmStartSettings(ckpt_to_initialize_from='C:\\Users\\STEVEN~1\\AppData\\Local\\Temp\\tmpd8genxgi\\keras\\keras_model.ckpt', vars_to_warm_start='.*', var_name_to_vocab_info={}, var_name_to_prev_var_name={})
INFO:tensorflow:Warm-starting from: C:\Users\STEVEN~1\AppData\Local\Temp\tmpd8genxgi\keras\keras_model.ckpt
INFO:tensorflow:Warm-starting variables only in TRAINABLE_VARIABLES.
INFO:tensorflow:Warm-started 28 variables.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into C:\Users\STEVEN~1\AppData\Local\Temp\tmpd8genxgi\model.ckpt.
---------------------------------------------------------------------------
NotFoundError Traceback (most recent call last)
~\.conda\envs\2020\lib\site-packages\tensorflow_core\python\client\session.py in _do_call(self, fn, *args)
1364 try:
-> 1365 return fn(*args)
1366 except errors.OpError as e:
~\.conda\envs\2020\lib\site-packages\tensorflow_core\python\client\session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
1349 return self._call_tf_sessionrun(options, feed_dict, fetch_list,
-> 1350 target_list, run_metadata)
1351
~\.conda\envs\2020\lib\site-packages\tensorflow_core\python\client\session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
1442 fetch_list, target_list,
-> 1443 run_metadata)
1444
NotFoundError: 2 root error(s) found.
(0) Not found: FindFirstFile failed for: OCT2017/train : The system cannot find the path specified.
; No such process
[[{{node list_files/MatchingFiles}}]]
[[MultiDeviceIteratorInit/_801]]
(1) Not found: FindFirstFile failed for: OCT2017/train : The system cannot find the path specified.
; No such process
[[{{node list_files/MatchingFiles}}]]
0 successful operations.
1 derived errors ignored.
During handling of the above exception, another exception occurred:
NotFoundError Traceback (most recent call last)
<ipython-input-26-e9b94ace2029> in <module>
11 num_epochs=EPOCHS,
12 prefetch_buffer_size=4),
---> 13 hooks=[time_hist])
~\.conda\envs\2020\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py in train(self, input_fn, hooks, steps, max_steps, saving_listeners)
368
369 saving_listeners = _check_listeners_type(saving_listeners)
--> 370 loss = self._train_model(input_fn, hooks, saving_listeners)
371 logging.info('Loss for final step: %s.', loss)
372 return self
~\.conda\envs\2020\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py in _train_model(self, input_fn, hooks, saving_listeners)
1157 def _train_model(self, input_fn, hooks, saving_listeners):
1158 if self._train_distribution:
-> 1159 return self._train_model_distributed(input_fn, hooks, saving_listeners)
1160 else:
1161 return self._train_model_default(input_fn, hooks, saving_listeners)
~\.conda\envs\2020\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py in _train_model_distributed(self, input_fn, hooks, saving_listeners)
1220 self._config._train_distribute.configure(self._config.session_config)
1221 return self._actual_train_model_distributed(
-> 1222 self._config._train_distribute, input_fn, hooks, saving_listeners)
1223 # pylint: enable=protected-access
1224
~\.conda\envs\2020\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py in _actual_train_model_distributed(self, strategy, input_fn, hooks, saving_listeners)
1331 return self._train_with_estimator_spec(estimator_spec, worker_hooks,
1332 hooks, global_step_tensor,
-> 1333 saving_listeners)
1334
1335 def _train_with_estimator_spec_distributed(self, estimator_spec, worker_hooks,
~\.conda\envs\2020\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py in _train_with_estimator_spec(self, estimator_spec, worker_hooks, hooks, global_step_tensor, saving_listeners)
1488 config=self._session_config,
1489 max_wait_secs=self._config.session_creation_timeout_secs,
-> 1490 log_step_count_steps=log_step_count_steps) as mon_sess:
1491 loss = None
1492 any_step_done = False
~\.conda\envs\2020\lib\site-packages\tensorflow_core\python\training\monitored_session.py in MonitoredTrainingSession(master, is_chief, checkpoint_dir, scaffold, hooks, chief_only_hooks, save_checkpoint_secs, save_summaries_steps, save_summaries_secs, config, stop_grace_period_secs, log_step_count_steps, max_wait_secs, save_checkpoint_steps, summary_dir)
582 session_creator=session_creator,
583 hooks=all_hooks,
--> 584 stop_grace_period_secs=stop_grace_period_secs)
585
586
~\.conda\envs\2020\lib\site-packages\tensorflow_core\python\training\monitored_session.py in __init__(self, session_creator, hooks, stop_grace_period_secs)
1012 hooks,
1013 should_recover=True,
-> 1014 stop_grace_period_secs=stop_grace_period_secs)
1015
1016
~\.conda\envs\2020\lib\site-packages\tensorflow_core\python\training\monitored_session.py in __init__(self, session_creator, hooks, should_recover, stop_grace_period_secs)
723 stop_grace_period_secs=stop_grace_period_secs)
724 if should_recover:
--> 725 self._sess = _RecoverableSession(self._coordinated_creator)
726 else:
727 self._sess = self._coordinated_creator.create_session()
~\.conda\envs\2020\lib\site-packages\tensorflow_core\python\training\monitored_session.py in __init__(self, sess_creator)
1205 """
1206 self._sess_creator = sess_creator
-> 1207 _WrappedSession.__init__(self, self._create_session())
1208
1209 def _create_session(self):
~\.conda\envs\2020\lib\site-packages\tensorflow_core\python\training\monitored_session.py in _create_session(self)
1210 while True:
1211 try:
-> 1212 return self._sess_creator.create_session()
1213 except _PREEMPTION_ERRORS as e:
1214 logging.info(
~\.conda\envs\2020\lib\site-packages\tensorflow_core\python\training\monitored_session.py in create_session(self)
883 # Inform the hooks that a new session has been created.
884 for hook in self._hooks:
--> 885 hook.after_create_session(self.tf_sess, self.coord)
886 return _CoordinatedSession(
887 _HookedSession(self.tf_sess, self._hooks), self.coord,
~\.conda\envs\2020\lib\site-packages\tensorflow_estimator\python\estimator\util.py in after_create_session(***failed resolving arguments***)
102 def after_create_session(self, session, coord):
103 del coord
--> 104 session.run(self._initializer)
105
106
~\.conda\envs\2020\lib\site-packages\tensorflow_core\python\client\session.py in run(self, fetches, feed_dict, options, run_metadata)
954 try:
955 result = self._run(None, fetches, feed_dict, options_ptr,
--> 956 run_metadata_ptr)
957 if run_metadata:
958 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)
~\.conda\envs\2020\lib\site-packages\tensorflow_core\python\client\session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
1178 if final_fetches or final_targets or (handle and feed_dict_tensor):
1179 results = self._do_run(handle, final_targets, final_fetches,
-> 1180 feed_dict_tensor, options, run_metadata)
1181 else:
1182 results = []
~\.conda\envs\2020\lib\site-packages\tensorflow_core\python\client\session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
1357 if handle is None:
1358 return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1359 run_metadata)
1360 else:
1361 return self._do_call(_prun_fn, handle, feeds, fetches)
~\.conda\envs\2020\lib\site-packages\tensorflow_core\python\client\session.py in _do_call(self, fn, *args)
1382 '\nsession_config.graph_options.rewrite_options.'
1383 'disable_meta_optimizer = True')
-> 1384 raise type(e)(node_def, op, message)
1385
1386 def _extend_graph(self):
NotFoundError: 2 root error(s) found.
(0) Not found: FindFirstFile failed for: OCT2017/train : The system cannot find the path specified.
; No such process
[[node list_files/MatchingFiles (defined at C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
[[MultiDeviceIteratorInit/_801]]
(1) Not found: FindFirstFile failed for: OCT2017/train : The system cannot find the path specified.
; No such process
[[node list_files/MatchingFiles (defined at C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
0 successful operations.
1 derived errors ignored.
Original stack trace for 'list_files/MatchingFiles':
File "C:\Users\NN\.conda\envs\2020\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "C:\Users\NN\.conda\envs\2020\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\ipykernel_launcher.py", line 16, in <module>
app.launch_new_instance()
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\traitlets\config\application.py", line 664, in launch_instance
app.start()
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\ipykernel\kernelapp.py", line 612, in start
self.io_loop.start()
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tornado\platform\asyncio.py", line 199, in start
self.asyncio_loop.run_forever()
File "C:\Users\NN\.conda\envs\2020\lib\asyncio\base_events.py", line 442, in run_forever
self._run_once()
File "C:\Users\NN\.conda\envs\2020\lib\asyncio\base_events.py", line 1462, in _run_once
handle._run()
File "C:\Users\NN\.conda\envs\2020\lib\asyncio\events.py", line 145, in _run
self._callback(*self._args)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tornado\ioloop.py", line 688, in <lambda>
lambda f: self._run_callback(functools.partial(callback, future))
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tornado\ioloop.py", line 741, in _run_callback
ret = callback()
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tornado\gen.py", line 814, in inner
self.ctx_run(self.run)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tornado\gen.py", line 162, in _fake_ctx_run
return f(*args, **kw)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tornado\gen.py", line 775, in run
yielded = self.gen.send(value)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\ipykernel\kernelbase.py", line 381, in dispatch_queue
yield self.process_one()
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tornado\gen.py", line 250, in wrapper
runner = Runner(ctx_run, result, future, yielded)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tornado\gen.py", line 741, in __init__
self.ctx_run(self.run)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tornado\gen.py", line 162, in _fake_ctx_run
return f(*args, **kw)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tornado\gen.py", line 775, in run
yielded = self.gen.send(value)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\ipykernel\kernelbase.py", line 365, in process_one
yield gen.maybe_future(dispatch(*args))
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tornado\gen.py", line 234, in wrapper
yielded = ctx_run(next, result)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tornado\gen.py", line 162, in _fake_ctx_run
return f(*args, **kw)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\ipykernel\kernelbase.py", line 268, in dispatch_shell
yield gen.maybe_future(handler(stream, idents, msg))
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tornado\gen.py", line 234, in wrapper
yielded = ctx_run(next, result)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tornado\gen.py", line 162, in _fake_ctx_run
return f(*args, **kw)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\ipykernel\kernelbase.py", line 545, in execute_request
user_expressions, allow_stdin,
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tornado\gen.py", line 234, in wrapper
yielded = ctx_run(next, result)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tornado\gen.py", line 162, in _fake_ctx_run
return f(*args, **kw)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\ipykernel\ipkernel.py", line 306, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\ipykernel\zmqshell.py", line 536, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\IPython\core\interactiveshell.py", line 2867, in run_cell
raw_cell, store_history, silent, shell_futures)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\IPython\core\interactiveshell.py", line 2895, in _run_cell
return runner(coro)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\IPython\core\async_helpers.py", line 68, in _pseudo_sync_runner
coro.send(None)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\IPython\core\interactiveshell.py", line 3072, in run_cell_async
interactivity=interactivity, compiler=compiler, result=result)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\IPython\core\interactiveshell.py", line 3263, in run_ast_nodes
if (await self.run_code(code, result, async_=asy)):
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\IPython\core\interactiveshell.py", line 3343, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-26-e9b94ace2029>", line 13, in <module>
hooks=[time_hist])
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 370, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1159, in _train_model
return self._train_model_distributed(input_fn, hooks, saving_listeners)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1222, in _train_model_distributed
self._config._train_distribute, input_fn, hooks, saving_listeners)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1258, in _actual_train_model_distributed
input_fn, ModeKeys.TRAIN, strategy)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1012, in _get_iterator_from_input_fn
lambda input_context: self._call_input_fn(input_fn, mode,
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_core\python\distribute\distribute_lib.py", line 1050, in make_input_fn_iterator
input_fn, replication_mode)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_core\python\distribute\distribute_lib.py", line 577, in make_input_fn_iterator
input_fn, replication_mode=replication_mode)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_core\python\distribute\mirrored_strategy.py", line 552, in _make_input_fn_iterator
self._container_strategy())
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_core\python\distribute\input_lib.py", line 719, in __init__
result = input_fn(ctx)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1013, in <lambda>
input_context))
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1116, in _call_input_fn
return input_fn(**kwargs)
File "<ipython-input-26-e9b94ace2029>", line 12, in <lambda>
prefetch_buffer_size=4),
File "<ipython-input-19-652953e0a3d5>", line 19, in input_fn
dataset = tf.data.Dataset.list_files(file_pattern, shuffle=shuffle)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_core\python\data\ops\dataset_ops.py", line 1864, in list_files
return DatasetV1Adapter(DatasetV2.list_files(file_pattern, shuffle, seed))
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_core\python\data\ops\dataset_ops.py", line 833, in list_files
matching_files = gen_io_ops.matching_files(file_pattern)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_core\python\ops\gen_io_ops.py", line 464, in matching_files
"MatchingFiles", pattern=pattern, name=name)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "C:\Users\NN\.conda\envs\2020\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in __init__
self._traceback = tf_stack.extract_stack()
On a positive note, multiple GPUs are working.
I have updated the code as follows to try to resolve the error:
Code block 1:
import os
import time
#!pip install -q -U tensorflow-gpu
import numpy as np
import tensorflow as tf
from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

# Allow GPU memory to grow on demand (the env var is set before any session is
# created so it actually takes effect) and log which device each op is placed on.
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'
config = ConfigProto()
config.gpu_options.allow_growth = True
config.log_device_placement = True
sess = tf.Session(config=config)
and every call to tf.contrib.distribute.MirroredStrategy(num_gpus=NUM_GPUS) has been replaced with tf.contrib.distribute.MirroredStrategy(num_gpus=NUM_GPUS, cross_device_ops=tf.distribute.HierarchicalCopyAllReduce()) (see the sketch below).
I have also updated the batch size to see if that resolves it, but none of these updates work.
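For reference, here is a minimal sketch of how that strategy object would be wired into the estimator. The RunConfig / model_to_estimator wiring and the NUM_GPUS value are assumptions inferred from the traceback and the warm-start logs, not the notebook's exact code:

import tensorflow as tf

NUM_GPUS = 2  # assumption: set to the number of GPUs actually available
strategy = tf.contrib.distribute.MirroredStrategy(
    num_gpus=NUM_GPUS,
    cross_device_ops=tf.distribute.HierarchicalCopyAllReduce())
run_config = tf.estimator.RunConfig(train_distribute=strategy)
# The Keras model is then converted to an estimator with this config, e.g.
# estimator = tf.keras.estimator.model_to_estimator(keras_model=model, config=run_config)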
At code block 26 I still get the error shown at the top of this post.
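The NotFoundError itself comes from tf.data.Dataset.list_files trying to resolve the relative pattern OCT2017/train against the kernel's working directory. A quick sanity check along these lines (the path string is copied from the error message; everything else is illustrative) shows whether that folder is actually visible from where the notebook runs:

import os

data_dir = 'OCT2017/train'      # pattern taken from the error message above
print(os.getcwd())              # directory that relative paths are resolved against
print(os.path.isdir(data_dir))  # False here would explain the FindFirstFile failure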
My environment setup:
CUDA Detect Output: