keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

FailedPreconditionError: lack of initialization of Keras variables when using with TensorFlow #5427

Closed dumkar closed 7 years ago

dumkar commented 7 years ago

I am new to Keras and just installed it (with pip3) to use with TensorFlow (1.0.0). I am trying to follow the Keras+TensorFlow tutorial.

When running the code, it stops at

train_step.run(feed_dict={img: batch[0], labels: batch[1]})

and throws the error below. I figured out it is because variables are not initialized and fixed it by inserting (see #4623):

keras.backend.get_session().run(tf.global_variables_initializer())

I decided to post it here since I was wondering if this is a general issue with Keras (as this is a rather simple example) regarding the update to TensorFlow 1.0.0 or something specific to my setup?
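A minimal sketch of the failure condition and the fix described above. It uses plain graph-mode TensorFlow through `tf.compat.v1` instead of the tutorial's Keras model, which is an assumption made so the snippet runs on newer TensorFlow releases; on TensorFlow 1.0 the same functions should be available directly under `tf`. The variable name `dense_1_W` is only borrowed from the traceback for illustration.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # graph-mode API; assumes a TensorFlow build that ships tf.compat.v1

graph = tf.Graph()
with graph.as_default():
    # Stand-in for a layer weight; the name is borrowed from the traceback.
    weight = tf1.get_variable("dense_1_W", shape=[4, 2],
                              initializer=tf1.zeros_initializer())
    check_op = tf1.report_uninitialized_variables()
    init_op = tf1.global_variables_initializer()

sess = tf1.Session(graph=graph)

# Before the initializer runs, the variable is reported as uninitialized.
# Reading it in this state is what triggers FailedPreconditionError.
names_before = [name.decode() for name in sess.run(check_op)]

# The fix: run the initializer in the same session that will use the variable.
sess.run(init_op)

names_after = [name.decode() for name in sess.run(check_op)]
weight_value = sess.run(weight)
sess.close()
```

The point to note is that the initializer has to run in the session that later executes the training step; with Keras that session is the one returned by `keras.backend.get_session()`.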

The error:


FailedPreconditionError                   Traceback (most recent call last)
C:\Users\dumon\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
   1021     try:
-> 1022       return fn(*args)
   1023     except errors.OpError as e:

C:\Users\dumon\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _run_fn(session, feed_dict, fetch_list, target_list, options, run_metadata)
   1003           feed_dict, fetch_list, target_list,
-> 1004           status, run_metadata)
   1005

C:\Users\dumon\Anaconda3\lib\contextlib.py in __exit__(self, type, value, traceback)
     65     try:
---> 66       next(self.gen)
     67     except StopIteration:

C:\Users\dumon\Anaconda3\lib\site-packages\tensorflow\python\framework\errors_impl.py in raise_exception_on_not_ok_status()
    468         compat.as_text(pywrap_tensorflow.TF_Message(status)),
--> 469         pywrap_tensorflow.TF_GetCode(status))
    470   finally:

FailedPreconditionError: Attempting to use uninitialized value dense_1_W
  [[Node: dense_1_W/read = Identity[T=DT_FLOAT, _class=["loc:@dense_1_W"], _device="/job:localhost/replica:0/task:0/cpu:0"](dense_1_W)]]

During handling of the above exception, another exception occurred:

FailedPreconditionError Traceback (most recent call last)

in ()
      9 batch= mnist_data.train.next_batch(50)
     10 train_step.run(feed_dict={img: batch[0],
---> 11     labels: batch[1]})

C:\Users\dumon\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py in run(self, feed_dict, session)
   1586     none, the default session will be used.
   1587     """
-> 1588     _run_using_default_session(self, feed_dict, self.graph, session)
   1589
   1590

C:\Users\dumon\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py in _run_using_default_session(operation, feed_dict, graph, session)
   3830         "the operation's graph is different from the session's "
   3831         "graph.")
-> 3832   session.run(operation, feed_dict)
   3833
   3834

C:\Users\dumon\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in run(self, fetches, feed_dict, options, run_metadata)
    765     try:
    766       result = self._run(None, fetches, feed_dict, options_ptr,
--> 767           run_metadata_ptr)
    768       if run_metadata:
    769         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

C:\Users\dumon\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
    963     if final_fetches or final_targets:
    964       results = self._do_run(handle, final_targets, final_fetches,
--> 965           feed_dict_string, options, run_metadata)
    966     else:
    967       results = []

C:\Users\dumon\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1013     if handle is None:
   1014       return self._do_call(_run_fn, self._session, feed_dict, fetch_list,
-> 1015           target_list, options, run_metadata)
   1016     else:
   1017       return self._do_call(_prun_fn, self._session, handle, feed_dict,

C:\Users\dumon\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
   1033         except KeyError:
   1034           pass
-> 1035       raise type(e)(node_def, op, message)
   1036
   1037   def _extend_graph(self):

FailedPreconditionError: Attempting to use uninitialized value dense_1_W
  [[Node: dense_1_W/read = Identity[T=DT_FLOAT, _class=["loc:@dense_1_W"], _device="/job:localhost/replica:0/task:0/cpu:0"](dense_1_W)]]

Caused by op 'dense_1_W/read', defined at:
  File "C:\Users\dumon\Anaconda3\lib\runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\dumon\Anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\ipykernel\__main__.py", line 3, in
    app.launch_new_instance()
  File "C:\Users\dumon\Anaconda3\lib\site-packages\traitlets\config\application.py", line 653, in launch_instance
    app.start()
  File "C:\Users\dumon\Anaconda3\lib\site-packages\ipykernel\kernelapp.py", line 474, in start
    ioloop.IOLoop.instance().start()
  File "C:\Users\dumon\Anaconda3\lib\site-packages\zmq\eventloop\ioloop.py", line 162, in start
    super(ZMQIOLoop, self).start()
  File "C:\Users\dumon\Anaconda3\lib\site-packages\tornado\ioloop.py", line 887, in start
    handler_func(fd_obj, events)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\tornado\stack_context.py", line 275, in null_wrapper
    return fn(*args, **kwargs)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 440, in _handle_events
    self._handle_recv()
  File "C:\Users\dumon\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 472, in _handle_recv
    self._run_callback(callback, msg)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 414, in _run_callback
    callback(*args, **kwargs)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\tornado\stack_context.py", line 275, in null_wrapper
    return fn(*args, **kwargs)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 276, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 228, in dispatch_shell
    handler(stream, idents, msg)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 390, in execute_request
    user_expressions, allow_stdin)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\ipykernel\ipkernel.py", line 196, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\ipykernel\zmqshell.py", line 501, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2717, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2821, in run_ast_nodes
    if self.run_code(code, result):
  File "C:\Users\dumon\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 4, in
    x = Dense(128, activation='relu')(img) # fully-connected layer with 128 units and ReLU activation
  File "C:\Users\dumon\Anaconda3\lib\site-packages\keras\engine\topology.py", line 546, in __call__
    self.build(input_shapes[0])
  File "C:\Users\dumon\Anaconda3\lib\site-packages\keras\layers\core.py", line 798, in build
    constraint=self.W_constraint)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\keras\engine\topology.py", line 418, in add_weight
    weight = initializer(shape, name=name)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\keras\initializations.py", line 66, in glorot_uniform
    return uniform(shape, s, name=name)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\keras\initializations.py", line 33, in uniform
    return K.random_uniform_variable(shape, -scale, scale, name=name)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py", line 635, in random_uniform_variable
    return variable(value, dtype=dtype, name=name)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py", line 259, in variable
    v = tf.Variable(value, dtype=_convert_string_dtype(dtype), name=name)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\tensorflow\python\ops\variables.py", line 226, in __init__
    expected_shape=expected_shape)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\tensorflow\python\ops\variables.py", line 344, in _init_from_args
    self._snapshot = array_ops.identity(self._variable, name="read")
  File "C:\Users\dumon\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 1490, in identity
    result = _op_def_lib.apply_op("Identity", input=input, name=name)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 2395, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "C:\Users\dumon\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1264, in __init__
    self._traceback = _extract_stack()

FailedPreconditionError (see above for traceback): Attempting to use uninitialized value dense_1_W
  [[Node: dense_1_W/read = Identity[T=DT_FLOAT, _class=["loc:@dense_1_W"], _device="/job:localhost/replica:0/task:0/cpu:0"](dense_1_W)]]

stale[bot] commented 7 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs, but feel free to re-open it if needed.

DavidWilt commented 7 years ago

I am as of July 11, 2017, experiencing this bug. It first occurred when I was attempting to train a model, but it now arises when the workaround code above is executed: keras.backend.get_session().run(tf.global_variables_initializer())

Error message below. What else do you need?

Thanks.

tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value dense_1/Variable
  [[Node: dense_1/Variable/read = Identity[T=DT_FLOAT, _class=["loc:@dense_1/Variable"], _device="/job:localhost/replica:0/task:0/cpu:0"](dense_1/Variable)]]

Caused by op u'dense_1/Variable/read', defined at:
  File "/Applications/WingIDE.app/Contents/Resources/bin/wingdb.py", line 978, in
    main()
  File "/Applications/WingIDE.app/Contents/Resources/bin/wingdb.py", line 918, in main
    netserver.abstract.kFileSystemEncoding, orig_sys_path)
  File "/Applications/WingIDE.app/Contents/Resources/bin/wingdb.py", line 766, in DebugFile
    exit_code = server.Run(filename, sys.argv)

ylmeng commented 6 years ago

I have the same problem. I have a custom layer which works fine in some models, but fails with this message (similar to above) in other models. Totally annoying.

why2face commented 6 years ago

I got the same error when using tensorflow-gpu as the Keras backend, although it worked well on the CPU before. How could this make a difference?

AniketDhar commented 6 years ago

I am facing the same issue while training a Keras model with custom kernel initializers. It also happens if I add BatchNormalization to the model. I already tried running tf.global_variables_initializer() before fit, but that did not help. Any suggestions or workarounds?

drsxr commented 6 years ago

Also having the same problem; it happens in the BatchNorm layer. I took one version of the code and ran it on GPU:0 with no problem. I then copied the code, ran it on GPU:1 with a few hyperparameters changed (learning rate, number of epochs), and got a FailedPreconditionError. It is very inconsistent, but once it happens in one of my Jupyter notebooks, it seems reproducible there. Using Keras 2.1.3 and TF 1.8.

alexattia commented 6 years ago

I have the same issue, any suggestions?

evertonaleixo commented 6 years ago

I initialize the variables with the following code, and it works for me:

K.set_session(tf.Session(graph=model.output.graph))
init = K.tf.global_variables_initializer()
K.get_session().run(init)

Here K comes from 'from keras import backend as K', tf comes from 'import tensorflow as tf', and 'model' is my Keras model. I add this code after compiling the model.
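One caveat with running `global_variables_initializer()` after a model is built: it re-runs the initializer of every global variable, so weights that were already trained or loaded are reset. A hedged sketch of an alternative that initializes only the variables still reported as uninitialized follows. `initialize_uninitialized` is a hypothetical helper written for this illustration, not part of Keras or TensorFlow, and the code assumes a TensorFlow build that provides `tf.compat.v1`.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # graph-mode API; assumes tf.compat.v1 is available


def initialize_uninitialized(sess):
    """Hypothetical helper: initialize only variables that are not yet initialized."""
    with sess.graph.as_default():
        variables = tf1.global_variables()
        flags = sess.run([tf1.is_variable_initialized(v) for v in variables])
        pending = [v for v, ready in zip(variables, flags) if not ready]
        if pending:
            sess.run(tf1.variables_initializer(pending))
    return [v.op.name for v in pending]


graph = tf.Graph()
with graph.as_default():
    sess = tf1.Session()

    # Pretend this variable holds weights that were already loaded.
    trained = tf1.get_variable("trained", shape=[2],
                               initializer=tf1.ones_initializer())
    sess.run(trained.initializer)
    sess.run(trained.assign([5.0, 6.0]))

    # A variable added later that nobody has initialized yet.
    fresh = tf1.get_variable("fresh", shape=[2],
                             initializer=tf1.zeros_initializer())

    newly_initialized = initialize_uninitialized(sess)
    trained_value = sess.run(trained).tolist()
    fresh_value = sess.run(fresh).tolist()
```

With a Keras model, the session passed in would be the one from `K.get_session()`, so that the check runs against the graph the model lives in.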

YTSKanuri commented 6 years ago

The only solution that worked for me when using a notebook is:

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    hist = model.fit_generator(
        train_datagen, steps_per_epoch=STEPS, epochs=EPOCHS, verbose=1,
        validation_data=(x_valid, y_valid),
        callbacks=callbacks_list)

nateGeorge commented 5 years ago

For me, I had to use local_variables_initializer(); global_variables_initializer() wouldn't work.

sess = tf.Session()
sess.run(tf.local_variables_initializer())
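
A small sketch of why the local initializer can be needed: variables placed in the `LOCAL_VARIABLES` collection (as the accumulators created by the TF 1.x `tf.metrics` functions are) are not touched by `global_variables_initializer()`. The variable names below are made up for illustration, and the code assumes a TensorFlow build that provides `tf.compat.v1`.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # graph-mode API; assumes tf.compat.v1 is available

graph = tf.Graph()
with graph.as_default():
    # An ordinary (global) variable, like a layer weight.
    weight = tf1.get_variable("weight", shape=[2],
                              initializer=tf1.zeros_initializer())
    # Hypothetical counter registered only as a local variable.
    counter = tf1.get_variable("counter", shape=[],
                               initializer=tf1.zeros_initializer(),
                               trainable=False,
                               collections=[tf1.GraphKeys.LOCAL_VARIABLES])
    report = tf1.report_uninitialized_variables()

    sess = tf1.Session()

    # Global initializer covers `weight` but leaves `counter` untouched.
    sess.run(tf1.global_variables_initializer())
    after_global = [name.decode() for name in sess.run(report)]

    # Local initializer covers the remaining variable.
    sess.run(tf1.local_variables_initializer())
    after_local = [name.decode() for name in sess.run(report)]
    sess.close()
```

In practice both initializers are often run together when a graph contains metric ops.
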
novioleo commented 5 years ago

Same error in the latest Keras.

novioleo commented 5 years ago

I fixed it by giving each session a different graph. If there are multiple models in the same project, use the TensorFlow default graph to create a new session for the Keras model, and a completely new graph for the TensorFlow model.

edufonseca commented 5 years ago

@novioleo could you please share the code snippet that fixed this? thanks

novioleo commented 5 years ago

@novioleo could you please share the code snippet that fixed this? thanks

I'm not exactly sure, but you can use this as a reference:

import tensorflow as tf
from keras import backend as K

# A model from **Keras** should **always** use the default graph.
# A model from TensorFlow needs a completely new graph.
default_graph = tf.get_default_graph()
with default_graph.as_default():
    self.sess_1 = tf.Session(config=self.config)
    K.set_session(self.sess_1)
    with self.sess_1.as_default():
        self.model = modellib.MaskRCNN(mode="inference", model_dir=self.log_dir, config=InferenceConfig())
        self.model.load_weights(self.model_file, by_name=True)

# Assumption: the original snippet left `graph` undefined; a fresh graph is created here.
graph = tf.Graph()
with graph.as_default():
    self.x_ = tf.placeholder(tf.float32, [None, self.img_size])
    self.x_image = tf.reshape(self.x_, [-1, self.img_height, self.img_width, 3])
    self.enhanced = resnet(self.x_image)
    self.sess_2 = tf.Session(config=self.config)
    with self.sess_2.as_default():
        saver = tf.train.Saver()
        saver.restore(self.sess_2, "./path/to/your/model")

I extracted this from my own code, so there could be some errors; please fix them yourself. When you need to use a model to predict, just use with self.which_session_you_want_to_use:. I suggest wrapping each model in a class for easier management.
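
The idea of one class per model, each with its own graph and session, can be sketched as follows. `GraphModel` is a hypothetical toy wrapper (a single scalar weight) standing in for a real network, and the code assumes a TensorFlow build that provides `tf.compat.v1`; it is an illustration of the pattern, not the code from the comment above.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # graph-mode API; assumes tf.compat.v1 is available


class GraphModel:
    """Hypothetical wrapper that keeps one model in its own graph and session."""

    def __init__(self, scale):
        self.graph = tf.Graph()
        with self.graph.as_default():
            self.x = tf1.placeholder(tf.float32, shape=[None])
            self.w = tf1.get_variable(
                "w", shape=[], initializer=tf1.constant_initializer(scale))
            self.y = self.x * self.w
            init_op = tf1.global_variables_initializer()
        # The session is bound to this model's graph only.
        self.sess = tf1.Session(graph=self.graph)
        self.sess.run(init_op)

    def predict(self, values):
        # Always run through the session that owns this model's variables.
        return self.sess.run(self.y, feed_dict={self.x: values}).tolist()


model_a = GraphModel(scale=2.0)
model_b = GraphModel(scale=10.0)

pred_a = model_a.predict([1.0, 2.0])
pred_b = model_b.predict([1.0, 2.0])
```

Because each instance initializes its variables in its own session, neither model can see the other's uninitialized state, and both can use the same variable names without clashing.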

menon92 commented 4 years ago

This solved my issue:

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # do other tasks

LeeKeyu commented 3 years ago

I fixed it by giving each session a different graph. If there are multiple models in the same project, use the TensorFlow default graph to create a new session for the Keras model, and a completely new graph for the TensorFlow model.

My issue was also caused by using multiple models in one project (a TensorFlow model plus a Keras model). Thanks to @novioleo for the answer. My issue was solved by initializing the Keras model using a new session defined in the default graph:

default_graph = tf.get_default_graph()
with default_graph.as_default():
    self.sess_keras = tf.Session()
    global model
    model = Model()     # keras model

and use this new session during prediction with the keras model:

with self.sess_keras.as_default():
    test_logits = model.predict()