google / prettytensor

Pretty Tensor: Fluent Networks in TensorFlow

Issue with tensorflow variable names #20

Closed fftobiwan closed 8 years ago

fftobiwan commented 8 years ago

Hi,

I just wanted to try the introductory example and got the following error:

Traceback (most recent call last):
  File "testpt.py", line 15, in <module>
    softmax_classifier(CLASSES, labels=label_tensor, name="sm"))
  File "/home/.local/lib/python2.7/site-packages/prettytensor/pretty_tensor_class.py", line 2019, in method
    return func(input_layer, *args, **self.fill_kwargs(input_layer, kwargs))
  File "/home/.local/lib/python2.7/site-packages/prettytensor/pretty_tensor_loss_methods.py", line 401, in softmax_classifier
    init=weight_init, bias_init=bias_init)
  File "/home/.local/lib/python2.7/site-packages/prettytensor/pretty_tensor_class.py", line 1980, in method
    result = func(non_seq_layer, *args, **kwargs)
  File "/home/.local/lib/python2.7/site-packages/prettytensor/pretty_tensor_methods.py", line 333, in __call__
    dt=dtype)
  File "/home/.local/lib/python2.7/site-packages/prettytensor/pretty_tensor_class.py", line 1694, in variable
    collections=variable_collections)
  File "/home/code/ml/tensorflow/_python_build/tensorflow/python/ops/variable_scope.py", line 334, in get_variable
    collections=collections)
  File "/home/code/ml/tensorflow/_python_build/tensorflow/python/ops/variable_scope.py", line 257, in get_variable
    collections=collections, caching_device=caching_device)
  File "/home/code/ml/tensorflow/_python_build/tensorflow/python/ops/variable_scope.py", line 118, in get_variable
    name, "".join(traceback.format_list(tb))))
ValueError: Variable weights already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:

  File "/home/.local/lib/python2.7/site-packages/prettytensor/pretty_tensor_class.py", line 1694, in variable
    collections=variable_collections)
  File "/home/.local/lib/python2.7/site-packages/prettytensor/pretty_tensor_methods.py", line 333, in __call__
    dt=dtype)
  File "/home/.local/lib/python2.7/site-packages/prettytensor/pretty_tensor_class.py", line 1980, in method
    result = func(non_seq_layer, *args, **kwargs)

The code that I am trying to run is as follows:

import tensorflow as tf
import prettytensor as pt
import numpy as np

BATCH_SIZE = 100
DATA_SIZE = 28*28
CLASSES = 10

input_tensor = tf.placeholder(np.float32, shape=(BATCH_SIZE, DATA_SIZE))
label_tensor = tf.placeholder(np.float32, shape=(BATCH_SIZE, CLASSES))
pretty_input = pt.wrap(input_tensor)

softmax, loss = (pretty_input.
                     fully_connected(100, name="fc1").
                     softmax_classifier(CLASSES, labels=label_tensor, name="sm"))

It seems that variable tensors are not scoped properly. Am I doing something wrong?

Kind regards, Tobias

eiderman commented 8 years ago

With the latest version that I tried, I don't get that error. I can reproduce it if I put everything in a name scope and call it twice.

This is a limitation of how name_scope and variable_scope work together, and it is recommended to prefer variable_scope when you are creating sub-graphs with variables (which all the layers do).

If you are using an interactive shell like Jupyter, I find that using an explicit graph when playing around (e.g. `with tf.Graph().as_default():`) really helps cut down on errors caused by state left over from earlier runs.
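A minimal sketch of the explicit-graph suggestion, assuming a TensorFlow build that exposes the 1.x-style API as `tensorflow.compat.v1` (on the 0.7/0.8 releases discussed in this thread, plain `import tensorflow as tf` would be the equivalent). `build_layer` is a hypothetical stand-in for a layer and is not part of Pretty Tensor. It shows that building the same variable twice in one graph raises the error from the report, while a fresh graph per run does not.

```python
import tensorflow.compat.v1 as tf  # assumption: 1.x-style API via compat.v1


def build_layer():
    # Hypothetical stand-in for a layer: creates one variable named "weights".
    return tf.get_variable("weights", shape=[4, 2])


# Building twice in the same graph collides on the variable name.
collided = False
with tf.Graph().as_default():
    build_layer()
    try:
        build_layer()
    except ValueError:
        # "Variable weights already exists, disallowed. ..."
        collided = True

# Each fresh graph keeps its own variable store, so re-running is safe.
names = []
for _ in range(2):
    with tf.Graph().as_default():
        names.append(build_layer().name)

print(collided)
print(names)
```

In a notebook, wrapping each cell's model-building code in its own `with tf.Graph().as_default():` block has the same effect as the loop above: every run starts from an empty graph.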

fftobiwan commented 8 years ago

I think the problem was that the variable scope was not set properly in scopes.var_and_name_scope().

I applied the following change to get everything working:

diff --git a/prettytensor/scopes.py b/prettytensor/scopes.py
index 809b32a..d1a914e 100644
--- a/prettytensor/scopes.py
+++ b/prettytensor/scopes.py
@@ -69,7 +69,8 @@ def var_and_name_scope(names):
               initializer=old_vs.initializer)

         vs_key[0].name_scope = scope
-        yield scope, vs_key[0]
+        with tf.variable_scope(vs_key[0]):
+          yield scope, vs_key[0]
       finally:
         vs_key[0] = old_vs

Now the variables have properly scoped names like "fully_connected/weights" instead of just "weights".
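The naming difference can be checked in isolation. The following is only a sketch, assuming the 1.x-style API is available as `tensorflow.compat.v1`; it does not use Pretty Tensor and the scope name "fully_connected" is chosen just to mirror the names above. It creates a variable with `tf.get_variable` once under a name scope and once under a variable scope, then compares the resulting names.

```python
import tensorflow.compat.v1 as tf  # assumption: 1.x-style API via compat.v1

with tf.Graph().as_default():
    # A name scope prefixes ops, but tf.get_variable ignores it.
    with tf.name_scope("fully_connected"):
        in_name_scope = tf.get_variable("weights", shape=[4, 2])

    # A variable scope does prefix variables created by tf.get_variable.
    with tf.variable_scope("fully_connected"):
        in_var_scope = tf.get_variable("weights", shape=[4, 2])

name_scope_name = in_name_scope.name
var_scope_name = in_var_scope.name
print(name_scope_name)  # weights:0
print(var_scope_name)   # fully_connected/weights:0
```

Because only the variable scope contributes to the variable's name, two layers that are separated by name scopes alone both ask for a variable called "weights", which is the collision reported in the traceback.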

I am using TensorFlow 0.7.1

Wakeupbuddy commented 8 years ago

I had the same error when I rebased to 0.8.0. @fftobiwan's fix also works for me.

eiderman commented 8 years ago

Thanks for the update. It appears that a change has landed in TensorFlow, so I will upload a version with the fix.

eiderman commented 8 years ago

I've made the update -- this was related to changes in the semantics of tf.get_collection() and tf.get_collection_ref().

Please verify that it works, thanks again!

fftobiwan commented 8 years ago

It works for me, thank you very much.

ggardu commented 5 years ago

Zoe from Stack Overflow suggested calling tf.reset_default_graph() at the beginning of the script. Worked for me. Thanks.