shekkizh / WassersteinGAN.tensorflow

Tensorflow implementation of Wasserstein GAN - arxiv: https://arxiv.org/abs/1701.07875
MIT License

tf.get_variable() error, variable does not exist or was not created #3

Open SimpleXP opened 7 years ago

SimpleXP commented 7 years ago

My TensorFlow version is 0.12.1.

When I run run_main.py, I get this error:

"ValueError: Variable discriminator/disc_bn1/discriminator_1/disc_bn1/cond/discriminator_1/disc_bn1/moments/moments_1/mean/ExponentialMovingAverage/biased does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?"

Does anyone have any idea?

davidz-zzz commented 7 years ago

Maybe you could add with tf.variable_scope(tf.get_variable_scope(), reuse=False): before the ema.apply call. See:

https://github.com/carpedm20/DCGAN-tensorflow/issues/59
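
For context, the error points at utils.batch_norm, where ema.apply is called inside a mean_var_with_update() helper that is passed to tf.cond. A minimal sketch of where the suggested line would go, with the surrounding structure reconstructed from the error message above and the traceback further down in this thread, not copied from the repo:

    def mean_var_with_update():
        # create the EMA shadow variables in a non-reusing scope, even when
        # the discriminator is being rebuilt with scope_reuse=True
        with tf.variable_scope(tf.get_variable_scope(), reuse=False):
            ema_apply_op = ema.apply([batch_mean, batch_var])
        with tf.control_dependencies([ema_apply_op]):
            return tf.identity(batch_mean), tf.identity(batch_var)

    mean, var = tf.cond(train_phase,
                        mean_var_with_update,
                        lambda: (ema.average(batch_mean), ema.average(batch_var)))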

loliverhennigh commented 7 years ago

This worked for me! (TensorFlow 1.0 alpha)

chulaihunde commented 7 years ago

This did not work for me! (TensorFlow 1.0 nightly)

Traceback (most recent call last):
  File "main.py", line 55, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "main.py", line 44, in main
    FLAGS.optimizer_param)
  File "/home/long/MyCode2/WassersteinGAN.tensorflow-master/models/GAN_models.py", line 197, in create_network
    scope_reuse=True)
  File "/home/long/MyCode2/WassersteinGAN.tensorflow-master/models/GAN_models.py", line 118, in _discriminator
    h_bn = utils.batch_norm(h_conv, dims[index + 1], train_phase, scope="disc_bn%d" % index)
  File "/home/long/MyCode2/WassersteinGAN.tensorflow-master/utils.py", line 145, in batch_norm
    lambda: (ema.average(batch_mean), ema.average(batch_var)))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 1741, in cond
    orig_res, res_t = context_t.BuildCondBranch(fn1)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 1642, in BuildCondBranch
    r = fn()
  File "/home/long/MyCode2/WassersteinGAN.tensorflow-master/utils.py", line 139, in mean_var_with_update
    ema_apply_op = ema.apply([batch_mean, batch_var])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/moving_averages.py", line 375, in apply
    colocate_with_primary=(var.op.type in ["Variable", "VariableV2"]))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/slot_creator.py", line 135, in create_zeros_slot
    colocate_with_primary=colocate_with_primary)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/slot_creator.py", line 112, in create_slot
    return _create_slot_var(primary, val, "")
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/slot_creator.py", line 64, in _create_slot_var
    validate_shape=val.get_shape().is_fully_defined())
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 1033, in get_variable
    use_resource=use_resource, custom_getter=custom_getter)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 932, in get_variable
    use_resource=use_resource, custom_getter=custom_getter)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 356, in get_variable
    validate_shape=validate_shape, use_resource=use_resource)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 341, in _true_getter
    use_resource=use_resource)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 671, in _get_single_variable
    "VarScope?" % name)
ValueError: Variable discriminator/disc_bn1/discriminator_1/disc_bn1/moments/moments_1/mean/ExponentialMovingAverage/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?

kunrenzhilu commented 7 years ago

It took me two days to figure out a workaround, and I still ended up failing. The reason seems to be that although the fake discriminator sets scope_reuse to True, tf.cond() creates a new control-flow context every time, so get_variable() cannot retrieve the corresponding variables from the real discriminator and throws a ValueError about a path like .../discriminator_1/disc_bn1/.... As far as I understand, there shouldn't be a nested scope .../discriminator_1 or a nested .../disc_bn1 at all; tell me if I am wrong. Anyway, I could not make this work by changing the original code. My workaround was to switch to tf.contrib.layers.batch_norm(), which does it in one statement.
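
For anyone wanting to try that route, a minimal sketch of the swap inside _discriminator in GAN_models.py (the original call is taken from the traceback above; the replacement's arguments are assumptions, not the repo's exact code):

    # before: the custom EMA-based helper that trips over variable reuse
    # h_bn = utils.batch_norm(h_conv, dims[index + 1], train_phase,
    #                         scope="disc_bn%d" % index)

    # after: the built-in layer manages its own moving averages; note that with
    # the default updates_collections the update ops land in
    # tf.GraphKeys.UPDATE_OPS, so updates_collections=None keeps them in-place
    h_bn = tf.contrib.layers.batch_norm(h_conv, is_training=train_phase,
                                        updates_collections=None,
                                        scope="disc_bn%d" % index)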

lengoanhcat commented 7 years ago

@kunrenzhilu: could you be more specific about how you used tf.contrib.layers.batch_norm()? I am struggling with the same problem stated above.

bottlecapper commented 7 years ago

I have the same problem. After adding with tf.variable_scope(tf.get_variable_scope(), reuse=False): before ema.apply, another problem appears at model.initialize_network(FLAGS.logs_dir):

Traceback (most recent call last):
  File "/home/jg/miniconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1021, in _do_call
    return fn(*args)
  File "/home/jg/miniconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1003, in _run_fn
    status, run_metadata)
  File "/home/jg/miniconda3/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/home/jg/miniconda3/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'Placeholder' with dtype bool
     [[Node: Placeholder = Placeholder[dtype=DT_BOOL, shape=[], _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/media/jg/F/20170514/main.py", line 54, in <module>
    tf.app.run()
  File "/home/jg/miniconda3/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "/media/jg/F/20170514/main.py", line 45, in main
    model.initialize_network(FLAGS.logs_dir)
  File "/media/jg/F/20170514/models/GAN_models.py", line 225, in initialize_network
    self.sess.run(tf.global_variables_initializer())
  File "/home/jg/miniconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 766, in run
    run_metadata_ptr)
  File "/home/jg/miniconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 964, in _run
    feed_dict_string, options, run_metadata)
  File "/home/jg/miniconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1014, in _do_run
    target_list, options, run_metadata)
  File "/home/jg/miniconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1034, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'Placeholder' with dtype bool
     [[Node: Placeholder = Placeholder[dtype=DT_BOOL, shape=[], _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

Caused by op 'Placeholder', defined at:
  File "/media/jg/F/20170514/main.py", line 54, in <module>
    tf.app.run()
  File "/home/jg/miniconda3/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "/media/jg/F/20170514/main.py", line 43, in main
    FLAGS.optimizer_param)
  File "/media/jg/F/20170514/models/GAN_models.py", line 173, in create_network
    self._setup_placeholder()
  File "/media/jg/F/20170514/models/GAN_models.py", line 149, in _setup_placeholder
    self.train_phase = tf.placeholder(tf.bool)
  File "/home/jg/miniconda3/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py", line 1587, in placeholder
    name=name)
  File "/home/jg/miniconda3/lib/python3.5/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2043, in _placeholder
    name=name)
  File "/home/jg/miniconda3/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
    op_def=op_def)
  File "/home/jg/miniconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/jg/miniconda3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'Placeholder' with dtype bool
     [[Node: Placeholder = Placeholder[dtype=DT_BOOL, shape=[], _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

Process finished with exit code 1
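
A hedged guess at what happens here: with the reuse=False wrapper, the EMA shadow variables are created inside the tf.cond branch, so their initializer ops are gated on the train_phase predicate, and tf.global_variables_initializer() then cannot run without that placeholder. One thing to try (not a verified fix) is to feed it when initializing, e.g. in initialize_network:

    # self.train_phase is the tf.placeholder(tf.bool) created in
    # _setup_placeholder (see the traceback above); feeding it lets the
    # cond-gated initializer ops run
    self.sess.run(tf.global_variables_initializer(),
                  feed_dict={self.train_phase: True})
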
AshishBora commented 7 years ago

Check this out. It seems to have fixed the problems for me. https://github.com/AshishBora/WassersteinGAN.tensorflow/commit/1c6cfa1c20959e9dcca01f0a96f7ca8c54403d1a

UPDATE: After training for 8+ hours with this change, the GAN does not seem to learn anything and the loss ranges (for d_loss and g_loss) are way off.

UPDATE 2: I trained with this commit and TF v1.1.0. It seems to have learned to produce faces.

kinsumliu commented 7 years ago

@AshishBora Hi, could you report the numbers you get for the generator and discriminator losses? I am training a WGAN on MNIST images and I see a generator loss of ~200 and a discriminator loss of ~0.003 in the first hour.

RyanHangZhou commented 7 years ago

@kunrenzhilu could you give a concrete solution to the problem? I can't solve it either.

ayrtondenner commented 6 years ago

@AshishBora your commits are giving a 404 for me. Could you show how you fixed it?

AshishBora commented 6 years ago

@ayrtondenner I changed line 115 here to something like:

h_bn = tf.contrib.layers.batch_norm(inputs=h_conv, decay=0.9, epsilon=1e-5, is_training=train_phase, scope="disc_bn%d" % index)

ayrtondenner commented 6 years ago

I should change line 326 too, right? They are both batch_norm calls inside the discriminator network.

AshishBora commented 6 years ago

Yup, that seems right.

ayrtondenner commented 6 years ago

I'm already running it; it seems like it's going to work now. Anyway, do you know why your commits are 404ing now?

AshishBora commented 6 years ago

Great. Oh, the 404 is because I deleted my fork some time ago since I wasn't using it anymore.

ayrtondenner commented 6 years ago

I see. I had to make some more minor changes for TensorFlow 1.0 compatibility and re-run it. Anyway, do you still have those commits? It would be nice to see whether you made any other code changes.

AshishBora commented 6 years ago

I have a local copy of the whole repo. I have uploaded a zip here.

ayrtondenner commented 6 years ago

I trained the network for 10 hours (11k epochs), and that's the result I got. It's still not a human face, but I wanted to know whether the training is going OK, because, as you said above, the network can run without necessarily learning anything. Also, I changed both utils.batch_norm calls in the discriminator network, but I just realized there are also calls in the generator network; maybe I can replace those as well to see if it works better.

(Image attachment: loss functions)

(Image attachment: network images)

shimafoolad commented 5 years ago

On TensorFlow 1.12.0, I had the same problem and fixed it by adding the line:

        with tf.variable_scope(tf.get_variable_scope(), reuse=tf.AUTO_REUSE):

before ema.apply
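
Putting the pieces of this thread together, a fuller sketch of what utils.batch_norm could look like with that change on TF >= 1.4 (the structure is reconstructed from the tracebacks above; names, decay, and epsilon are assumptions, and the real utils.py may differ):

    def batch_norm(x, n_out, train_phase, scope="bn"):
        with tf.variable_scope(scope):
            beta = tf.get_variable("beta", [n_out], initializer=tf.zeros_initializer())
            gamma = tf.get_variable("gamma", [n_out], initializer=tf.ones_initializer())
            # batch statistics over N, H, W for an NHWC conv output
            batch_mean, batch_var = tf.nn.moments(x, [0, 1, 2], name="moments")
            ema = tf.train.ExponentialMovingAverage(decay=0.9)

            def mean_var_with_update():
                # tf.AUTO_REUSE (TF >= 1.4) creates the EMA shadow variables on
                # the first call and reuses them on later calls
                with tf.variable_scope(tf.get_variable_scope(), reuse=tf.AUTO_REUSE):
                    ema_apply_op = ema.apply([batch_mean, batch_var])
                with tf.control_dependencies([ema_apply_op]):
                    return tf.identity(batch_mean), tf.identity(batch_var)

            mean, var = tf.cond(train_phase,
                                mean_var_with_update,
                                lambda: (ema.average(batch_mean), ema.average(batch_var)))
            return tf.nn.batch_normalization(x, mean, var, beta, gamma, 1e-5)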