tensorflow / models

Models and examples built with TensorFlow

Discriminator error in cifar/train.py #3851

Closed: SHANKARMB closed this issue 6 years ago

SHANKARMB commented 6 years ago

I created a custom dataset with one class and passed it to this GAN. It gives this error:

in gan_images
labels_to_names None
labels Tensor("inputs/one_hot:0", shape=(6, 1), dtype=float32, device=/device:CPU:0)
dataset.num_samples 15
dataset.num_classes 1
dataset_name: gan_images
dataset_dir:
after provide_data
images Tensor("inputs/batch:0", shape=(6, 256, 256, 3), dtype=float32, device=/device:CPU:0)
in Discriminator for CIFAR images
returning from Discriminator for CIFAR images and 'logits': Tensor("Discriminator/Discriminator/Reshape:0", shape=(6, 1), dtype=float32)
in Discriminator for CIFAR images
Traceback (most recent call last):
  File "cifar/train.py", line 183, in <module>
    tf.app.run()
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "cifar/train.py", line 106, in main
    generator_inputs=generator_inputs)
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/gan/python/train.py", line 113, in gan_model
    discriminator_real_outputs = discriminator_fn(real_data, generator_inputs)
  File "/home/prime/Final Sem Project/code/models/research/gan/cifar/networks.py", line 104, in discriminator
    logits, _ = dcgan.discriminator(img, is_training=is_training)
  File "/home/prime/Final Sem Project/code/models/research/gan/cifar/nets/dcgan.py", line 97, in discriminator
    net, current_depth, normalizer_fn=normalizer_fn_, scope=scope)
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 183, in func_with_args
    return func(*args, **current_args)
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1050, in convolution
    outputs = layer.apply(inputs)
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/python/layers/base.py", line 809, in apply
    return self.__call__(inputs, *args, **kwargs)
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/python/layers/base.py", line 680, in __call__
    self.build(input_shapes)
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/python/layers/convolutional.py", line 143, in build
    dtype=self.dtype)
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/python/layers/base.py", line 533, in add_variable
    partitioner=partitioner)
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 1297, in get_variable
    constraint=constraint)
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 1093, in get_variable
    constraint=constraint)
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 431, in get_variable
    return custom_getter(**custom_getter_kwargs)
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1613, in layer_variable_getter
    return _model_variable_getter(getter, *args, **kwargs)
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1604, in _model_variable_getter
    use_resource=use_resource)
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 183, in func_with_args
    return func(*args, **current_args)
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 291, in model_variable
    use_resource=use_resource)
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 183, in func_with_args
    return func(*args, **current_args)
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 246, in variable
    use_resource=use_resource)
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 408, in _true_getter
    use_resource=use_resource, constraint=constraint)
  File "/home/prime/.virtualenvs/tensorflow/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 765, in _get_single_variable
    "reuse=tf.AUTO_REUSE in VarScope?" % name)
ValueError: Variable Discriminator/Discriminator/conv6/weights does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=tf.AUTO_REUSE in VarScope?

karmel commented 6 years ago

Can you please:

  1. Fill out the issues template.
  2. Provide a minimal example demonstrating the problem-- this would involve replicating the problem with a subset of the dataset in question, so we have a better sense of what is going on.
  3. Verify that you can run the model using the referenced datasets.

That will allow us to better understand the problem.

SHANKARMB commented 6 years ago

OK, I will share it in a while.

SHANKARMB commented 6 years ago

Basically, I have 10 pictures of a cat and I want to train my GAN on them, so I created my own dataset reader (conver_gan_images.py and gan_images.py), which is very similar to download_and_convert_cifar10.py and cifar10.py (both in models/research/gan/cifar/datasets/). I have attached the file URLs for your reference.

https://github.com/SHANKARMB/models/tree/sketch_to_image/research/gan/cifar/datasets/gan_images.py
https://github.com/SHANKARMB/models/tree/sketch_to_image/research/gan/cifar/datasets/conver_gan_images.py

This dataset has just one class, and the images are 256 x 256 x 3.

It works fine when I run it on the cifar10 dataset, but it throws the above error on my dataset.

I also added my dataset_name to dataset_factory so that it uses my dataset.

Here is the train file I'm using to run it. I didn't change it much, but thought I should mention it: https://github.com/SHANKARMB/models/blob/sketch_to_image/research/gan/cifar/train.py

karmel commented 6 years ago

I would try to make the cats case as similar to the cifar10 case as possible, because it's easy to imagine that the differences you describe are resulting in a model that is parameterized differently, resulting in missing layers and variables. For example, if you run your cifar10 data all with a single class, what happens?

In any case, assigning to @joel-shor , but it sounds like there is some debugging work to be done here on your side to reduce the problem to a smaller set of differences between the two cases.
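One way to picture how two inputs could end up "parameterized differently" is sketched below. It is only an illustration under stated assumptions, not a confirmed diagnosis: it assumes the discriminator adds one stride-2 convolution per power of two in the input size (common for DCGAN-style networks), and that the generator emits 32 x 32 images while the custom dataset supplies 256 x 256 images. The helper `discriminator_conv_scopes` is hypothetical and is not part of TensorFlow or the models repository.

```python
import math


def discriminator_conv_scopes(image_size):
    """Hypothetical helper: scope names a size-dependent discriminator would create.

    Assumes one stride-2 conv layer per power of two in the input size,
    named conv1, conv2, ... in order.
    """
    num_layers = int(math.log2(image_size))
    return ["conv%d" % (i + 1) for i in range(num_layers)]


# Assumption: the generator output is CIFAR-sized (32 x 32).
generated_scopes = discriminator_conv_scopes(32)
# The issue reports real images of shape (6, 256, 256, 3).
real_scopes = discriminator_conv_scopes(256)

# Layers requested for real images that were never created for generated ones.
missing = [name for name in real_scopes if name not in generated_scopes]
print(missing)  # ['conv6', 'conv7', 'conv8']
```

Under these assumptions, the first layer that a variable-reusing second call could not find would be conv6, the same name that appears in the reported ValueError. If that is the cause, resizing the real images to the generator's output size (or the reverse) would be the first thing to try.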

SHANKARMB commented 6 years ago

OK, I will try it and give you an update.

SHANKARMB commented 6 years ago

I don't think I can modify the cifar10 dataset that easily, because the files are in a binary format. They are read, converted to TFRecord format, and stored.

In my case, however, I read images and convert them to TFRecord format.

Repo link https://github.com/SHANKARMB/models/tree/sketch_to_image/research/gan/cifar/datasets


file name: convert_gan_images.py
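For the single-class CIFAR-10 experiment suggested earlier, the binary files may be easier to subset than they look. In the binary version of CIFAR-10, each record is one label byte followed by 3072 pixel bytes (32 x 32 x 3). The sketch below keeps only records with a given label; the function names and the output file name are hypothetical, and the filtered file would still have to be passed through the existing conversion script.

```python
RECORD_BYTES = 1 + 32 * 32 * 3  # 1 label byte followed by 3072 pixel bytes


def filter_records_by_label(data, keep_label):
    """Return only the records in `data` (bytes) whose label byte equals keep_label."""
    if len(data) % RECORD_BYTES != 0:
        raise ValueError("input length is not a multiple of the record size")
    kept = bytearray()
    for start in range(0, len(data), RECORD_BYTES):
        record = data[start:start + RECORD_BYTES]
        if record[0] == keep_label:  # first byte of each record is the class label
            kept += record
    return bytes(kept)


def filter_file(src_path, dst_path, keep_label):
    """Read one CIFAR-10 binary batch file and write a single-class copy."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(filter_records_by_label(data, keep_label))


# Example (not executed here): label 3 is "cat" in CIFAR-10.
# filter_file("data_batch_1.bin", "cats_only_batch_1.bin", 3)
```

Running the GAN on such a single-class file would help separate the effect of the class count from the effect of the image size.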

joel-shor commented 6 years ago

What version of TensorFlow are you using? Are you using a standard version, or a nightly build?

SHANKARMB commented 6 years ago

I'm using the standard 1.6.0 version.

joel-shor commented 6 years ago

Rather than posting a link to your GitHub account, you'll need to post a code snippet demonstrating your error. Once you do so, I can help diagnose the issue.