rkjones4 / GANGogh

Using GANs to create Art

Difficult to output larger images #11

Open hswaffield opened 6 years ago

hswaffield commented 6 years ago

After running your code successfully to produce 64 x 64 images, I'm trying to output larger images, 128 x 128, but I have been running into a number of problems.

The first is that there are a lot of magic numbers with no explanation of what they mean, e.g. `8*4*4*dim*2`. I tried multiplying `dim` by 4 here (the square of 2x is 4 times the square of x), and that seems to work, but I'm just guessing because it's not clear where the numbers come from.

There are many areas where the number 64 appears, and it's not clear if it really should be 64, or if it should be the dimension.

To output 128 x 128 images, does the model dimensionality need to be 128?
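For context, here is my rough mental model of how these sizes relate in this style of generator (a sketch under my own assumptions, not the repo's actual code; the trailing `*2` in `8*4*4*dim*2` may be specific to the conditional setup):

```python
import math

# Rough sketch (my assumptions, not the repo's actual code) of how the
# sizes in a wgan-gp style generator relate. The noise vector is first
# projected to a 4x4 grid with many channels; each deconvolution stage
# then doubles the spatial resolution and halves the channel count.
DIM = 64          # model dimensionality (channel multiplier)
OUTPUT_RES = 64   # square image side; set to 128 for larger outputs

n_stages = int(math.log2(OUTPUT_RES // 4))   # 4 -> 8 -> ... -> OUTPUT_RES
init_channels = DIM * 2 ** (n_stages - 1)    # 8*DIM when OUTPUT_RES is 64
linear_size = 4 * 4 * init_channels          # size of the initial Linear layer

print(n_stages, init_channels, linear_size)  # 4, 512, 8192 for 64x64
```

If this holds, going to 128 x 128 would mean one extra deconvolution stage (or a larger starting grid) rather than setting DIM itself to 128.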

Other issues: `ValueError: GraphDef cannot be larger than 2GB.` I'm running this on a Titan Xp GPU, which has 12 GB of memory, so there surely has to be a way around this constraint. Any idea of what that would be?

this occurs at:

File "GANGogh/GANgogh.py", line 378, in _x_r = session.run(real_data, feed_dict={all_real_data_conv: _x})

I'm new to TensorFlow, so this is rather scary and intimidating.

So any guidance on how to change the code so that it can output larger images would be greatly appreciated. It initially seemed the task would be straightforward, requiring only a few variable changes relating to DIM, OUTPUT_DIM, etc.

I think this project is really cool, and I really appreciated your Medium article... it's what first exposed me to GANs.

Any feedback on how to approach this would be greatly appreciated.

Thanks

zacharynevin commented 6 years ago

@hswaffield I am working on a fork of this repository that massively cleans up the code: https://github.com/zacharynevin/GANGogh/tree/release/custom.

I have tried my best to remove most of the magic numbers. In addition, I have given it the following advantages:

  • The generator and discriminator layers are written to accommodate training with variable image sizes. You should be able to run a training routine with any square image dimensions you wish (provided you have enough GPU memory); the number of convolution/deconvolution layers is adjusted to match the desired image size.
  • The dataset iterator has been abstracted away from the model architecture using the tf.data.Dataset API, with the only requirement being that your dataset exists in TFRecord format (I will document this). In addition, you can use remote data sources (e.g. a Google Cloud Storage bucket) just as easily as local ones, making it easy to train on datasets hundreds of GB in size without a time-consuming and computationally expensive "data warming" stage.
  • Native TensorBoard integration. This is important for tracking the real-time progress of your model, especially if you are using VMs on AWS or Google Cloud, where periodically downloading the log files from the VM with scp just to view plots may not be practical. Instead, you can simply run tensorboard --logdir=/path/to/logs.
  • Replaced a lot of routines with TensorFlow Slim to make the code cleaner.
  • Used resize convolutions in the generator instead of convolution2d_transpose to remove the possibility of checkerboard artifacts (see the sketch below).
  • Proper use of tf.variable_scope to make it easy to dissect the graph in TensorBoard.

This is still a work in progress and hasn't been tested, so any commits or issues are appreciated (especially where layer dimensions are concerned).
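On the resize convolutions: here is a minimal sketch of the idea (function name and layer choices are illustrative, not the fork's actual code):

```python
import tensorflow as tf  # TF 1.x

def resize_conv(inputs, filters, scope):
    """Nearest-neighbor upsample followed by a stride-1 convolution.

    Sketch of the resize-convolution technique from Odena et al.,
    "Deconvolution and Checkerboard Artifacts". Unlike a strided
    conv2d_transpose, the stride-1 convolution after resizing has
    uniformly overlapping kernel footprints, which avoids the
    checkerboard pattern.
    """
    with tf.variable_scope(scope):
        height = tf.shape(inputs)[1]
        width = tf.shape(inputs)[2]
        upsampled = tf.image.resize_nearest_neighbor(
            inputs, [2 * height, 2 * width])
        return tf.layers.conv2d(upsampled, filters, kernel_size=3,
                                strides=1, padding='same')
```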

hswaffield commented 6 years ago

Thanks for your comment! There is a lot in there that I'm really excited to hear you have done. I'm trying to use it now. Hopefully TFRecords are straightforward to set up.

zacharynevin commented 6 years ago

Hi @hswaffield, just a quick warning that I haven't tested this yet. It's still very much a work in progress. However, any PRs with fixes would be welcome.

Also, TFRecords are very straightforward to set up. Here is an example gist: https://gist.github.com/zacharynevin/d9c6aa21a2d52299dfc56c12804d6770
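Roughly, the round trip looks like this (a sketch only; feature names like 'image_raw' are illustrative, so check the gist for the exact schema):

```python
import tensorflow as tf  # TF 1.x

# Writing: serialize each encoded image into a tf.train.Example.
def write_examples(image_bytes_list, path):
    with tf.python_io.TFRecordWriter(path) as writer:
        for image_bytes in image_bytes_list:
            example = tf.train.Example(features=tf.train.Features(feature={
                'image_raw': tf.train.Feature(
                    bytes_list=tf.train.BytesList(value=[image_bytes])),
            }))
            writer.write(example.SerializeToString())

# Reading: tf.data treats remote paths (e.g. gs://my-bucket/data.tfrecords)
# the same as local ones.
def make_dataset(path, batch_size):
    def parse(serialized):
        features = tf.parse_single_example(serialized, {
            'image_raw': tf.FixedLenFeature([], tf.string),
        })
        image = tf.image.decode_png(features['image_raw'], channels=3)
        return tf.image.resize_images(image, [64, 64])
    return (tf.data.TFRecordDataset(path)
            .map(parse)
            .shuffle(1024)
            .batch(batch_size))
```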

orestis-z commented 2 years ago

> @hswaffield I am working on a fork of this repository that massively cleans up the code: https://github.com/zacharynevin/GANGogh/tree/release/custom. […]

Hi @zacharynevin, do you still have the code around?

zacharynevin commented 2 years ago

@orestis-z I don't. I deleted this repository during a purge of old code that I was no longer working on. Sorry about that! I promise to be a better open-source citizen.

However, here is another repository I didn't delete that uses essentially the same coding style: https://github.com/zacharynevin/StackGAN. In fact, I think I made that repo by copying my GANGogh implementation.

BikramjeetSingh commented 2 years ago

@zacharynevin Any chance you can reupload your fork if you still have the code available somewhere? Thanks.