JiahuiYu / generative_inpainting

DeepFill v1/v2 with Contextual Attention and Gated Convolution, CVPR 2018, and ICCV 2019 Oral
http://jiahuiyu.com/deepfill/

Questions occurred when training from scratch on Places2 #195

Closed cmyyy closed 5 years ago

cmyyy commented 5 years ago

Hi @JiahuiYu, I have some questions about training from scratch on Places2.
Q1: I set the parameters as follows; is this OK? FYI: I use high-resolution pictures, and I have one available GPU.
Q2: What are the meanings of GAN_WITH_MASK and DISCOUNTED_MASK? When should I use them?

DATASET: 'places2'  # 'tmnist', 'dtd', 'places2', 'celeba', 'imagenet', 'cityscapes'
RANDOM_CROP: True
VAL: False
LOG_DIR: full_model_places2_512
MODEL_RESTORE: ''  # '20180115220926508503_jyugpu0_places2_NORMAL_wgan_gp_full_model'

GAN: 'wgan_gp'  # 'dcgan', 'lsgan', 'wgan_gp', 'one_wgan_gp'
PRETRAIN_COARSE_NETWORK: False
GAN_LOSS_ALPHA: 0.001  # dcgan: 0.0008, wgan: 0.0005, onegan: 0.001
WGAN_GP_LAMBDA: 10
COARSE_L1_ALPHA: 1.2
L1_LOSS_ALPHA: 1.2
AE_LOSS_ALPHA: 1.2
GAN_WITH_MASK: False
DISCOUNTED_MASK: True
RANDOM_SEED: False
PADDING: 'SAME'

NUM_GPUS: 1
GPU_ID: -1  # -1 indicates select any available one, otherwise select GPU ID, e.g. [0,1,3]
TRAIN_SPE: 10000
MAX_ITERS: 1000000
VIZ_MAX_OUT: 10
GRADS_SUMMARY: False
GRADIENT_CLIP: False
GRADIENT_CLIP_VALUE: 0.1
VAL_PSTEPS: 1000
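For context on DISCOUNTED_MASK: the DeepFill v1 paper's spatially discounted reconstruction loss weights each pixel in the hole by gamma raised to (roughly) the distance to the nearest known pixel, with gamma = 0.99 in the paper. A minimal sketch of such a mask, not the repo's exact code:

```python
def spatial_discounting_mask(h, w, gamma=0.99):
    """Weight each pixel of an h x w hole by gamma**d, where d approximates
    the distance to the nearest hole boundary (closer pixels weigh more)."""
    return [[max(gamma ** min(i, h - 1 - i), gamma ** min(j, w - 1 - j))
             for j in range(w)]
            for i in range(h)]

mask = spatial_discounting_mask(8, 8)
# Border pixels of the hole get weight 1.0; pixels deeper inside are discounted.
```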

cmyyy commented 5 years ago

Q3: Have you ever met the following problem, and how did you solve it?

[INFO 2018-12-15 20:39:18 @data_from_fnames.py:153] image is None, sleep this thread for 0.1s.
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/home/Victor/anaconda3/envs/tf1.7/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/Victor/anaconda3/envs/tf1.7/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/Victor/anaconda3/envs/tf1.7/lib/python3.6/site-packages/neuralgym/data/feeding_queue_runner.py", line 194, in _run
    data = func()
  File "/home/Victor/anaconda3/envs/tf1.7/lib/python3.6/site-packages/neuralgym/data/data_from_fnames.py", line 143, in
    feed_dict_op=[lambda: self.next_batch()],
  File "/home/Victor/anaconda3/envs/tf1.7/lib/python3.6/site-packages/neuralgym/data/data_from_fnames.py", line 180, in next_batch
    random_h, random_w, align=False)  # use last rand
  File "/home/Victor/anaconda3/envs/tf1.7/lib/python3.6/site-packages/neuralgym/ops/image_ops.py", line 50, in np_random_crop
    image = np_scale_to_shape(image, shape, align=align)
  File "/home/Victor/anaconda3/envs/tf1.7/lib/python3.6/site-packages/neuralgym/ops/image_ops.py", line 23, in np_scale_to_shape
    imgh, imgw = image.shape[0:2]
AttributeError: 'NoneType' object has no attribute 'shape'
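For anyone else hitting this traceback: it typically means a file in the training list could not be decoded, so the loader (cv2.imread in this pipeline) returned None and the None reached np_scale_to_shape. A minimal pre-flight filter over the file list, sketched here with a stand-in loader so it runs without OpenCV:

```python
def filter_readable(fnames, load):
    """Split filenames into (decodable, broken) by whether load() returns None."""
    good, bad = [], []
    for f in fnames:
        (good if load(f) is not None else bad).append(f)
    return good, bad

# Stand-in loader: a dict mapping filename -> decoded image (None = corrupt).
# In the real pipeline, pass cv2.imread instead.
fake = {"ok.jpg": "pixels", "corrupt.jpg": None}
good, bad = filter_readable(["ok.jpg", "corrupt.jpg"], fake.get)
# good == ["ok.jpg"], bad == ["corrupt.jpg"]
```

Running this once over the dataset file list and dropping the `bad` entries avoids the crash during training.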

cmyyy commented 5 years ago

Q4: As you mentioned in #75, you have already released the implementation of the SN-GAN loss in the dev branch of neuralgym. Does that mean I could use the SN-GAN loss simply by setting GAN: 'sn_gan' in the .yml?

JiahuiYu commented 5 years ago

Hi, I do recommend you read our code first. I think all of your questions will become clear after reading.

If anything is still unclear or confusing after you understand how this code is organized, please ask then. I will be happy to address your questions in that case.

For question 3, please search the keyword in related issues first. I have addressed that question more than twice.

For question 4, no, you cannot. You will need to implement SN-GAN yourself.
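If you do implement SN-GAN yourself: the core of spectral normalization is dividing each weight matrix by its largest singular value, usually estimated with power iteration. A plain-Python sketch of the estimator (not the repo's code; real implementations keep the iteration vector cached across training steps and run only one iteration per step):

```python
import math

def matvec(m, v):
    """Multiply a list-of-lists matrix m by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def spectral_norm(w, iters=100):
    """Estimate the largest singular value of matrix w by power iteration
    on w^T w, then return ||w v|| for the converged unit vector v."""
    wt = [list(col) for col in zip(*w)]        # transpose of w
    v = [1.0] * len(w[0])
    for _ in range(iters):
        v = matvec(wt, matvec(w, v))           # one step of (w^T w) v
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    wv = matvec(w, v)
    return math.sqrt(sum(x * x for x in wv))   # ||w v|| with ||v|| = 1
```

In an SN-GAN discriminator, each layer's weight is used as `w / spectral_norm(w)` so the layer is (approximately) 1-Lipschitz.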

cmyyy commented 5 years ago

Thanks for your answers and kind advice.

cmyyy commented 5 years ago

Hi @JiahuiYu, I wonder whether the model restore succeeded or not, because the log shows ALL ZEROS.

[2018-12-20 09:39:52 @logger.py:43] Trigger callback: Trigger ModelRestorer: Load model from model_logs/20181218002526969453_cmy_places2_NORMAL_wgan_gp_full_model_places2_512/snap-90000.
[2018-12-20 09:39:52 @model_restorer.py:60] - restoring variable: beta1_power:0
[2018-12-20 09:39:52 @model_restorer.py:60] - restoring variable: beta2_power:0
[2018-12-20 09:39:52 @model_restorer.py:60] - restoring variable: discriminator/discriminator_global/conv1/bias/Adam:0
[2018-12-20 09:39:52 @model_restorer.py:60] - restoring variable: discriminator/discriminator_global/conv1/bias/Adam_1:0
[2018-12-20 09:39:52 @model_restorer.py:60] - restoring variable: discriminator/discriminator_global/conv1/bias:0
[2018-12-20 09:39:52 @model_restorer.py:60] - restoring variable: discriminator/discriminator_global/conv1/kernel/Adam:0
[2018-12-20 09:39:52 @model_restorer.py:60] - restoring variable: discriminator/discriminator_global/conv1/kernel/Adam_1:0
[2018-12-20 09:39:52 @model_restorer.py:60] - restoring variable: discriminator/discriminator_global/conv1/kernel:0
[2018-12-20 09:39:52 @model_restorer.py:60] - restoring variable: discriminator/discriminator_global/conv2/bias/Adam:0
[2018-12-20 09:39:52 @model_restorer.py:60] - restoring variable: discriminator/discriminator_global/conv2/bias/Adam_1:0

JiahuiYu commented 5 years ago

The zeros indicate an index. You should be able to find the answer by yourself; it is easy as long as you investigate the code.
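For readers puzzled by the same log: the trailing `:0` in names like `conv1/kernel:0` is TensorFlow's tensor output index (output 0 of the op that produces the variable), not a restored value of zero. A quick way to see the split (pure Python, no TensorFlow needed):

```python
name = "discriminator/discriminator_global/conv1/kernel:0"
op_name, _, output_index = name.rpartition(":")
# op_name == "discriminator/discriminator_global/conv1/kernel"
# output_index == "0"  (the first output of the producing op)
```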

cmyyy commented 5 years ago

Hello @JiahuiYu, I have two small questions. Q1: Could you give me some advice on the choice of dataset for quickly checking the validity of a new idea/model? Q2: For all the datasets you mentioned in the paper, did you train with exactly the same hyperparameters? Thanks a lot!

JiahuiYu commented 5 years ago

@cmyyy Hey,

For Q1: I use the CelebA-HQ dataset.

For Q2: yes, exactly the same, except that on Places2 and ImageNet we randomly crop 256x256 patches for training, while for CelebA-HQ we rescale 1024x1024 images to 256x256 for training. All parameters are provided in our released config file.
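To illustrate the two preprocessing paths above (a sketch, not the repo's code): random cropping picks a random 256x256 window inside the image, while rescaling maps the whole image down to the target size.

```python
import random

def random_crop_box(img_h, img_w, crop=256):
    """Top-left and bottom-right corners of a random crop-sized window
    that fits entirely inside an img_h x img_w image (Places2/ImageNet path)."""
    top = random.randint(0, img_h - crop)
    left = random.randint(0, img_w - crop)
    return top, left, top + crop, left + crop

def rescale_factor(img_side, target=256):
    """Uniform scale factor for whole-image resizing (CelebA-HQ path),
    e.g. 1024 -> 256 gives 0.25."""
    return target / img_side
```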

cmyyy commented 5 years ago

Hello @JiahuiYu. In the code, the number of filters extracted from the background is h*w, but in the DeepFill v1 paper: [image]. I guess 12288 = 128*128 - 64*64. Doesn't that contradict the code? Which is right? Thanks!
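The arithmetic behind 12288, assuming (as the question does) a 128x128 attention feature map with a centered 64x64 hole, one candidate patch per valid background location:

```python
feature_locations = 128 * 128   # all spatial positions in the feature map
hole_locations = 64 * 64        # positions inside the 64x64 hole
background_patches = feature_locations - hole_locations
# background_patches == 12288
```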

JiahuiYu commented 5 years ago

Please follow the code. In the paper we report a theoretical number; in the code we provide a simple and practical implementation.

cmyyy commented 5 years ago

So you mean the number of convolutional filters extracted from the background in the code doesn't equal the theoretical number, right?

JiahuiYu commented 5 years ago

Yes.