dvlab-research / outpainting_srn

Wide-Context Semantic Image Extrapolation, CVPR2019

How to resume training #9

Open lilili666-hit opened 4 years ago

lilili666-hit commented 4 years ago

I had been training for a long time and have stopped it. Can you tell me how to resume training? And should `pretrain_network` be set to 1 or 0?

shepnerd commented 4 years ago

Use the option `--load_model_dir [your model path]` to continue training. The option `--pretrain_network 0|1` decides whether to use all training losses or just the reconstruction loss.
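
For example, a resume command might look like the sketch below. The dataset name and checkpoint path are placeholders for your own run; only `--load_model_dir` and `--pretrain_network` are the options described above.

```
# Sketch only: substitute your own dataset and checkpoint directory.
python train.py --dataset cityscape \
    --load_model_dir ./checkpoints/YOUR_PREVIOUS_RUN \
    --pretrain_network 1
```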

lilili666-hit commented 4 years ago

Thank you very much. I have another question for you: why does G_loss have a negative value? After a period of time it outputs a negative value and then changes back to a positive value.

lilili666-hit commented 4 years ago

Hi Yi Wang, can you share the Places2 pretrained model? Thank you very much.

shepnerd commented 4 years ago

> Why does G_loss have a negative value? After a period of time it outputs a negative value and then changes back to a positive value.

It happens because the loss used to update the generator in WGAN-GP can be negative.
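
As a rough, self-contained illustration (TensorFlow 1.x style, matching the era of this repo, and not the repo's actual code): the generator loss in WGAN-GP is `-mean(D(fake))`, so its sign simply mirrors the critic's score on the generated samples.

```python
import numpy as np
import tensorflow as tf

# Hedged sketch: in WGAN-GP the generator minimizes -E[D(G(z))].
d_fake = tf.placeholder(tf.float32, [None])  # critic scores on generated samples
g_loss = -tf.reduce_mean(d_fake)

with tf.Session() as sess:
    # Critic scores fakes positively -> negative generator loss.
    print(sess.run(g_loss, {d_fake: np.array([0.5, 1.2], np.float32)}))    # -0.85
    # Critic scores fakes negatively -> positive generator loss.
    print(sess.run(g_loss, {d_fake: np.array([-0.3, -0.7], np.float32)}))  # 0.5
```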

The model you requested (Places2) can be downloaded from here. After putting the pretrained model folder in `./checkpoints`, you can use it with:

```
python test.py --dataset cityscape --data_file TEST_IMAGE_FOLDER \
    --load_model_dir checkpoints/places2-srn-subpixel-gc64 --model srn \
    --feat_expansion_op subpixel --use_cn 1 --random_crop 0 --random_mask 0 \
    --img_shapes 256,512,3 --mask_shapes 256,256 --g_cnum 64
```

lilili666-hit commented 4 years ago

When I was testing, I met another problem:

```
ValueError: Dimension 3 in both shapes must be equal, but are 256 and 1024. Shapes are [3,3,64,256] and [3,3,64,1024]. for 'Assign_30' (op: 'Assign') with input shapes: [3,3,64,256], [3,3,64,1024].
```

lilili666-hit commented 4 years ago

[screenshot of training details] This is my training detail. Thank you very much!

lilili666-hit commented 4 years ago

The picture size of Places2 is 256×256. How do you use 256×512 for training? Do you directly change the resolution of the 256×256 pictures to 256×512?

shepnerd commented 4 years ago

When using the pretrained model on Places2, please set the image and mask shapes with `--img_shapes 256,512,3 --mask_shapes 256,256`. As for the image resolution in Places2, we use the data at its original resolution, not a resized version.

lilili666-hit commented 4 years ago

Can this program be fine-tuned? For example, by loading a pretrained model, freezing certain layers, and then training.

shepnerd commented 4 years ago

Sure. If you want to freeze specific layers in the generator, you can remove them by name from `g_vars`; those parameters will then not be updated during the subsequent training.
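
A minimal sketch of that idea (TensorFlow 1.x), using a toy generator scope; the layer names below are hypothetical, so inspect `tf.trainable_variables()` in the real graph to find the names you actually want to drop.

```python
import tensorflow as tf

# Toy stand-in for the generator; the real repo builds a much larger graph.
with tf.variable_scope('generator'):
    x = tf.placeholder(tf.float32, [None, 8])
    h = tf.layers.dense(x, 16, name='enc_fc')  # layer we will freeze
    y = tf.layers.dense(h, 1, name='dec_fc')   # layer that keeps training

g_loss = tf.reduce_mean(tf.square(y))          # stand-in loss

# Collect generator variables, then drop the ones to freeze by name.
g_vars = [v for v in tf.trainable_variables() if 'generator' in v.name]
frozen = ['enc_fc']                            # hypothetical name fragments
g_vars = [v for v in g_vars if not any(k in v.name for k in frozen)]

# Only the remaining variables receive gradient updates.
g_train_op = tf.train.AdamOptimizer(1e-4).minimize(g_loss, var_list=g_vars)
```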

lilili666-hit commented 4 years ago

Can you give me an example? Thank you very much.

lilili666-hit commented 4 years ago

Hi Yi Wang. What applications do you think this technology has in real life? For example, what kind of practical problems can it solve?

lilili666-hit commented 4 years ago

I am training on a newly selected dataset. At what point can the first stage of training be considered converged? Thanks for your help.

shepnerd commented 4 years ago

> Hi Yi Wang. What applications do you think this technology has in real life? For example, what kind of practical problems can it solve?

Extending images or videos naturally to fit a display device could benefit from this technology. Someone has explored its video application here.

shepnerd commented 4 years ago

> I am training on a newly selected dataset. At what point can the first stage of training be considered converged?

We can consider the training converged when the reconstruction loss looks stable in the first stage. Quantitatively, for a relatively small-scale dataset (e.g., Paris StreetView or Cityscapes, containing 2k–12k training images), 80,000 iterations with batch size 16 should be enough (a larger batch size may require fewer training iterations).

The two-stage training is actually a compromise, because the network used here has only a small capacity (no more than 4M parameters). A larger-capacity model equipped with residual blocks (like SPADE or Pix2PixHD) and modern GAN stabilization tricks (spectral norm, multi-scale discriminators, PatchGAN, conditional projection, etc.) might be trainable directly with the VGG loss and adversarial loss from scratch.
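
For reference, here is a rough sketch of one of those stabilization tricks, spectral normalization via a single power-iteration step (TensorFlow 1.x; not code from this repo).

```python
import tensorflow as tf

def spectral_norm(w, name='sn'):
    """Divide a weight tensor by an estimate of its largest singular value."""
    w_shape = w.shape.as_list()
    w_mat = tf.reshape(w, [-1, w_shape[-1]])   # flatten to 2-D
    u = tf.get_variable(name + '_u', [1, w_shape[-1]],
                        initializer=tf.random_normal_initializer(),
                        trainable=False)
    # One power-iteration step to estimate the top singular vectors.
    v = tf.nn.l2_normalize(tf.matmul(u, tf.transpose(w_mat)), axis=1)
    u_new = tf.nn.l2_normalize(tf.matmul(v, w_mat), axis=1)
    sigma = tf.matmul(tf.matmul(v, w_mat), tf.transpose(u_new))  # [1, 1]
    with tf.control_dependencies([u.assign(u_new)]):
        w_normalized = w_mat / sigma
    return tf.reshape(w_normalized, w_shape)

# Usage: normalize a discriminator conv kernel before the conv op.
w = tf.get_variable('d_conv_w', [3, 3, 64, 128])
w_sn = spectral_norm(w, name='d_conv_w')
```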

lilili666-hit commented 4 years ago

[screenshot of discriminator loss curves]

Is the loss curve of this discriminator normal? I trained from scratch on a newly selected dataset.

lilili666-hit commented 4 years ago

Can you send me all the loss curves from your training at that time? Thanks.

shepnerd commented 4 years ago

> Can you send me all the loss curves from your training at that time? Thanks.

I will search my server for these data and get back to you later.

shepnerd commented 4 years ago

> [screenshot of discriminator loss curves]
>
> Is the loss curve of this discriminator normal? I trained from scratch on a newly selected dataset.

At least the loss curve of the discriminator should oscillate rather than converge.

Note that it is better to train with aligned data (or data with similar layouts, e.g., aligned faces or cityscape-like data). If not, use a bigger model, pretraining, or GAN stabilization tricks during training.

lilili666-hit commented 4 years ago

Can you send me all the loss curves from your training at that time? Thanks.