Could I have your hyperparameters for HDR unpaired learning? I wrote the training code in TensorFlow myself and have been training the model for one day (300k steps), but the visualized results are still very far from what you show in the paper.
I also modified your architecture by using global mean pooling, instead of a large convolutional receptive field, to reduce the feature map to 1 x 1. This allows me to take in 256 x 256 images. I used a Titan Xp (11 GB memory) to train the model; it is basically impossible for me to train D_A, D_B, G_A, and G_B at the same time with a 512 x 512 input size, so I used a 256 x 256 input size and trained G_A and G_B separately.
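To make sure I explained the modification clearly, this is a minimal NumPy sketch of the global-mean-pooling head I swapped in (the feature shapes are illustrative, not your actual layer widths): averaging over the spatial axes makes the critic head independent of the input resolution.

```python
import numpy as np

def global_mean_pool(feature_map):
    """Collapse an (N, H, W, C) feature map to (N, C) by averaging over the
    spatial dimensions, so the critic head no longer depends on H and W."""
    return feature_map.mean(axis=(1, 2))

# The same head now works for features from 256x256 or 512x512 inputs,
# e.g. after the same stack of stride-2 convolutions:
feats_256 = np.random.randn(2, 16, 16, 1)
feats_512 = np.random.randn(2, 32, 32, 1)
assert global_mean_pool(feats_256).shape == global_mean_pool(feats_512).shape == (2, 1)
```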
In the paper, you use an adaptive weighting scheme to adjust the weight of the gradient penalty. However, in your code you only increase the weight when the moving average is larger than the upper bound; you never decrease it when the moving average is smaller than the lower bound. Should I be worried about this?
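To be concrete about what I mean, here is a sketch of the symmetric rule I expected. The grow branch mirrors what I understand your released code to do; the shrink branch is my addition and the lower bound `bound / grow` is my guess. The defaults mirror my flags `--loss_wgan_gp_mv_decay`, `--loss_wgan_gp_bound`, and `--loss_wgan_lambda_grow`.

```python
def update_gp_weight(gp_weight, gp_moving_avg, new_gp,
                     decay=0.99, bound=5e-2, grow=2.0):
    """Hypothetical symmetric adaptive gradient-penalty weighting:
    grow the weight when the moving average of the penalty exceeds the
    upper bound, shrink it when the average falls below a lower bound.
    The shrink branch is my assumption, not the authors' released code."""
    # Exponential moving average of the raw gradient-penalty value.
    gp_moving_avg = decay * gp_moving_avg + (1.0 - decay) * new_gp
    if gp_moving_avg > bound:            # penalty too large -> strengthen it
        gp_weight *= grow
    elif gp_moving_avg < bound / grow:   # assumed lower bound -> relax it
        gp_weight /= grow
    return gp_weight, gp_moving_avg
```

Is leaving out the shrink branch intentional, i.e. is a weight that only ever grows safe in practice?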
I am not familiar with WGAN-GP, so I am not sure whether I am implementing it correctly.
** In my code for optimizing WGAN-GP, the real image is positive and the fake image is negative.
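Concretely, this is the sign convention I am using, sketched in NumPy (the function names and shapes are mine, just for illustration): the critic is pushed to score real images up and fakes down, and the generator's adversarial loss is the negated mean critic score on its outputs.

```python
import numpy as np

def critic_wgan_gp_loss(d_real, d_fake, grad_norms, gp_weight=10.0):
    """WGAN-GP critic objective under my convention: minimizing this loss
    pushes critic scores on real images up (positive) and on fake images
    down (negative). grad_norms are the norms of the critic's gradient at
    interpolated samples."""
    wasserstein = np.mean(d_fake) - np.mean(d_real)
    gp = np.mean((grad_norms - 1.0) ** 2)   # two-sided penalty toward norm 1
    return wasserstein + gp_weight * gp

def generator_wgan_loss(d_fake):
    """The generator tries to raise the critic's score on its samples, so a
    larger critic value on fakes means the generator is doing better."""
    return -np.mean(d_fake)
```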
Here are the graphs of gp_A and gp_B over roughly 300k steps.
For NetD_A and NetD_B, the losses for both are still going up. Does that indicate the discriminators are not learning correctly?
For NetG_A2B_adv_loss and NetG_B2A_adv_loss, I am not sure I am reading these graphs correctly. Is NetG_A2B doing badly because its critic value is small, and NetG_B2A doing well because its critic value is large?
The data source constraint loss and data term constraint loss look fine to me; I have no problem with those.
For reference, these are the flags I am currently training with:

```shell
--generator_learning_rate 0.0001 \
--discriminator_learning_rate 0.0001 \
--batch_size 2 \
--netG_regularization_weight 0 \
--netD_regularization_weight 0 \
--input_size 256 \
--loss_source_data_term_weight 1e3 \
--loss_constant_term_weight 1e4 \
--gp_weight_A 10 \
--gp_weight_B 10 \
--global_gradient_clipping 1e8 \
--update_netD_times 50 \
--loss_wgan_gp_mv_decay 0.99 \
--loss_wgan_gp_bound 5e-2 \
--netD_buffer_times 50 \
--loss_wgan_lambda_grow 2
```
Would you mind sharing your experience with me?
Thanks!