andreas128 / SRFlow

Official SRFlow training code: Super-Resolution using Normalizing Flow in PyTorch

How to generate a SR image with only 1 channel? #16

Open PattonYu opened 3 years ago

PattonYu commented 3 years ago

Thanks for your code and contributions! I have a task that requires generating a grayscale SR image (i.e., the generated data has only 1 channel), but the original code can only generate 3-channel RGB images. How can I generate 1-channel data?

andreas128 commented 3 years ago

Thanks for your interest! Please try to use the following values in the config file.

  in_nc: 1
  out_nc: 1

Does that solve your problem?

PattonYu commented 3 years ago

I tried this, but it still does not work. The SR data generated in FlowUpsamplerNet.py still has 3 channels.

andreas128 commented 3 years ago

How to change the number of output channels for SRFlow?

To change the output dimension, you need to change the architecture of the Normalizing Flow.

At the moment, the output of SRFlow has three channels. During training, this image is squeezed in the first scale level, as shown below and in the paper. This means that the feature dimension is then 3x(2x2)=12. Since you want to work with one channel, you could try to add a squeeze layer before the first scale level here. This way, you should have 1x(2x2)x(2x2)=16 channels in the first scale level. It's best to use the arch_squeeze function, since it calculates the number of channels for the other layers for you.

[Figure: Normalizing Flow pixel shuffling]
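The channel arithmetic above can be illustrated with a minimal NumPy sketch of the squeeze (space-to-depth) operation. This is only an illustration of the operation, not SRFlow's actual squeeze layer: each squeeze trades a 2x2 spatial block for a 4x increase in channels, so squeezing a 1-channel image twice yields 1x(2x2)x(2x2)=16 channels.

```python
import numpy as np

def squeeze2d(x):
    """Space-to-depth: (C, H, W) -> (4*C, H/2, W/2)."""
    c, h, w = x.shape
    assert h % 2 == 0 and w % 2 == 0
    x = x.reshape(c, h // 2, 2, w // 2, 2)
    x = x.transpose(0, 2, 4, 1, 3)       # gather each 2x2 block into channels
    return x.reshape(c * 4, h // 2, w // 2)

gray = np.random.rand(1, 32, 32)         # a single-channel patch
once = squeeze2d(gray)                   # (4, 16, 16)  -> 1x(2x2)
twice = squeeze2d(once)                  # (16, 8, 8)   -> 1x(2x2)x(2x2)
print(once.shape, twice.shape)
```

Note that the squeeze is a pure reshuffling of pixels, so it is trivially invertible, which is what makes it usable inside a normalizing flow.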

Simple alternative

If you do not want to change the architecture, you could try to feed a 3-channel image by stacking copies of your single channel.
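Such stacking could look like this (a minimal sketch; `gray` stands in for your single-channel image array):

```python
import numpy as np

gray = np.random.rand(1, 64, 64)     # your 1-channel image, (C=1, H, W)
rgb = np.repeat(gray, 3, axis=0)     # stack 3 identical copies -> (3, H, W)
print(rgb.shape)
```

The 3-channel SR output could then be reduced back to one channel, e.g. by averaging over the channel axis.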

How to change the number of input channels for SRFlow?

Change the in_nc in the config file to the number of input channels.


Did that help? Do you think this description is enough for others to follow?

PattonYu commented 3 years ago

Thanks for your careful reply. The simple alternative you suggest is just like what your code already does: it makes 1-channel images trainable, but does not change the number of output channels. I'm not sure whether the 3-channel data generated this way is similar to data generated with a single channel. Your description is very detailed, thanks a lot.

andreas128 commented 3 years ago

Thanks, so did the architecture change work for you?

PattonYu commented 3 years ago

I have not changed the architecture yet. I found I can reset the initial parameter here; I am verifying this, and the training process is slower than with 3-channel output.

andreas128 commented 3 years ago

Okay, great! Then you will have 1x(2x2)=4 channels in the first scale level. This means that the conditional affine image injector will extract features from only 2 channels and apply the shift and scale to the other two channels.

In case you run into problems with these settings, please try to add another squeeze layer.
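To make the 2+2 split concrete, here is a minimal NumPy sketch of an affine coupling of the kind described above. It is an illustration only; SRFlow's actual injector also conditions on LR encoder features and tracks the log-determinant for the flow likelihood, and `toy_net` is a made-up stand-in for the conditioning network.

```python
import numpy as np

def affine_coupling(x, scale_shift_fn):
    """Split channels in half; use the first half to scale/shift the second."""
    c = x.shape[0]
    x1, x2 = x[: c // 2], x[c // 2 :]
    log_scale, shift = scale_shift_fn(x1)  # features extracted from x1 only
    y2 = x2 * np.exp(log_scale) + shift    # affine transform of the other half
    return np.concatenate([x1, y2], axis=0)

# Toy conditioner: derive a per-pixel scale/shift from the first half.
def toy_net(x1):
    return np.tanh(x1) * 0.1, x1 * 0.5

x = np.random.rand(4, 8, 8)                # 4 channels after one squeeze
y = affine_coupling(x, toy_net)
print(y.shape)                             # (4, 8, 8); first 2 channels unchanged
```

With only 4 channels, 2 drive the conditioner and 2 are transformed, which is why adding another squeeze (16 channels, an 8+8 split) can make the layer more expressive.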

andreas128 commented 3 years ago

Did you find a satisfying solution? What did you try so far?

PattonYu commented 3 years ago

Hi, sorry for the late reply. I found that resetting the initial parameter didn't work very well: although the validation results during training looked good, the generated data was all NaN. I also tried adding the squeeze layer, so that I had 1x(2x2)x(2x2)=16 channels in the first scale level, but this didn't work either and the generated data was still all NaN. I have another question: what should I do if I only want to work with a 2x scale?