gabrieleilertsen / hdrcnn

HDR image reconstruction from a single exposure using deep CNNs
https://computergraphics.on.liu.se/hdrcnn/
BSD 3-Clause "New" or "Revised" License

Loss function for Pytorch reimplementation #32

Closed t2ac32 closed 3 years ago

t2ac32 commented 4 years ago

Hi, I've been trying to implement the cost function from this repository for my network in PyTorch.

Right now I'm having problems with image dimensions not matching when computing cost_input_output. QUESTION:

How many channels are the input LDR images supposed to have, and how many for the HDR labels?

I'm using a method similar to load_training_pair, and I'm getting this array shape for my label y: (13, 810701056, 0).

My specific problem is the subtraction between y_log and x_log when computing cost_input_output.

Thanks in advance!

gabrieleilertsen commented 4 years ago

Hi. The HDR and LDR images both have 3 channels. cost_input_output in the training code is only used to measure how large the improvement is compared to doing nothing; cost is what is used for training. The arrays have shape (batch_size, height, width, channels), i.e. they are 4D. I would guess the problem is in how the training pairs are loaded, since a shape of (13, 810701056, 0) doesn't make sense.
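As a rough illustration of the two quantities described above (a sketch, not the repository's actual training code): both can be seen as an error between two images, where cost_input_output compares the untouched input against the label. The random data, the log-domain L2 form, and the eps constant below are all assumptions for the example.

```python
import numpy as np

# Hypothetical stand-ins for one training batch, using the 4D layout
# (batch_size, height, width, channels) described in the answer above.
batch, h, w, c = 2, 8, 8, 3
x = np.random.rand(batch, h, w, c).astype(np.float32)            # LDR input, in [0, 1]
y = np.random.rand(batch, h, w, c).astype(np.float32) * 10.0     # HDR label
y_pred = np.random.rand(batch, h, w, c).astype(np.float32) * 10.0  # network output

eps = 1.0 / 255.0  # assumed small constant to keep log() finite

def log_l2(a, b):
    """Mean squared error between two images in the log domain (assumed form)."""
    return float(np.mean((np.log(a + eps) - np.log(b + eps)) ** 2))

cost = log_l2(y_pred, y)           # the quantity the network trains on
cost_input_output = log_l2(x, y)   # baseline: "doing nothing" vs. the label
# Both are non-negative scalars; the ratio shows how much the network improves
# over simply passing the input through.
```

If the shapes of x and y disagree anywhere (as in the (13, 810701056, 0) case), the subtraction inside log_l2 is exactly where it fails, which matches the error described in the question.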

t2ac32 commented 4 years ago

Thanks @gabrieleilertsen for your answer.

Apparently my problem was that I was trying to load .HDR images directly: the original load_training_pair code expects .bin files, but I was passing it .HDR files.

Loading binaries didn't work for me either, so I ended up loading the .HDR files and converting them to arrays with cv2 (the OpenCV library).
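For reference, cv2.imread with cv2.IMREAD_UNCHANGED typically returns a float32 BGR array of shape (H, W, 3) for a Radiance .hdr file, which then needs reordering into the NCHW layout PyTorch convolutions expect. A minimal sketch of that conversion, simulating the loaded array with random data so it runs without an image file:

```python
import numpy as np

# Simulate what cv2.imread("label.hdr", cv2.IMREAD_UNCHANGED) typically
# returns for a Radiance .hdr file: float32, shape (H, W, 3), BGR order.
h, w = 4, 6
hdr_bgr = np.random.rand(h, w, 3).astype(np.float32) * 50.0

# BGR -> RGB: OpenCV's channel order differs from most training pipelines.
hdr_rgb = hdr_bgr[..., ::-1]

# (H, W, C) -> (1, C, H, W), the NCHW layout PyTorch expects; in real code
# the result would go through torch.from_numpy(...) to become a tensor.
hdr_nchw = np.ascontiguousarray(hdr_rgb.transpose(2, 0, 1))[None, ...]

print(hdr_nchw.shape)  # (1, 3, 4, 6)
```

Getting the channel order and the HWC-to-NCHW transpose right is one common source of shape mismatches when moving from the TensorFlow code (which uses NHWC) to PyTorch.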

Question The "cost" obtained in the loss function formulation is suppose to be my loss? This confuses me since it just goes up overall. On the paper LOSSES are reported as 0.99 (worst) & 0.159 (best). in the order of 1*10^2.

After obtaining "cost" i get: cost = 1.2078159898020035e-05 = 0.00001207 Train Loss:0.000013012
Train loss = cost / step

I don't know if this problem looks familiar; I already checked that my loss function behaves identically in TensorFlow and in PyTorch.

As a reminder, my inputs are LDR PNGs and the ground truths are .hdr files loaded with OpenCV and converted to torch tensors.

EDIT: Maybe some kind of normalization is needed? After some thought I checked my minimum and maximum pixel values. My input and prediction values are in the range 0-1, while my labels (.hdr), after reading with OpenCV, have a minimum of ~0.4 and maximum values that vary a lot; I've seen anywhere from roughly 17 up to 97.
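The scale mismatch described in the edit above (inputs in [0, 1], raw HDR labels reaching ~100) is exactly the situation a log-domain comparison dampens: the log compresses the dynamic range so the loss is not dominated by the brightest HDR pixels. A small numeric sketch, using the pixel values quoted in the edit; the eps constant is an assumption:

```python
import numpy as np

# Example pixel values taken from the thread: labels span ~0.4 up to ~97,
# while predictions stay in [0, 1].
label = np.array([0.4, 17.0, 97.0], dtype=np.float32)
pred = np.array([0.1, 0.5, 1.0], dtype=np.float32)

eps = 1e-3  # assumed small constant to avoid log(0)

# Linear-domain error is dominated by the brightest label pixels;
# log-domain error compresses that dynamic range.
linear_err = float(np.mean((pred - label) ** 2))
log_err = float(np.mean((np.log(pred + eps) - np.log(label + eps)) ** 2))

print(linear_err > log_err)  # True: the log-domain error is far smaller
```

This is also consistent with the tiny cost values reported earlier: if the labels are compared in the wrong domain (or on a different scale than the paper assumed), the resulting numbers will not match the 0.159-0.99 range reported there.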