creotiv / hdrnet-pytorch

Unofficial PyTorch implementation of 'Deep Bilateral Learning for Real-Time Image Enhancement', SIGGRAPH 2017 https://groups.csail.mit.edu/graphics/hdrnet/
activation function of guidance map #10

Open gybuaa opened 3 years ago

gybuaa commented 3 years ago

The original official code uses 'sigmoid' as the activation function of the second 1*1 conv, but this code uses 'tanh'. Does it perform better than the official version? Thanks for your great work!

creotiv commented 3 years ago

So tanh was here because I thought that grid_sample needs input in [-1,1] (because the docs say so), but as I understand it, that was bad intuition. Right now I use a sigmoid function for the guide, and it seems it started to work at last.

gybuaa commented 3 years ago

Thanks for your reply! I have seen it in the newest version. So have you trained it on FiveK's 5000 images? I generated 5000 .jpg images and I am about to make some modifications to the architecture of this net. I think there could be a more direct way to get the full-res [B,12,H,W] affine coefficients. I found that this model tends to make images brighter, losing the original true and delicate color. Do you plan to share your training results in the future? Hoping for further communication with you!
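
For reference, the full-res [B,12,H,W] coefficients mentioned here are applied to the image as a per-pixel 3x4 affine color transform. The following is a minimal sketch of that step only. The function name `apply_affine` and the channel layout (3 output channels, each with 3 weights followed by 1 bias) are assumptions for illustration; the layout used in this repository may differ.

```python
import torch


def apply_affine(coeffs: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
    """Apply a per-pixel 3x4 affine transform to an RGB image.

    coeffs: [B, 12, H, W], assumed layout: 3 output channels x (3 weights + 1 bias)
    image:  [B, 3, H, W]
    """
    b, _, h, w = image.shape
    a = coeffs.reshape(b, 3, 4, h, w)
    weights = a[:, :, :3]  # [B, 3 (out), 3 (in), H, W]
    bias = a[:, :, 3]      # [B, 3 (out), H, W]
    # Multiply each input channel by its weight, then sum over input channels.
    return (weights * image.unsqueeze(1)).sum(dim=2) + bias
```

With identity weights and zero bias the output equals the input, which is a quick way to check that the assumed layout matches a given checkpoint.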

creotiv commented 3 years ago

There is also a paper about deepguidedfilter (you can find it on GitHub); all other things tend to be much bigger and slower. Yeah, I'm planning to add a trained model with some results.

gybuaa commented 3 years ago

Hi, have you added any data augmentation? I am training it on a 32G Tesla V100. I find that this model is very small, but training it is very difficult. The PSNR on both the train and test datasets grows too slowly. I don't understand: since the model has few trainable parameters (just several conv layers and fc layers), why is training so hard?

creotiv commented 3 years ago

If training is hard, this means that you have big variance in your dataset. Basically this model should work for one situation. It can't handle all the variations, so check your data first. With small variance it should give a PSNR over 25 in an hour or less.

gybuaa commented 3 years ago

I just trained it on the paper's 5000-image FiveK dataset, using 500 of them as the validation set. I see that some of the paper's final results reach up to 30 PSNR after 2-3 days of training. So far I only get 15 PSNR after 24 hours. The results seem to be just brighter than the input images, and some results don't work. Have you reproduced the results of this paper? And how long did you train? Thanks a lot for your reply and patience!
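
PSNR numbers such as the 15 and 30 compared here are only comparable when computed the same way. A minimal sketch of the usual definition, assuming images are float tensors scaled to [0, 1] (so `max_val=1.0`); the helper name `psnr` is illustrative and the metric code in this repository may differ:

```python
import torch


def psnr(pred: torch.Tensor, target: torch.Tensor, max_val: float = 1.0) -> torch.Tensor:
    """Peak signal-to-noise ratio in dB: 10 * log10(max_val^2 / MSE)."""
    mse = torch.mean((pred - target) ** 2)
    return 10.0 * torch.log10(max_val ** 2 / mse)
```

If images are stored as 0..255 values instead, `max_val` would need to be 255 for the result to be on the same scale.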

creotiv commented 3 years ago

What params did you use to run training?

creotiv commented 3 years ago

Did you use the latest code from master?

gybuaa commented 3 years ago

Yes, I use the latest code from master. I just use the default parameters as the paper did: 1e-4 learning rate, Adam optimizer and default momentum settings. Because I use a 32G GPU, I increased the batch size to 32 and 64.

creotiv commented 3 years ago

Try to limit the batch size; 64 is too much. Try setting it to 8, and luma, spatial bins to 8, 16.

gybuaa commented 3 years ago

I am not sure what you mean by 'luma' and spatial bins. The low-res coefficient output shape of the bilateral grid is [B,12*8,H,W]; is spatial bins the grid's channel '8', and should I change it to 16? And what does 'luma' refer to? I will try it again! Thanks :)

creotiv commented 3 years ago

Luma bins is the grid dimension by value; spatial bins is the grid dimensions by position (x,y).
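
To make the two terms concrete, here is a small sketch of how a [B,12*luma_bins,S,S] network output can be unpacked into a 5D bilateral grid. The helper name `unpack_grid` and the packing order (coefficients first, then luma bins) are assumptions for illustration, not taken from this repository.

```python
import torch


def unpack_grid(raw: torch.Tensor, luma_bins: int, n_coeffs: int = 12) -> torch.Tensor:
    """Reshape [B, n_coeffs * luma_bins, S_y, S_x] into [B, n_coeffs, luma_bins, S_y, S_x].

    luma_bins is the grid size along the value (guide) axis;
    S_y and S_x are the spatial bins along image height and width.
    The channel packing order used here is an assumption.
    """
    b, c, sy, sx = raw.shape
    assert c == n_coeffs * luma_bins, "channel count must equal n_coeffs * luma_bins"
    return raw.reshape(b, n_coeffs, luma_bins, sy, sx)


# Illustrative values from the suggestion above: 8 luma bins, 16 spatial bins.
raw = torch.randn(2, 12 * 8, 16, 16)
grid = unpack_grid(raw, luma_bins=8)
print(grid.shape)  # torch.Size([2, 12, 8, 16, 16])
```

In this layout, the '8' in [B,12*8,H,W] is the luma dimension, while H and W of the low-res output are the spatial bins.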

gybuaa commented 3 years ago

Hi. Did you find that PyTorch's "grid_sample" function is too slow? I measured that this function costs 0.2~0.4 seconds per image, which means it is still some distance from real-time video processing...

creotiv commented 3 years ago

Did you run it on GPU? Because on my device it is pretty fast, much much less than 0.2s.
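
One way to compare such timings fairly: CUDA kernels launch asynchronously, so a wall-clock measurement needs a warm-up and an explicit `torch.cuda.synchronize()` before reading the clock. A minimal sketch, where the helper name `time_grid_sample` and the tensor sizes are illustrative assumptions rather than the shapes used in this repository:

```python
import time

import torch
import torch.nn.functional as F


def time_grid_sample(device: torch.device, n_iters: int = 10, size=(256, 256)) -> float:
    """Return the mean seconds per F.grid_sample call on the given device."""
    h, w = size
    # Hypothetical bilateral grid [B, 12, luma_bins, S_y, S_x] and sample coordinates in [-1, 1].
    volume = torch.randn(1, 12, 8, 16, 16, device=device)
    coords = torch.rand(1, 1, h, w, 3, device=device) * 2 - 1

    # Warm-up so one-time initialisation is not counted.
    for _ in range(3):
        F.grid_sample(volume, coords, align_corners=True)
    if device.type == "cuda":
        torch.cuda.synchronize()

    start = time.perf_counter()
    for _ in range(n_iters):
        F.grid_sample(volume, coords, align_corners=True)
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for queued kernels before stopping the clock
    return (time.perf_counter() - start) / n_iters


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"{device}: {time_grid_sample(device):.6f} s per call")
```

Running this once on CPU and once on GPU shows whether the device alone explains a gap like the one described here.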

gybuaa commented 3 years ago

Hi, consider that trilinear interpolation needs (x,y,z) coordinates scaled to [-1,1], but the sampled 'z', which is generated from the 1*1 point-wise NN and the 'sigmoid' activation function, is always a positive number. Does that mean the first 4 channels of our luma bins are not used? Do you think that is appropriate?

QiuJueqin commented 3 years ago

Same thought. Using nn.Tanh as activation should make more sense.
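
The concern can be checked directly with a tiny volume whose value at each voxel is its own depth index. This is a sketch assuming `align_corners=True` and 8 luma bins; the helper name `sampled_depth` is illustrative, and the settings used in this repository may differ.

```python
import torch
import torch.nn.functional as F


def sampled_depth(z: float, depth: int = 8) -> float:
    """Return the (fractional) depth index that grid_sample reads for a given z."""
    # Volume of shape [1, 1, D, 2, 2] where every voxel stores its depth index.
    vol = (
        torch.arange(depth, dtype=torch.float32)
        .view(1, 1, depth, 1, 1)
        .expand(1, 1, depth, 2, 2)
        .contiguous()
    )
    # A single sample point at the spatial centre (x=0, y=0); shape [1, 1, 1, 1, 3].
    grid = torch.tensor([[[[[0.0, 0.0, z]]]]])
    return F.grid_sample(vol, grid, mode="bilinear", align_corners=True).item()


print(sampled_depth(-1.0))  # 0.0 -> first luma bin
print(sampled_depth(0.0))   # 3.5 -> middle of the luma axis
print(sampled_depth(1.0))   # 7.0 -> last luma bin
```

Under these assumptions a guide value in (0, 1) only reaches depth indices between 3.5 and 7. Either `tanh`, or a sigmoid output `s` rescaled as `2 * s - 1`, spans the full axis; the thread does not settle whether that change alone fixes training.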

creotiv commented 2 years ago

I've added bilateral_slice from the original repo, compiled for JIT. But it still has some problems with optimization for some reason. So I think grid_sample was working correctly.

QiuJueqin commented 2 years ago

After some comparison with my customized tri-linear interpolation, which consists of multiple 2D bilinear interpolations, I'm now pretty sure that the second argument to F.grid_sample (grid) should be something like

torch.cat([wg, hg, guidemap], dim=3).unsqueeze(1)

instead of

torch.cat([hg, wg, guidemap], dim=3).unsqueeze(1)

Furthermore, elements in grid along all axes should be in [-1, 1] range, not [0, 1], which means in the guidance net, the activation should be torch.tanh, instead of torch.sigmoid.

The result of my customized slicing operator is very similar to that of F.grid_sample with inputs formatted as mentioned above. The absolute error is smaller than 1E-5:

all close with atol=1E-6:  False
all close with atol=1E-5:  True
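
The axis order can also be verified in isolation. For a 5D input of shape [N, C, D, H, W], the PyTorch documentation states that the last dimension of `grid` holds (x, y, z), where x indexes width, y indexes height and z indexes depth. A minimal sketch with a volume that encodes its own indices (the helper name `sample_at` and the sizes are illustrative; `align_corners=True` is assumed):

```python
import torch
import torch.nn.functional as F


def sample_at(vol: torch.Tensor, x: float, y: float, z: float) -> float:
    """Sample a [1, 1, D, H, W] volume at one normalised (x, y, z) location."""
    grid = torch.tensor([[[[[x, y, z]]]]], dtype=vol.dtype)  # [1, 1, 1, 1, 3]
    return F.grid_sample(vol, grid, align_corners=True).item()


D, H, W = 2, 3, 4
d = torch.arange(D).view(D, 1, 1)
h = torch.arange(H).view(1, H, 1)
w = torch.arange(W).view(1, 1, W)
# The value at voxel (d, h, w) is 100*d + 10*h + w, so the result reveals which axis moved.
vol = (100 * d + 10 * h + w).float().reshape(1, 1, D, H, W)

print(sample_at(vol, 1.0, -1.0, -1.0))  # 3.0   -> x moves along W
print(sample_at(vol, -1.0, 1.0, -1.0))  # 20.0  -> y moves along H
print(sample_at(vol, -1.0, -1.0, 1.0))  # 100.0 -> z moves along D
```

This matches the ordering proposed in this comment, provided `wg` holds the coordinate along image width and `hg` the coordinate along image height.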

Varato commented 2 years ago

Hi, I've been trying the model recently. I think the fundamental difficulty of training this model is that the guide prediction net and the bilateral grid (the low-res affine coeffs) update simultaneously. The bilateral grid is like a dictionary and the guide is like the keys used to look things up in it. When updated together, they have to adapt to each other constantly. I can only get a PSNR of about 18 on FiveK. I don't know how to improve it.

creotiv commented 2 years ago

This is an identical copy (I hope) of the original paper's code, but for some reason there are some problems with training. I tried to find the problem for a long time, but with no result. The grid is working fine and the networks are really simple, so I really don't have any clue. And now we have a war here, so there is no time for ML stuff any more(
