cchen156 / Learning-to-See-in-the-Dark

Learning to See in the Dark. CVPR 2018
http://cchen156.web.engr.illinois.edu/SID.html
MIT License
5.45k stars 842 forks

how to test the model on my own dataset #75

Open AksChunara opened 5 years ago

cchen156 commented 5 years ago

If you need to test your JPG images from another camera, that is not supported.

noamgot commented 5 years ago

@cchen156 what about testing it on a raw image (specifically, a NEF image shot with a Nikon camera)? I want to see if it works on my image.

gaseosaluz commented 5 years ago

@noamgot Assuming that the sensor type of your camera is one of the types covered by this work (my guess is that you likely have a Bayer sensor, as it is the most common; all of my cameras have this type) and that the NEF format is readable by RawPy, you should be able to try this. Note that you will have to set the amplification factor (discussed in the paper), and you may need to adjust the black level (this is set to a fixed value in the code, but you can read it with RawPy).

Once you have this, you can run your image through the system and check the results. Note that the authors state that this may not generalize well. However, I have had good results trying this with RAW iPhone X images (I did have to modify the code in several places, but only the image manipulation and preparation; the rest works as is).

I hope this is helpful.

noamgot commented 5 years ago

Thanks! I changed the black level constants and now it works well.

l0stpenguin commented 5 years ago

@gaseosaluz I had tried with some RAW images from an iPhone X too, but not all results were good. Can you elaborate on the parts you modified?

gaseosaluz commented 5 years ago

@mevinDhun The first thing you have to do is make sure that you use RAW images; I just want to confirm that you are doing that. I captured my RAW images with an iPhone Xs using the VSCO app (https://apps.apple.com/us/app/vsco/id588013838). You don't have to use it, but I happen to know that this app generates DNG files that RawPy will open. I assume that you have checked this, but I wanted to mention that part of my setup.

The code modifications I made were minor. I have extensively refactored the code for my own use and converted it to work with Jupyter notebooks, but in reality you do not need to do that. The original code works as is if you keep the following in mind:

im = np.maximum(im - 512, 0) / (16383 - 512) # subtract the black level

The 512 is the black level. You can guess at the number or use RawPy to read it from the image:

black_level = raw.black_level_per_channel[0] # assume they're all the same

This will get you the black level for one of the channels.
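
Putting those two pieces together, a minimal sketch of the normalization could look like this (the file name is made up, and 16383 assumes a 14-bit sensor, so adjust for your camera):

import numpy as np
import rawpy

raw = rawpy.imread('my_dark_shot.dng')  # hypothetical file name; use your own RAW image

# Read the black level from the file instead of hardcoding 512
# (this assumes all four channels share the same value).
black_level = raw.black_level_per_channel[0]

im = raw.raw_image_visible.astype(np.float32)
# 16383 assumes a 14-bit sensor; use 4095 for a 12-bit one.
im = np.maximum(im - black_level, 0) / (16383 - black_level)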

I hope this is helpful.

aasharma90 commented 5 years ago

Hi @noamgot,

Could you please tell me how you did that? I too have some NEF files I would like to test this code on. Thanks in advance!

noamgot commented 5 years ago

@aasharma90 sure - I thought that I already explained, but now I see I didn't :)

So - I found the black level for each channel and the maximal value using this code segment (taken from this comment in another issue):

>>> import numpy as np
>>> import rawpy
>>> pic = rawpy.imread('long/00001_00_10s.ARW')
>>> pic.black_level_per_channel
[512, 512, 512, 512]
>>> np.max(pic.raw_image)
16383

I copied the exact same example; what you should do is load your own raw image (using rawpy.imread) and proceed in the same way.

Then, in test_Sony.py (that's the file I use because I have a Nikon camera, which has a Bayer sensor), I changed the numbers in pack_raw to the corresponding values I got (in my case, a black level of 600 and a maximal pixel value of 16350), as sketched below.
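
Roughly, the modified pack_raw looks like this (a sketch rather than the exact repo code; 600 and 16350 are the values I measured for my camera, so substitute your own):

import numpy as np

def pack_raw(raw):
    # Pack the Bayer RAW data into 4 channels (R, G, B, G).
    im = raw.raw_image_visible.astype(np.float32)

    # Camera-specific values: the original Sony code uses 512 and 16383 here;
    # 600 and 16350 are what I measured for my Nikon with rawpy.
    black_level = 600
    white_level = 16350
    im = np.maximum(im - black_level, 0) / (white_level - black_level)

    im = np.expand_dims(im, axis=2)
    H, W = im.shape[0], im.shape[1]
    out = np.concatenate((im[0:H:2, 0:W:2, :],   # R
                          im[0:H:2, 1:W:2, :],   # G
                          im[1:H:2, 1:W:2, :],   # B
                          im[1:H:2, 0:W:2, :]),  # G
                         axis=2)
    return out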

You may use the Google Colab notebook I put together for this. It clones the git repository, downloads the models, and, given a raw input image you provide, uses the trained model to brighten the image and save it: https://colab.research.google.com/drive/12oj5J6RCWPZceFnlEdS8jiRgmIk9FPaY

If you want to use this notebook directly, upload your raw file to the Colab environment and change the variable in_path (in one of the last code cells) to the path of this file. You may also change the output file name in the last code cell. Change the ratio to the desired amplification ratio, and don't forget to change the black level as explained!

Here's my result (exposure time was 1/10s, and I think the ratio was 100): [result image]

aasharma90 commented 5 years ago

Hi @noamgot

Many thanks for your reply. I have a Nikon D80 camera, and when I check the black level using the same snippet above, I get 0, and the max. value returned by np.max() varies from image to image (4041, 1389, 4095, 535, and so on). So, I am not sure how to normalize the raw images correctly.

An additional difference I found is the raw_pattern matrix of this camera (which can be seen by checking raw.raw_pattern): it is [[3, 2], [0, 1]], whereas for the raw images provided in the Sony dataset folder the matrix is [[0, 1], [3, 2]]. Does this mean the packing operation in pack_raw() has to be corrected?

Also, just in case you could spare some time to help me out, could you please get in touch with me at aashish.sharma@u.nus.edu? I'll be highly grateful.

aasharma90 commented 5 years ago

Dear @cchen156 @noamgot

Also, I've got some preliminary results by doing the following:

1) Normalizing with im = np.maximum(im - raw.black_level_per_channel[0], 0) / (16383 - raw.black_level_per_channel[0])

2) Modifying the packing operation from out = np.concatenate((im[0:H:2, 0:W:2, :], im[0:H:2, 1:W:2, :], im[1:H:2, 1:W:2, :], im[1:H:2, 0:W:2, :]), axis=2) to out = np.concatenate((im[1:H:2, 0:W:2, :], im[1:H:2, 1:W:2, :], im[0:H:2, 1:W:2, :], im[0:H:2, 0:W:2, :]), axis=2) to take care of the different raw_pattern observed for my raw images.

The results look denoised and enhanced but are quite off in color; they have a red cast, and I'm not able to figure out why. An example is below (the first is the scaled input image, captured at an exposure time of 1/10s; the second is the output from the network with ratio=100): [input image] [output image]

Do you know what could be the problem here? Many thanks again for your time!

noamgot commented 5 years ago

@aasharma90 your results don't look so bad, IMHO.

I'm afraid that I will not be able to assist more than I already did. I didn't investigate the code thoroughly, and I created the Colab notebook just so I could produce an image for a presentation I gave about the paper... sorry :\

aasharma90 commented 5 years ago

Dear @noamgot,

Yes, the results are OK, but I'm unsure whether they are optimal or there is still some problem in my test code. Also, thanks again for all your help! :) I hope @cchen156 could provide some assistance.

l0stpenguin commented 5 years ago

@noamgot Thanks for the reply. I had done the same thing as you did, except I used a hardcoded black level; changing it to the way you showed did not help. Here is an example of how bad it is on some pictures (I used a ratio of 650 and the Sony model checkpoint). Raw DNG from iPhone X: [screenshot]

Result: [screenshot]

Ground truth: [screenshot]

troyliuyue commented 5 years ago

@aasharma90

Modifying the packing operation from out = np.concatenate((im[0:H:2, 0:W:2, :], im[0:H:2, 1:W:2, :], im[1:H:2, 1:W:2, :], im[1:H:2, 0:W:2, :]), axis=2) to out = np.concatenate((im[1:H:2, 0:W:2, :], im[1:H:2, 1:W:2, :], im[0:H:2, 1:W:2, :], im[0:H:2, 0:W:2, :]), axis=2) to take care of the different raw_pattern observed for my raw images.

May I know the pattern of yours?

Thanks!

aasharma90 commented 5 years ago

Hi @troyliuyue,

This is what I observe for my data (captured using a Nikon D80), i.e. raw_pattern = [[3, 2], [0, 1]]: [screenshot]

This is what I observe for the RAW files available from the Sony dataset, i.e. raw_pattern = [[0, 1], [3, 2]]: [screenshot]

Hence, I had to pack the raw data in a different order to get the results shown above for my data. Hope it helps!

troyliuyue commented 5 years ago

Hi @troyliuyue,

This is what I observe for my data (captured using a Nikon D80), i.e. raw_pattern = [[3, 2], [0, 1]]: [screenshot]

This is what I observe for the RAW files available from the Sony dataset, i.e. raw_pattern = [[0, 1], [3, 2]]: [screenshot]

Hence, I had to pack the raw data in a different order to get the results shown above for my data. Hope it helps!

Thanks for your information and help. I really appreciate it.

KashyapCKotak commented 5 years ago

@aasharma90 @troyliuyue Can you please explain the output of raw_pattern to me? I am somehow not able to make out what it signifies and how the pack function was changed because of it. I tried to search online but didn't find any info.

aasharma90 commented 5 years ago

Hi @KashyapCKotak ,

As per my understanding, the pattern [[0, 1]; [3, 2]] denotes an [[R, G]; [G, B]] pattern ({0, 1, 2} for {R, G, B} and 3 for the extra G channel). So, the packing operation is

R = im[0:H:2, 0:W:2, :] # Every alternating value starting from position (0,0) is red
G = im[0:H:2, 1:W:2, :] # Every alternating value starting from position (0,1) is green
B = im[1:H:2, 1:W:2, :] # Every alternating value starting from position (1,1) is blue
G_e = im[1:H:2, 0:W:2, :] # Every alternating value starting from position (1,0) is green extra
out = np.concatenate((R, G, B, G_e), axis=2) # Always in R-G-B-G format

You can see that if the pattern is [[3, 2]; [0, 1]], it denotes [[G, B]; [R, G]], so you just need to change the starting positions to pack the channels. For instance, for red, you now need to do

R = im[1:H:2, 0:W:2, :] # Every alternating raw value starting from position (1,0) is red

and similarly for the other channels. Remember that the method only accepts inputs in R-G-B-G format, so you will always need to do the concatenation operation as shown above.
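
For example, a sketch of the full packing for the [[3, 2]; [0, 1]] pattern (the one I see on my Nikon D80); only the starting offsets change compared to the Sony case:

# Packing for raw_pattern [[3, 2], [0, 1]], i.e. [[G_extra, B], [R, G]]
R = im[1:H:2, 0:W:2, :] # red starts at position (1, 0)
G = im[1:H:2, 1:W:2, :] # green starts at position (1, 1)
B = im[0:H:2, 1:W:2, :] # blue starts at position (0, 1)
G_e = im[0:H:2, 0:W:2, :] # extra green starts at position (0, 0)
out = np.concatenate((R, G, B, G_e), axis=2) # always concatenated in R-G-B-G order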

KashyapCKotak commented 5 years ago

Great explanation! Thanks. I had one more question: while taking the image patch, is there any significance to the number 512, or can it be anything?

aasharma90 commented 5 years ago

Nothing specific about 512, but we need to ensure that the patch size is a power of 2 so that there is no ambiguity during downsampling within the network (imagine having to downsample a 5x5 tensor; there is a problem in deciding the spatial size of the output). The minimum patch size is also decided by the number of downsampling operations: for instance, if you have, say, 4 such operations within your network, then you will need at least a 32x32 (2^5) patch so that the smallest tensor is at least 2x2. Other considerations can be your application of the network, memory size, etc.
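
As a quick illustrative check (the choice of 4 downsampling steps here is only an assumed example, not necessarily the exact network depth):

def min_patch_size(num_downsamples, smallest_spatial=2):
    # Smallest patch that still leaves a smallest_spatial x smallest_spatial
    # feature map after num_downsamples 2x downsampling steps.
    return smallest_spatial * (2 ** num_downsamples)

print(min_patch_size(4))    # 32
print(512 % (2 ** 4) == 0)  # True: a 512x512 patch downsamples cleanly four times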

KashyapCKotak commented 5 years ago

One more question: what's up with the number 16383? Is it due to the assumption that it's a 14-bit image? I see in another issue (https://github.com/cchen156/Learning-to-See-in-the-Dark/issues/73) that the max value is read from the image instead of being hardcoded. Don't you think the max value in a given image may not always reach the maximum a 14-bit image can hold, and so may not be reliable for normalizing? How do you programmatically retrieve this info from the image?

aasharma90 commented 5 years ago

The basic idea is to normalize the RAW values from their [min, max] range to [0, 1]. The min. is decided by the black level of the camera. You can easily check the black level of your raw image (or camera) with raw.black_level_per_channel, as shown earlier: [screenshot]

The max. is usually decided by the number of bits per channel, which is typically 12-bit (max = 2^12 - 1 = 4095) or 14-bit (max = 2^14 - 1 = 16383). I think you should have this information beforehand, or, as suggested in #73, you can probe it from the max. value, but be advised that it may vary with the image and may not always give you the true max. value. So, you may want to check it over many raw images from the same camera (or just an image taken in very bright conditions so that your raw values nearly saturate). You can also try finding this information in the EXIF tags, but I'm not sure of the correct tag to check there.
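
If it helps, here is a small sketch of how you could probe both values with rawpy (the file name is hypothetical, and I haven't verified raw.white_level against every camera, so treat it as a hint rather than ground truth):

import numpy as np
import rawpy

raw = rawpy.imread('bright_scene.nef')  # ideally an image with nearly saturated areas

black = raw.black_level_per_channel[0]
print(raw.white_level)                # white level reported by the library
print(np.max(raw.raw_image_visible))  # largest value actually observed in this image

# Normalize to [0, 1] with whichever maximum you trust for your camera.
white = raw.white_level
im = np.maximum(raw.raw_image_visible.astype(np.float32) - black, 0) / (white - black)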

hircoder commented 4 years ago

@gaseosaluz Could you please share your code?

I tried to find a way to run inference without GT, but the quality of the results is not good. Much appreciated.

gaseosaluz commented 4 years ago

Sorry for the delay in responding. I have not worked on this project in a while.

I will try to get to this and share the code in a separate repository. Currently I cannot do it because it is part of other work that cannot be shared. I will post a note here when I do. Unfortunately, I do not have an ETA for this.


Rokazas commented 4 years ago

How do you determine a camera's sensor type (Bayer or not) and its pattern?

Sanket-15 commented 4 years ago

@aasharma90 I just had a small doubt: do we need ground-truth images even for testing the model, or is it enough to just have the short-exposure image?

Would really appreciate your feedback.

aasharma90 commented 4 years ago

Hi @Sanket-15,

For testing, you only need the short-exposure RAW image. There is an optional setting you can use to control how much intensity amplification you need (the default value is 100, if I recall correctly). You only need the long-exposure RGB/JPEG image if you wish to see/evaluate how good the predicted result is.
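
For reference, here is a rough sketch of the inference-only flow without any ground truth (pack_raw, sess, in_image and out_image are assumed to be the objects already built in test_Sony.py, so treat this as an outline rather than a drop-in script):

import numpy as np
import rawpy

ratio = 100  # amplification ratio; tune this per image

raw = rawpy.imread('my_short_exposure.ARW')  # hypothetical path to your RAW file
input_full = np.expand_dims(pack_raw(raw), axis=0) * ratio
input_full = np.minimum(input_full, 1.0)     # clip after amplification

output = sess.run(out_image, feed_dict={in_image: input_full})
output = np.minimum(np.maximum(output, 0), 1)[0]  # clip to [0, 1], drop batch dim

# Save output (H x W x 3, float in [0, 1]) with any image library you like.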

Sanket-15 commented 4 years ago

Hi @Sanket-15,

For testing, you only need the short-exposure RAW image. There is an optional setting you can use to control how much intensity amplification you need (the default value is 100, if I recall correctly). You only need the long-exposure RGB/JPEG image if you wish to see/evaluate how good the predicted result is.

@aasharma90 Thank you for the reply. I had this doubt because test_Sony.py takes both short- and long-exposure images as input. Also, in the code I can find many references to gt, for example: gt_dir = './dataset/Sony/long/'.

What I want to do is test my own set of raw short-exposure images without using the long-exposure images. Can you please tell me about the optional setting? Also, can I delete all the gt references and directly multiply the raw input images by an amplification factor? Will this solution work?

aasharma90 commented 4 years ago

Hi @Sanket-15,

I'd be happy to help you, but could you please contact me directly at aashish.sharma@u.nus.edu instead, so that we don't unintentionally spam this thread? Thanks!