GuidoBartoli / sherloq

An open-source digital image forensic toolset
GNU General Public License v3.0

no_auto_scale=True can cause significant clipping and data loss #99

Closed atadams closed 1 month ago

atadams commented 1 month ago

In v0.89f, adding the no_auto_scale=True parameter to the raw.postprocess call in the load_image function (utility.py) can cause significant clipping and data loss when opening a RAW file. This is illustrated by viewing the histogram of this real-world, naturally taken gallery image from Imaging Resource.

Red and blue histogram with v0.89f no_auto_scale=True parameter

issue-noscaling

You can see a large percentage of red and blue pixels have 0 intensity and the image has a strong green tint. I believe the clipping is happening because the black levels are still being subtracted. With other images, the R, G, or B channels can be entirely 0 values.
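To make the suspected mechanism concrete, here is a minimal sketch in plain Python. It is a simplified model, not LibRaw's or Sherloq's actual code: the function name `to_8bit`, the black and white levels, the red gain, and the sample values are all assumptions chosen for illustration. It subtracts the black level from 14-bit samples, optionally stretches them to the 16-bit range with a channel gain, and then reduces them to 8 bits.

```python
def to_8bit(samples, black_level, white_level, gain=1.0, auto_scale=True):
    """Simplified model of RAW-to-8-bit conversion (illustrative only)."""
    out = []
    for value in samples:
        # Black-level subtraction happens whether or not scaling is enabled.
        value = max(value - black_level, 0)
        if auto_scale:
            # Stretch the usable sensor range to 16 bits and apply the channel gain.
            value = value * gain * 65535 / (white_level - black_level)
        value = min(int(value), 65535)
        out.append(value >> 8)  # keep the top 8 of 16 bits
    return out


# Hypothetical dark red-channel samples from a 14-bit sensor (assumed values).
red = [2048, 2100, 2200, 2300, 2600]
unscaled = to_8bit(red, black_level=2048, white_level=16383, auto_scale=False)
scaled = to_8bit(red, black_level=2048, white_level=16383, gain=2.0)

# Number of samples that end up at exactly 0 in each case.
print(unscaled.count(0), scaled.count(0))
```

In this model the unscaled samples collapse into the lowest few 8-bit levels, which is consistent with the pile-up at 0 described here. Whether LibRaw behaves this way for a particular file would need to be checked against the library itself.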

Alex Tutubalin, the developer of Libraw, has stated, “noauto scale is very special use parameter.”

libraw-01

You can see the results of removing the no_auto_scale=True parameter in the histograms below.

Red and blue histogram with no_auto_scale=True parameter removed

issue-scaling

The image’s color is more balanced and you can see the number of pixels with 0 intensity is significantly reduced.
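The share of zero-intensity pixels can also be measured rather than read off a histogram plot. The following is a small sketch in plain Python; the helper names `zero_fraction` and `histogram` and the sample values are hypothetical and only stand in for one channel of a decoded 8-bit image.

```python
def zero_fraction(channel):
    """Return the fraction of samples in a channel that are exactly 0."""
    if not channel:
        return 0.0
    return sum(1 for v in channel if v == 0) / len(channel)


def histogram(channel, levels=256):
    """Count how many samples fall on each 8-bit intensity level."""
    counts = [0] * levels
    for v in channel:
        counts[v] += 1
    return counts


# Hypothetical 8-bit red-channel samples, for illustration only.
red = [0, 0, 0, 3, 12, 0, 40, 0]
print(zero_fraction(red))
print(histogram(red)[0])
```

Running the same measurement on the output produced with and without no_auto_scale=True would give a single number per channel to compare.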

Recommendation

I recommend removing the no_auto_scale=True parameter from the load_image function in utility.py.

Thank you.

Ray9T commented 1 month ago

@atadams I think your use of term "Data loss" is very misleading and your lack of understanding is concerning.

The data is there, just not scaling, unlike a typical image viewer or image beautifying app that performs scaling (based on LUT) to fit your monitor recommendation.

Meaning you're seeing the data as the sensor received the photons and intensity ratios with no_auto_scale=True. If you adjust the brightness and other params, the image starts to appear. All the pixels are still there; you are confusing the "intensity" of the pixels with the "number" of pixels, Tony. I hope you realize how fundamental your misunderstanding is.

Please understand the difference between what a sensor sees and what a monitor displays, and what the process flow looks like. For VFX artists, the knowledge of reverse flow is important, else you'll produce fake looking images with banding in Histograms.


Anomalies in images are EXACTLY what Sherloq-like apps will FLAG. For example,

  1. Histogram showing unnatural color banding, or set within a narrow intensity range with gaps in RGB, which is near impossible in the real world
  2. Images requiring specific gamma settings are suspect. One should evaluate such findings against possible images made ON monitors with fixed settings, like matte painting (common among VFX people), which is made from sections of natural images.

Flagging VFX RAW images disguised as natural images to combat disinformation is a critical role in the field of forensics. Do you agree? If you do, then please consider not muddying the waters with false requests, as your requests will make it impossible to evaluate images that spread disinformation via synthetic creation process (VFX)


To anyone reading, and not trying to be rude, I can suggest few good books or articles on histograms, image processing if you need, please ask.

image

For VFX artists who make images on monitors and push it into a RAW file, knowing the process is extremely important.

This is expected behavior when reading unprocessed RAW data. Please focus on your objective and not specific settings, as your fundamentals of RAW and image processing are flawed, and it's best if you ask for what you need instead of making invalid suggestions.

If you need your specific image discussed, I'm happy to break it down for you based on the latest Sherloq app behavior, which is doing its job as a forensic app and not an image viewer.

Regarding image

Yes, auto scaling is indeed a special case where forensics is the right home. For regular image viewing I'd not recommend this setting, but for forensics, this is a 100% valid setting. Just because someone said the setting is "special" doesn't make it invalid. Right?

Again, your image Data is there and not lost my friend.

JohnConnor01 commented 1 month ago

@GuidoBartoli I am happy to provide corroborated results from another application. When it comes to the Red Channel and the Camera Sensor Data (and the camera sensor in general) with the 19 photos referenced in the previous issue. Tony presented one of these 19 photos in the previous issue. All 19 photos have specific readings in certain areas on the current version of Sherloq (your recommendation/compromise).

We can show this is an issue with the photos. And not the application. We have overwhelming evidence at this point.

If you would prefer us share this information publicly. Please let us know we can begin putting everything together. I would just ask for 48-72 hours from the time of notification as it will take some time to put everything together.

Link to 19 photos associated with Tony and many of his colleagues. The results of these 19 photos on the current version of Sherloq CAN be corroborated on a different application. Along with significantly more information with camera sensors on all 19 photos. https://drive.google.com/file/d/1JT0KOI1yJEtZVzdQtVBHWzyKujFDlBrb/view

TJPofTexas commented 1 month ago

JohnConnor01, are you also BobbyoO_ on Twitter / X?

"We have overwhelming evidence at this point."

I know you may be overwhelmed because your primary experience is trading memecoins, but Canon Digital Photo Professional 4.0 is the only manufacturer created utility for processing CR2 raw files and the results when processing the RAW with the Canon software, (or with just about any other raw software, rawdigger, darktable, etc) it's plain to see that the processing method in the current version of Sherloq is not properly debayering and scaling the non-green channels.

image image

Ray9T commented 1 month ago

@GuidoBartoli I am happy to provide corroborated results from another application. When it comes to the Red Channel and the Camera Sensors with the 19 photos referenced in the previous issue. Showing this is an issue with the photos. And not the application. We have overwhelming evidence at this point.

If you would prefer us share this information publicly. Please let us know we can begin putting everything together. I would just ask for 48-72 hours from the time of notification as it will take some time to put everything together.

Indeed, we need to start understanding how some images are unique/anomalous and STOP pushing for changes in the app until the problem goes away. In fact, just based on image fundamentals, the requests from Tony are very concerning, as anyone with image processing knowledge knows how flawed his statements are.

In the case of Tony, he does not understand the basics of histogram and how it applies to images. Tony's issue has NO merit any way you look at it. He is essentially asking for scaling in the guise of "data loss", which is not the case at all. Feels like a repeat of his last flawed request. He just wants his images to show up as they were made on their "monitors" and is finding loopholes to get those results at the expense of the app's integrity.

The pixels are all there, the data is there, but without scaling the raw data is showing linear ratios, which is what the camera sensor registered. I'd love to have this function in any forensic app. The issue is clearly with images and maybe it's better to review the images and not blame the software.

Ray9T commented 1 month ago

JohnConnor01, are you also BobbyoO_ on Twitter / X?

"We have overwhelming evidence at this point."

I know you may be overwhelmed because your primary experience is trading memecoins, but Canon Digital Photo Professional 4.0 is the only manufacturer created utility for processing CR2 raw files and the results when processing the RAW with the Canon software, (or with just about any other raw software, rawdigger, darktable, etc) it's plain to see that the processing method in the current version of Sherloq is not properly debayering and scaling the non-green channels.

image image

The image shared already has postprocess scaling applied to make it viewable, unlike Sherloq, which has a very different purpose and reads the raw files "close to unprocessed", which is VERY CRUCIAL in forensics.

If you are after an image viewer, may I suggest RawDigger, RawTherapee, darktable, Canon DPP, Adobe products, or just basic first-party image viewers or IrfanView.

In summary, these inquiries reveal deficiencies in the foundational principles of image processing, and interestingly coming from the same group (looks like) and dare I say, misleading the forensics app to turn it into yet another image viewer.

Request. Sherloq by showing data in the most transparent form (version = 0.89f) will help expose some image abnormalities, and one must fully understand the nature of ask, the images in question before arriving at any conclusions.

TJPofTexas commented 1 month ago

To you, poor fundamentals is opening a raw file in the factory, standard RAW file ingesting software.

Sounds like you're trying to perpetrate some kind of fraud relative to these photos Jonas took.

image

TJPofTexas commented 1 month ago

So glad that Ray9T has made this graphic to show they have no idea how color scale settings can result in leaving data out of range, and thus omitted, on conversion from CR2 RAW to other image formats.

image

atadams commented 1 month ago

Link to 19 photos associated with Tony and many of his colleagues. The results of these 19 photos on the current version of Sherloq CAN be corroborated on a different application. Along with significantly more information with camera sensors on all 19 photos.

Aren’t you claiming the images have no red channel?

It is true that when those images are opened in the current version of Sherloq, the red channels are only zero values. This is explained by my opening comment in this issue. That does not mean the red channel doesn’t exist, only that the red channel is being “zeroed out” when the black levels are subtracted.

I welcome Guido to look at any or all of those images as an illustration of the issue I described above.

atadams commented 1 month ago

In the case of Tony, he does not understand the basics of histogram and how it applies to images.

This is rude and uncalled for.

You've made several comments about me that are really out of line.

JohnConnor01 commented 1 month ago

Link to 19 photos associated with Tony and many of his colleagues. The results of these 19 photos on the current version of Sherloq CAN be corroborated on a different application. Along with significantly more information with camera sensors on all 19 photos.

Aren’t you claiming the images have no red channel?

It is true that when those images are opened in the current version of Sherloq, the red channels are only zero values. This is explained by my opening comment in this issue. That does not mean the red channel doesn’t exist, only that the red channel is being “zeroed out” when the black levels are subtracted.

I welcome Guido to look at any or all of those images as an illustration of the issue I described above.

Hello

Please re-read my comment. The results of the current version of Sherloq (Guido’s recommendation/compromise) with the 19 photos linked can be corroborated using another application. On top of that, significantly more information can be shown on the camera sensors on all 19 photos, showing how this is an issue with the 19 photos and not the application.

I am simply asking for notice of how that information would like to be released. As it would provide supporting evidence that this version of Sherloq is not reading those 19 photos in error.

There are distinct differences when analyzing these specific 19 RAW photos and other RAW photos from Imaging Resource. We are happy to illustrate that and/or provide any/all additional supporting evidence.

All RGB intensity values have been pushed to the left with post processing turned down to a minimum. That should have been expected. And turning auto scale back on (or using rawpy default) and the RGB intensity values increasing - should also be expected.

The compromise was intended to avoid biased results, as stated on the previous issue. This is a sample size of 1 or 2 you are presenting while asking for wholesale changes again.

But there are distinct differences with the 19 RAW photos linked. And many other RAW photos we have sampled. Those primarily show up in the Red Channel and/or Red mean vector PCA value.

My position is the current version of Sherloq is not reading those 19 photos in error. And I am happy to show the corroborated results with another application. Plus overwhelmingly more evidence of/relating to the camera sensors on all 19 photos.

atadams commented 1 month ago

My position is the current version of Sherloq is not reading those 19 photos in error. And I am happy to show the corroborated results with another application. Plus overwhelmingly more evidence of/relating to the camera sensors on all 19 photos.

This issue happens with every RAW file opened — including the 19 you reference.

JohnConnor01 commented 1 month ago

@GuidoBartoli

@atadams has raised another issue about clipping with RAW photos on the current version. The RGB intensity values are not getting scaled and are showing linear proportions, which is what you would expect when reading the camera sensor data. I would argue he is conflating two different things. This was supposed to be the version that avoids biased results. And we greatly appreciate you reaching that compromise.

However if his recommendation is accepted and implemented. That will BRING BACK clipping with RAW photos. As highlighted below with two CR2 RAW photos from Imaging Resource. I will go back and find the link to the original photo so this can be replicated. Tony's recommendation will bring back the very problem he said he was trying to solve. It appears there is big disconnect between what Tony says he wants, and the specific changes he is asking for.

Example 1 of RAW Photo being clipped after no_auto_scale=true is removed Link to Image: https://www.imaging-resource.com/PRODS/canon-7d-mark-ii/Y071A1955-proto.CR2.HTM

image

Example 2 of RAW Photo being clipped after no_auto_scale=true is removed Link to Image: https://www.imaging-resource.com/PRODS/canon-6d-mark-ii/Y_MG_1068.CR2.HTM

image

BakersTutz commented 1 month ago

@JohnConnor01 looks like those images were already clipped in camera. Both of those images contain direct sunlight reflecting off the water. If the highlights are already clipped, Sherloq isn't going to unclip it.
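One way to check whether highlights were already clipped at capture is to count raw samples at or above the sensor's saturation level before any postprocessing. This is a minimal sketch in plain Python; the function name `clipped_fraction`, the white level, and the sample values are assumptions for illustration. With rawpy, the corresponding inputs would presumably come from the `raw_image_visible` and `white_level` attributes of the opened file.

```python
def clipped_fraction(raw_samples, white_level):
    """Fraction of raw samples at or above the sensor's saturation level."""
    if not raw_samples:
        return 0.0
    saturated = sum(1 for v in raw_samples if v >= white_level)
    return saturated / len(raw_samples)


# Hypothetical 14-bit raw samples; 16383 is an assumed saturation level.
samples = [1200, 16383, 16383, 9000, 16383, 300]
print(clipped_fraction(samples, white_level=16383))
```

A high fraction here would point to clipping in the camera, since it is measured before any scaling option is applied.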

Ray9T commented 1 month ago

Link to 19 photos associated with Tony and many of his colleagues. The results of these 19 photos on the current version of Sherloq CAN be corroborated on a different application. Along with significantly more information with camera sensors on all 19 photos.

Aren’t you claiming the images have no red channel?

It is true that when those images are opened in the current version of Sherloq, the red channels are only zero values. This is explained by my opening comment in this issue. That does not mean the red channel doesn’t exist, only that the red channel is being “zeroed out” when the black levels are subtracted.

I welcome Guido to look at any or all of those images as an illustration of the issue I described above.

The right way to evaluate this is:

  1. Stop making changes to Sherloq, more so now considering we are as close to raw data as possible.
  2. Understand why Tony's 19 images are so anomalous.

atadams commented 1 month ago

That will BRING BACK clipping with RAW photos.

The two examples you provided appear underexposed or overexposed. As Guido pointed out in a previous issue, there is nothing Sherloq should do about that.

Ray9T commented 1 month ago

@JohnConnor01 looks like those images were already clipped in camera. Both of those images contain direct sunlight reflecting off the water. If the highlights are already clipped, Sherloq isn't going to unclip it.

Argument can be made that Tony's 19 images are already anomalous, and the sensor data is revealing the issues.

Anyway, the pretense that there is data loss when in reality it's just not scaling shows poor fundamental understanding of RAW file concepts.

atadams commented 1 month ago

It appears there is big disconnect between what Tony says he wants, and the specific changes he is asking for.

Also, could you please refrain from personal comments about me and just address the technical issues?

Ray9T commented 1 month ago

It appears there is big disconnect between what Tony says he wants, and the specific changes he is asking for.

Also, could you please refrain from personal comments about me and just address the technical issues?

You're misreading the statement. John said there's a disconnect between what's being asked, the rationale, and the specific settings you want to introduce to make the assumed problem go away.

Do you realize there is no data loss to begin with or is that still not clear to you?

atadams commented 1 month ago

Understand why Tony's 19 images are so anomalous.

They aren't my images, but Guido is obviously free to use them in his investigation. The fact that, when opened in Sherloq, the red channels of those images are comprised entirely of zero values is an excellent example of the issue I'm describing.

JohnConnor01 commented 1 month ago

Test Photo 1 https://www.imaging-resource.com/PRODS/canon-7d-mark-ii/Y071A1955-proto.CR2.HTM

Test Photo 2 https://www.imaging-resource.com/PRODS/canon-6d-mark-ii/Y_MG_1068.CR2.HTM

Test Photo 1 on Current Version (No Clipping) image

Test Photo 1 after no_auto_scale=true is REMOVED (Clipping) image

Test Photo 2 on Current Version (No Clipping) image

Test Photo 2 after no_auto_scale=true is REMOVED (Clipping) image

If the issue is: Clipping with RAW Photos. It appears Tony's recommendation will only increase the frequency at which RAW Photos are clipped while using Sherloq. The compromise again was intended to avoid bias results.

JohnConnor01 commented 1 month ago

Understand why Tony's 19 images are so anomalous.

They aren't my images, but Guido is obviously free to use them in his investigation. The fact that, when opened in Sherloq, the red channels of those images are comprised entirely of zero values is an excellent example of the issue I'm describing.

"the red channels of those images are comprised entirely of zero values is an excellent example of the issue I'm describing"

Tony, have you checked other sources to see if they corroborate these results? Because we have. And other sources corroborate these very results.

Ray9T commented 1 month ago

They aren't my images, but Guido is obviously free to use them in his investigation. The fact that, when opened in Sherloq, the red channels of those images are comprised entirely of zero values is an excellent example of the issue I'm describing.

Good, then respectfully, let's investigate and not rule images out as a problem. Blaming the software to cause data loss is getting old and shows lack of fundamental understanding of RAW file processing.

atadams commented 1 month ago

Tony, have you checked other sources to see if they corroborate these results? Because we have. And other sources corroborate these very results.

Yes, I have checked other sources. The 19 images you refer to all have red channels in every imaging app I've tried. To claim they don't is not reasonable.

If you have information that will help Guido make a decision, I suggest you provide it.

atadams commented 1 month ago

Good, then respectfully, let's investigate and not rule images out as a problem. Blaming the software to cause data loss is getting old and shows lack of fundamental understanding of RAW file processing.

I'll ask again that you stop with the personal attacks.

atadams commented 1 month ago

If the issue is: Clipping with RAW Photos. It appears Tony's recommendation will only increase the frequency at which RAW Photos are clipped while using Sherloq. The compromise again was intended to avoid bias results.

Again, those images are underexposed or overexposed. The clipping you are showing isn't due to Sherloq.

Guido mentioned this in the issue you opened previously and asked that examples like this not be provided anymore.

JohnConnor01 commented 1 month ago

Tony, have you checked other sources to see if they corroborate these results? Because we have. And other sources corroborate these very results.

Yes, I have checked other sources. The 19 images you refer to all have red channels in every imaging app I've tried. To claim they don't is not reasonable.

If you have information that will help Guido make a decision, I suggest you provide it.

I am happy to provide corroborated results. Given the repeated requests to change software settings and applications in recent history. I am waiting for @GuidoBartoli response on how to publish that information. I can provide it publicly or privately.

Separately. I highlighted how your recommendation will only bring back the very problem (clipping with RAW photos) that you say you are trying to solve.

Those photos are not being clipped on the current version. But they will be when you apply your recommendation.

atadams commented 1 month ago

Those photos are not being clipped on the current version. But they will be when you apply your recommendation.

They were clipped (i.e., underexposed or overexposed) when the photo was taken. My recommendation would not change that.

Ray9T commented 1 month ago

Good, then respectfully, let's investigate and not rule images out as a problem. Blaming the software to cause data loss is getting old and shows lack of fundamental understanding of RAW file processing.

I'll ask again that you stop with the personal attacks.

Where's personal attack in pointing out the fundamental fallacy. Your title says "Data loss" and "clipping", and then you ask for removal of Scaling. Do you realize there is

  1. No Data loss
  2. Scaling does not bring any data back miraculously. The data is already there in Sherloq when it's calling raw.postprocess; scaling is not applied, as an agreed-upon principle, to provide DATA as close to RAW as possible. Again, no scaling != data loss, and there is no personal attack in this.

On the contrary, I anticipated a thank you for identifying the fallacy, which I believed would save readers a significant amount of time. This is a forensic app, and I hope people respect this special purpose application and not turn it into a generic image viewer.

atadams commented 1 month ago

Where's personal attack in pointing out the fundamental fallacy.

When you say the issue is my lack of fundamental understanding, it's personal. And I'll ask you again to stop.

Ray9T commented 1 month ago

Where's personal attack in pointing out the fundamental fallacy.

When you say the issue is my lack of fundamental understanding, it's personal. And I'll ask you again to stop.

I understand your perspective, but my intention is simply to show that your concern is not a valid argument. If you share your images, i may be able to help you understand if they are real images or VFX. VFX images often have issues you are pointing out and need specific settings to view them as painted. I'm happy to investigate them for you.

We can take the discussion/investigation to my repo to avoid digressions.

atadams commented 1 month ago

If you share your images,

Again. This issue isn't about any specific images. It happens with all images.

atadams commented 1 month ago

i may be able to help you understand if they are real images or VFX.

It's not appropriate to be soliciting business on GitHub issues. It might be considered spam.

Ray9T commented 1 month ago

If you share your images,

Again. This issue isn't about any specific images. It happens with all images.

The issue you brought up is a non-issue, as I pointed out previously. This only highlights, if I may, a fundamental misunderstanding. I'm at a loss for how else to express it.

Whenever I point out a technical misunderstanding in your post, you interpret it as a personal attack, which it is not. Hopefully you read this in a positive light.

It's not appropriate to be soliciting business on GitHub issues. It might be considered spam.

You misunderstood and allow me to rephrase. I meant to avoid spam, i can break down your images for potential anomalies instead of shaping Sherloq to render your image exactly as you desired.

Use any other image viewer if you dislike gamma (1,1) / no scaling. Is that not an option for you?

atadams commented 1 month ago

You misunderstood and allow me to rephrase. I meant to avoid spam, i can break down your images for potential anomalies instead of shaping Sherloq to render your image exactly as you desired.

Your assumption is I need your help. I don't. Please refrain from any mention of my abilities or skills and stick to the issue.

GuidoBartoli commented 1 month ago

Guys... every time we make a change within the RAW file import a ruckus happens here, you really make me regret leaving this feature inside Sherloq...

Let's make this the last adjustment on this, shall we?

In 4f89cfa6da60f24e2c18ab6a658dcca98f850362 commit, the postprocess() function is called with no_auto_bright=True and use_camera_wb=True.
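For readers who want to try the same options outside Sherloq, a minimal sketch of a rawpy call follows. It is not the code from the commit: the helper names `postprocess_options` and `load_raw_as_rgb` are hypothetical, and every option not listed is assumed to stay at rawpy's default.

```python
def postprocess_options():
    """Options named in the comment above; no_auto_scale is deliberately absent."""
    return {"no_auto_bright": True, "use_camera_wb": True}


def load_raw_as_rgb(path):
    """Decode a RAW file to an RGB array using the options above."""
    # rawpy is a third-party package; it is imported here so that
    # postprocess_options() can be inspected without it installed.
    import rawpy

    with rawpy.imread(path) as raw:
        return raw.postprocess(**postprocess_options())
```

Calling `load_raw_as_rgb("photo.CR2")` (a placeholder path) would return the decoded image as an array, which can then be fed to a histogram check.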

Again, please do not spam and pull more discussions out of the hat for their own sake.

Ray9T commented 1 month ago

@GuidoBartoli this is strange, and you accepted that there is data loss with no scaling = true. Can you educate me on how this is a fact? If not, you just turned Sherloq into a regular image viewer. Is this what you really want? considering you said this is your last change!

As an open-source app owner, you want to see both sides, maintain the integrity of the application and changes based on sound technicals.

With scaling on you invariably introduced noise, so can you please introduce noise removal tools please?

BakersTutz commented 1 month ago

Since Sherloq's inception, "no_auto_scale=true" has never been enabled. It was only last week when it was added. He's just going back to the way it's always been, now without auto changing the white balance or brightness of the image. Sounds like he's leaving the image "as shot" from the camera. Sounds good to me.

Ray9T commented 1 month ago

Since Sherloq's inception, "no_auto_scale=true" has never been enabled. It was only last week when it was added. He's just going back to the way it's always been, now without auto changing the white balance or brightness of the image. Sounds like he's leaving the image "as shot" from the camera. Sounds good to me.

You're advocating for the version without Tony's changes, good! Then we should go back to (auto_bright=True) and leave everything out!

GuidoBartoli commented 1 month ago

I simply removed the no_auto_scale option, because I didn't realize it would cause the problems @atadams pointed out instead (thanks for reporting that).

Since Sherloq's inception, "no_auto_scale=true" has never been enabled. It was only last week when it was added. He's just going back to the way it's always been, now without auto changing the white balance or brightness of the image. Sounds like he's leaving the image "as shot" from the camera. Sounds good to me.

Yes, that's the point.

As already explained here, I would again point out that the option to load a RAW file within a forensic analysis program really doesn't make much sense (I don't think even commercial products like Authenticate have it either).