Closed bz2 closed 3 years ago
The difference is from floating-point inaccuracy. Various transformed representations of the image have to sum back to exactly 1.0, and floats don't do that.
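As a minimal illustration of that (toy code, not from the dssim codebase):

```rust
// Toy illustration, not dssim code: ten equal f32 weights that
// conceptually sum to exactly 1.0 pick up rounding error instead.
fn main() {
    let sum: f32 = std::iter::repeat(0.1_f32).take(10).sum();
    println!("sum = {}, off by {:e}", sum, (sum - 1.0).abs());
    assert_ne!(sum, 1.0); // 0.1 is not exactly representable in binary
}
```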
The issue is that the magnitude of the inaccuracy seems very close to that of the quality signal itself.
For example:

| original | j95 | size |
|---|---|---|
| 0.001938 | 0.002202 | 4096x4096 |
| 0.000388 | 0.001070 | 256x256 |
The original image at high resolution has nearly double the value of a lossy-compressed image at low resolution? It's true that JPEG at that quality is genuinely hard to distinguish, but this makes the metric awkward to use for my purposes unless it's clearer how to interpret the delta in values when both the compression method and the resolution of the inputs vary.
@kornelski I realise this issue is not very actionable as it stands. In an effort to change that, but accepting that my understanding of your changes is limited, are any of the following three approaches workable?
- Some of the code in `dssim.rs` uses `f64`, but most uses `f32`. Would making the core parts double precision make the total error small enough to round out at the end?
- Seems the C++ implementation has:

  ```cpp
  // default settings
  double C1 = 6.5025, C2 = 58.5225;
  ```

  whereas `fn compare_scale` in `dssim.rs` has:

  ```rust
  let c1 = 0.01 * 0.01;
  let c2 = 0.03 * 0.03;
  ```

  This probably means smaller absolute values for the difference signal?
- Would it be possible to round individual values to a reasonable precision before doing (simplified) `(1 + EPSILON) ** HUGE` and expanding the error?
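As a rough feel for the magnitudes in the first option (toy values, nothing dssim-specific), the same reduction run in `f32` and `f64` diverges measurably:

```rust
// Toy reduction, not dssim code: the same sum in f32 vs f64.
// The divergence is roughly the scale of error that widening the
// core arithmetic to double precision could remove.
fn main() {
    let sum64: f64 = (1..=100_000).map(|i| 1.0_f64 / i as f64).sum();
    let sum32: f32 = (1..=100_000).map(|i| 1.0_f32 / i as f32).sum();
    println!("f64:  {}", sum64);
    println!("f32:  {}", sum32);
    println!("diff: {:e}", (sum64 - sum32 as f64).abs());
}
```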
C++ uses values in the 0..255 range, and I use values in the 0..1 range:

```
sqrt(6.5025) / 255 = 0.01
sqrt(58.5225) / 255 = 0.03
```
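A quick check of that rescaling (plain arithmetic, not dssim code):

```rust
// Rescaling the C++ 0..255-range SSIM constants to the 0..1 range
// recovers the Rust constants: sqrt(C)/255 gives 0.01 and 0.03,
// i.e. c1 = 0.01 * 0.01 and c2 = 0.03 * 0.03.
fn main() {
    let (c1_cpp, c2_cpp): (f64, f64) = (6.5025, 58.5225);
    let k1 = c1_cpp.sqrt() / 255.0;
    let k2 = c2_cpp.sqrt() / 255.0;
    println!("k1 = {:.6}, k2 = {:.6}", k1, k2); // 0.010000, 0.030000
    assert!((k1 - 0.01).abs() < 1e-12);
    assert!((k2 - 0.03).abs() < 1e-12);
}
```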
Check out: 6eff7fb2065e226487104c2ca0dbcc5196755c5f
I've added double precision in lots of places
@kornelski Thanks for giving it a shot! I did build locally and saw it had... no effect at all, which made me wonder if there's something else going on?
This evening I had a crack at a (dumb, big-hammer) extension of your change, and just changed everything I could find to `f64` to see if I could get an effect. See https://github.com/kornelski/dssim/compare/main...bz2:scratch_prec (not to be landed). The results are... mixed? The changes do produce a different final output number, but only on the order of 1e-05, and the variation appears to often be around that size, even when the input size is creating error values of up to 1e-03. So I'm still wondering, basically.
Are you on macOS? On macOS I use Apple's Accelerate framework to perform blur. Maybe that is imprecise?
Not on macOS, so not using the ffi code. My platform:
```
$ rustc -V -v
rustc 1.44.1 (c7087fe00 2020-06-17)
binary: rustc
commit-hash: c7087fe00d2ba919df1d813c040a5d47e43b0fe7
commit-date: 2020-06-17
host: x86_64-unknown-linux-gnu
release: 1.44.1
LLVM version: 9.0
```
Simple steps to reproduce, creating a trivial flat-colour input image using ImageMagick:

```
$ convert -size 16x16 xc:white white16.png
$ cargo build --release
$ target/release/dssim white16.png white16.png
```
Seems like the colour of the 16x16 image has a big effect!
```
$ for f in *.png; do target/release/dssim $f $f; done
0.00117063 black16.png
0.00184506 blue16.png
0.00336392 cyan16.png
0.00504601 gold16.png
0.00225988 green16.png
0.00415885 lime16.png
0.00451364 red16.png
0.00400376 white16.png
0.00540112 yellow16.png
```
A much bigger, less uniform JPEG digital-camera image has a smaller difference with itself:
```
$ target/release/dssim DSCF0048.JPG DSCF0048.JPG
0.00030001 DSCF0048.JPG
```
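For what it's worth, a single-window SSIM computed on two identical buffers through the same code path comes out as exactly 1.0, since the numerator and denominator round identically; that would suggest the non-zero self-scores come from an earlier stage of the pipeline. A sketch using the standard SSIM formula with the 0..1-range constants (this is not the dssim implementation, which works on blurred, multi-scale channels):

```rust
// Hypothetical single-window SSIM over raw 0..1 values, only to show
// that identical inputs through an identical code path divide out to
// exactly 1.0 despite f32 rounding along the way.
fn ssim(x: &[f32], y: &[f32]) -> f32 {
    let (c1, c2) = (0.01_f32 * 0.01, 0.03_f32 * 0.03);
    let n = x.len() as f32;
    let mx = x.iter().sum::<f32>() / n;
    let my = y.iter().sum::<f32>() / n;
    let vx = x.iter().map(|v| (v - mx).powi(2)).sum::<f32>() / n;
    let vy = y.iter().map(|v| (v - my).powi(2)).sum::<f32>() / n;
    let cov = x.iter().zip(y).map(|(a, b)| (a - mx) * (b - my)).sum::<f32>() / n;
    ((2.0 * mx * my + c1) * (2.0 * cov + c2))
        / ((mx * mx + my * my + c1) * (vx + vy + c2))
}

fn main() {
    let img = [0.2_f32, 0.8, 0.5, 1.0];
    println!("{}", ssim(&img, &img)); // prints 1, exactly
}
```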
Does it differ less if you make the image larger? I wonder if there's maybe a bug in blurring at the edges.
No. With the scaled vector images I was working with before, more pixels meant greater error, but with flat colour, area and ratio have no apparent effect:
```
0.00540112 yellow1.png
0.00540112 yellow4096r.png
0.00540112 yellow4096x.png
0.00540112 yellow4096y.png
0.00540112 yellow8.png
0.00540112 yellow999.png
```
The consistency is reassuring. At least it's not a random bug :)
The documentation states "The value returned is 1/SSIM-1, where 0 means identical image". However, comparing an image with an identical copy gives a non-zero result, and the absolute value varies quite a bit depending on the image, by enough of a factor that a high-quality compressed version can be closer to the input's self-comparison value than that value is to zero.
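To make the documented relation concrete, inverting it shows what the raw scores imply (hypothetical helper names; the 0.00400376 value is the white16.png self-comparison from above):

```rust
// Sketch assuming only the documented relation: dssim = 1/SSIM - 1.
// `ssim_to_score` and `score_to_ssim` are hypothetical helper names,
// not part of the crate's API.
fn ssim_to_score(ssim: f64) -> f64 {
    1.0 / ssim - 1.0
}

fn score_to_ssim(score: f64) -> f64 {
    1.0 / (score + 1.0)
}

fn main() {
    // Identical images should give 0...
    println!("{}", ssim_to_score(1.0)); // prints 0
    // ...while white16.png's self-score implies a mean SSIM just below 1:
    println!("{:.4}", score_to_ssim(0.00400376)); // prints 0.9960
}
```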
As these are different metrics (and different compression methods), the examples below can't be compared very usefully. However, note that the two other metrics give a result of 0 (and butteraugli a black heatmap) when comparing against the input image.

As the difference images are not black-backed either (they have an edge-detected background of some kind?), this makes comparing the outputs of different compression methods less useful than it would be if the metric were either cleaner or perhaps better documented.
Thanks for all your work on this project; it has been very useful today for examining different compression options.