Closed mmmtok closed 8 years ago
Slower but more precise .. isn't .. (srcValue+4) << 8 .. both fast and precise ?
I don't think so. I think the requirements here are,
The formula converts 0 to 1024 and 255 to 66304.
In my little experience in this kind of cases, however I write the formula, compilers generate hyper maniac optimized codes, far beyond my expectations. My preference is writing simple and say "Hey Optimizer, it's your turn!"
@mmmtok .. I agree ;)
inline void ImageDatas::convertTo<unsigned short, unsigned char> (const unsigned short srcValue, unsigned char &dstValue)
{
- dstValue = (unsigned char)(srcValue >> 8);
+ dstValue = (unsigned char)(srcValue / 257 );
}
Two more things come to my mind:
const
on srcValue
is needless. Haven't looked at the base template signature, though.@mmmtok Okay, I've commited your fix to a new branch for everyone interested to try.
@heckflosse Ingo, there are other places converting from 8 to 16 bits and vice versa, especially in the Image*
classes. Shouldn't they be treated the same way?
@Floessie Yes, I would think so.
@mmmtok @iliasg I was curious and had a look at the assembly. GCC is really smart (even at -O0
) and converts
unsigned char res = val / 257;
to
unsigned char res = static_cast<unsigned long>(val * 0xFF01) >> 24;
So no division involved, which will also make @heckflosse happy. :)
I thank my good habit of not spending time to lose the battle of wits against the compiler:stuck_out_tongue_winking_eye:
I'm generally with you. On the other hand, @heckflosse shows with every commit that caring about those details still matters. :)
@Floessie I'm not sure. Have a look at the following example
uint16_t histo16[65536] (1);
uint8_t histo8[256](0);
for(int i=0;i<65536;++i)
histo8[i>>8] += histo16[i]
That results in histo8 having the value 256 for all entries, good.
uint16_t histo16[65536] (1);
uint8_t histo8[256](0);
for(int i=0;i<65536;++i)
histo8[i/257] += histo16[i]
That results in histo8 having 257 for the entries from 0 to 254 and 1 for the entry 255, bad.
That leads me also to think about whether what we're doing here is correct at all....
@heckflosse Hm, you mean the distribution?!
@heckflosse A (maybe better) solution could be replicating the 8b value in the lower byte like so: Edit: I'm a moron. That's equivalent to std::uint16_t res = static_cast<std::uint16_t>(in) << 8 | in
.* 257
. Sorry for the noise.
@Floessie yes, I mean the distribution. Still thinking about it...
@heckflosse I think the multiplication by 257 to extend to 16b is right. But the inverse is wrong: We must take the upper 8b as they are, otherwise we shift the conversion offset, as 0xFFFE / 257 is 0xFE like you've thankfully found out.
@Floessie While when starting with a jpeg and writing an 8-bit file probably there's not so much difference, when starting with a raw file and writing an 8-bit file the calculation / 257 should darken the result compared to current version and the last stop is greatly under-represented.
@iliasg @Desmis @Beep6581 What's your opinion about this issue?
@Floessie you've been faster :+1:
@Floessie Multiply by 257 and divide by 256 is also not an option. Assume you load an 8-bit tiff, use neutral profile and save to 8-bit tiff. It will get brighter then, which should not be the case.
After thinking about it, I came to the conclusion that using 257 is correct for both directions. Explanation: Assume you work in [0;1] instead of [0;65535]. Then multiplication/division using 255 would be correct to get into the [0;255] range. So using 65535/255 = 257 is correct to get from [0;65535] to [0;255].
The only problem I see is that changing this is not backwards compatible.
Edit. Compressing the raw histogram from 16-bit to 8-bit is a different case imho
@heckflosse
Multiply by 257 and divide by 256 is also not an option. Assume you load an 8-bit tiff, use neutral profile and save to 8-bit tiff. It will get brighter then, which should not be the case.
Well, I know your are usually right and I had a glass of wine, but: Are you sure about this one? Take an arbitrary hex value, e.g. 0xDE, multiply by 257 --> 0xDEDE, shift right by 8 --> 0xDE. I must be missing something. Point is: The multiplication with 257 is just for evenly filling the lower 8 bits. Division must be truncating.
So using 65535/255 = 257 is correct to get from [0;65535] to [0;255].
True for floats, but right for integer? I mean, 0xFFFE is much nearer to 0xFFFF than to 0xFE00.
Still thinking...
@Floessie
True for floats, but right for integer?
That's what I also still think about ...
@Floessie How about this solution?:
Convert from 8 to 16 bit using * 256 Convert from 16 to 8 bit using round instead of truncate/floor (though that's a bit slower). Show rounded values in Navigator when using the [0;255] range for showing the values.
@mmmtok
[Reproduction procedure]
- Prepare pure white jpeg file whose pixel values are all 255.
- Open the file.
- See Navigator values. R, G, B are 99.6% V of HSV is 99.6% L of Lab is 99.7%
I think all should be 100%.
Do the same but using 'no profile' as 'input profile' leads to the result you expected btw. Though I currently don't know why...
.. it looks like 99.6% comes from wrong top value for 8bit range ;) (255/256=0.996) But then, I don't understand why 65535/257 gives different result than 65535 << 8 .. both should give 255 i.e. 99.6% .. no ? @heckflosse Ingo, could it be due to inaccurate integer arithmetics in Lcms ? I have detected some missmatches when using Rt_sRGB (Lcms) vs "no icm .. assume sRGB" (no Lcms) as output profile ..
I'll have a look today afternoon :)
@heckflosse
Convert from 8 to 16 bit using * 256
Same old problem: Max value is 0xFF00 and not 0xFFFF.
Convert from 16 to 8 bit using round instead of truncate/floor (though that's a bit slower).
Testbed:
#include <iostream>
#include <vector>
#include <cmath>
int main(int argc, char** argv)
{
const std::vector<int> histo16(65536, 1);
std::vector<int> histo8(256);
for (unsigned int i = 0; i < 65536; ++i) {
histo8[std::lround(i / 257.f)] += histo16[i];
}
for (unsigned int i = 0; i < 256; ++i) {
std::cout << i << " = " << histo8[i] << std::endl;
}
return 0;
}
Gives 129 on 0, 257 from 1 to 254, and 129 on 255.
@iliasg
.. it looks like 99.6% comes from wrong top value for 8bit range ;) (255/256=0.996)
No, I think it's the result of 0xFF00 / 0xFFFF
(≈ 0,99611).
I don't understand why 65535/257 gives different result than 65535 << 8 .. both should give 255 i.e. 99.6% .. no ?
They both yield the same result: 65535 / 257 = 255, 65535 >> 8 = 255, but their distributions differ:
Value | / 257 | >> 8 |
---|---|---|
0xFFFF | 0xFF | 0xFF |
0xFFFE | 0xFE | 0xFF |
0xFFFD | 0xFE | 0xFF |
0xFFFC | 0xFE | 0xFF |
... | ... | ... |
0xFF02 | 0xFE | 0xFF |
0xFF01 | 0xFE | 0xFF |
0xFF00 | 0xFE | 0xFF |
0xFEFF | 0xFE | 0xFE |
0xFEFE | 0xFE | 0xFE |
0xFEFD | 0xFD | 0xFE |
0xFEFC | 0xFD | 0xFE |
... | ... | ... |
0xFE02 | 0xFD | 0xFE |
0xFE01 | 0xFD | 0xFE |
0xFE00 | 0xFD | 0xFE |
0xFDFF | 0xFD | 0xFD |
0xFDFE | 0xFD | 0xFD |
0xFDFD | 0xFD | 0xFD |
0xFDFC | 0xFC | 0xFD |
0xFDFB | 0xFC | 0xFD |
... | ... | ... |
I'm still thinking, that 8b to 16b should be * 257
and 16b to 8b / 256
, but I'm open for other arguments. :)
HTH Flössie
@heckflosse Your rounding version, i.e. (i + 128) / 257
has a better error distribution than i / 256
. Maybe that's the way to go. The roundtrip works as expected.
round((i + 128) / 256) - 1 <= scratch that
I had a background thought when I proposed my changes. I'll be happy if you guys take time to read it.
Pixels in JPEG may have values 0xff. The value may mean real value 0xff or the pixel may be clipped. Because we cannot distinguish real values from clipped, we have to choose one of the options below. Option 1: always treat 0xff as a real value. Option 2: always regard 0xff as a clipped pixel.
Option 1: This is the current imprementation. 0xff is mapped to an actual value 0xff00. As the result, the navigator value for 0xff is 99.6%. It seems strange for me, but this is correct, because the value 0xff does not mean clipped and the percentage can be below 100%.
Option 1': There is another, may be better, way to map the value 0xff. Let's map 0xff to the mid point between 0xff00 and 0xffff. In this case, the percentage goes to 0xff80 / 0xffff * 100 = 99.8%. Still strange but it's also correct.
Option 2: I wanted to see 100% for 0xff, and I proposed the change based on the philosophy that 0xff always mean clipped. Only when the value is 0xff, it is mapped to 0xffff. In reverse conversion, only when the value is 0xffff, it is mapped to 0xff. If the 16bit value is 0xfffe, it does not mean clipped and it should not maped to 0xff, but 0xfe.
I do not persist in my initial proposal, and I respect what many people can accept.
@Floessie thanks, now I see !
I any case this pure truncation is bad and we have to adopt the (best) rounding sheme .. (i+128 ?? ..)
I think that we can tolerate some error at the highlights but errors at the very low values are critical .. values erroneously clipped to 0 are irrecoverable.
So we have to adopt the method which gives less quantization error at the darks ..
@heckflosse Ingo, what do you think about (i + 129) ;)
BTW this i + 128 overflows the integer16 limit .. now I understand why Adobe's 16bit tiffs are in fact 15bit ;)
usefull test file http://cl.ly/3F2v0T0j2x3P/RGB_255to0_52steps_Bayer.dng.zip
@iliasg
BTW this i + 128 overflows the integer16 limit
No need to be afraid: Integer promotion will handle that properly. :)
For the case we decide to round I made a test. I converted a 36 mp file full size to jpeg.
using current method : 411 ms using std::lround : 2709 ms
That's not acceptable. Fortunately there is a solution (at least for x86 arch, maybe some others too if we care about endianess)
Using
int float2int(float d)
{
d += 12582912.f;
return reinterpret_cast<int&>(d);
}
the processing time is 456 ms which is only a 10% slowdown compared to current version and is perfectly acceptable.
@heckflosse No, using floats is not an option, but (i + 128) / 257
in the integer domain yields the same result.
@Floessie Ok, maybe I was a bit off topic ;-) I just prepare a speedup for this
Edit: your std::lround(i / 257.f)
from above confused me a bit (you went to float there too). But now it's clear.
@heckflosse Ingo, sorry for the confusion. That was my first take on your proposal:
Convert from 16 to 8 bit using round instead of truncate/floor (though that's a bit slower).
Fine, that this got sorted out. There are other things to consider, alas: I think we've all read @mmmtok's reasoning for why he proposed the change. He has some valid points on how to signal or not signal clipping with 0xFF or 0xFFFF, respectively. I realize both arguments make sense, so maybe we should first reach an agreement on how to deal with this. I've no opinion, and it's absolutely okay with me, if we don't follow up that matter and keep things like they were for decades.
Best Flössie
I spoke with houz (from dt) and @Beep6581 asked in gimp channel how they convert. Both work in [0;1] range and convert using 255.
Ingo, you mean int8 = round(0_1_value * 255) .. no ?
But this issue is about navigator's pct values which should be (to_be_exported_int8/255)*100
I read all this issue!
it is a very old "bug"...about 2006... But a small one, only for JPG :)
I think, I have not tested, that the solution propose (after discussion between Ingo, Floessie and Ilias) seems good: round((i + 128) / 256) - 1, or something near ! Jacques
MagickCore (ImageMagick) does round. Interesting indeed that there is no common way of "correctly" converting 16 bit to 8 bit image data.
OK, we are fine with precision if we follow the BABL way ...
On the other hand, the rounding method ((i + 128) / 257
) has a better error distribution.
@heckflosse Ingo, what shall I do? And what about those places in code which I mentioned above and are not directly connected to write out 8 bit data? Please decide.
@Floessie In no case the raw histogram should be changed.
All other parts should have the same calculation, whatever that will be.
I didn't compare the 'error' distribution between ((i+128)/257)
and the BABL way. Are they much different?
@heckflosse The BABL way is i / 257
, so 257 values from 0 to 254 and 1 for entry 255, as you've found out when starting the discussion about the right 16-to-8 conversion. (i + 128) / 257
is the rounding version with 129 on 0, 257 on 1 up to 254, and 129 on 255.
All other parts should have the same calculation, whatever that will be.
So, that will include the lines in iplab2rgb.cc
. Okay.
@Floessie I just had a look at the BABL way. For me it looks similar to (i+128)/257. Input 65535 - 127 = 65408 for example gives 255 as result. Or do I miss something?
@heckflosse GitHub should really include the :facepalm: emoji just for me ...
@iliasg
Ingo, you mean int8 = round(0_1_value * 255) .. no ?
But this issue is about navigator's pct values which should be (to_be_exported_int8/255)*100
I wanted to point out that other software works in different ranges (float [0;1] for example) and they convert from/to [0;255] using 255 as divisor/factor. If you translate this to the conversion from [0;65535] to [0;255] by making the [0;1] conversion in between (remember we're still in float space at that step) you'll get (x / 65535) * 255 = x / 257. To 'correct' the rounding errors introduced by the truncating integer conversion 128 is added before division by 257.
I agree that the original Issue was about the naviagtor's values, but the proposed fix also affected the output, which led to this large and interesting discussion I really appreciate :)
Ingo
@Floessie
GitHub should really include the :facepalm: emoji just for me ...
Not just for you. It would be useful for me too, especially when mixing integer and float arithmetics...
[Reproduction procedure]
I think all should be 100%.
[Versions] I see this with RT 4.2.1004 OSX 10.11.6 but I believe it is platform independent.
[Candidate fix] While 8bit image is loaded, pixel values, 0-255, are not fully mapped into the full range of internal values, 0-65535. The changes below fixes this problem in my own build. Theoretically, this change may degrade the performance upon the conversions, but I don't feel the difference while doing actual operations with the fixed build.