Beep6581 / RawTherapee

A powerful cross-platform raw photo processing program
https://rawtherapee.com
GNU General Public License v3.0

Xtrans support #2398

Closed Beep6581 closed 9 years ago

Beep6581 commented 9 years ago

Originally reported on Google Code with ID 2415

Today I started work on Xtrans support. This first patch only eliminates an endless
loop when trying to open an Xtrans file and a crash when trying to fast demosaic. I
set demosaic to 'nodemosaic' for Xtrans files until I get xtrans_interpolate from
dcraw working in RT.

Ingo

Reported by heckflosse@i-weyrich.de on 2014-06-08 16:31:46


Beep6581 commented 9 years ago
Ilias, maybe you can try to compile it yourself using the libs from the
links I gave you, and tdm-gcc 4.6.1 (it failed with 4.7.1).
On 25 June 2014 at 16:23, <rawtherapee@googlecode.com> wrote:

Reported by sguyader on 2014-06-25 22:17:43

Beep6581 commented 9 years ago
Late night report from x-trans front:

I just made a nice speedup for xtrans_interpolate (at least on my system) :-)

Here are the (rounded) timings for a 16 Mp x-trans file on my 8-core AMD:

xtrans_13.patch: 2000 ms
xtrans_14.patch: 1800 ms
xtrans_15.patch: 1400 ms (will post the patch tomorrow)

I also reduced the memory consumption compared to xtrans_14.patch (exact numbers follow
with next patch)

btw: I tried the method DarkTable uses instead of the cielab conversion inside the
x-trans interpolation. It's faster, but the result didn't convince me (color artifacts),
so I stayed with the cielab version, because I think we should prefer quality, not
maximum speed.

Ingo

Reported by heckflosse@i-weyrich.de on 2014-06-25 23:53:38

Beep6581 commented 9 years ago
Great Ingo, thanks for your dedication. I agree that quality is a priority over
speed, and it already is at least as fast as a simple dcraw conversion. Can't
wait to try your latest patch.

Reported by sguyader on 2014-06-26 00:00:22

Beep6581 commented 9 years ago
Sebastien, optimization process still goes on :-)

Reported by heckflosse@i-weyrich.de on 2014-06-26 00:06:18

Beep6581 commented 9 years ago
@ sguyader #111 ..

http://rawtherapee.com/forum/viewtopic.php?f=10&t=5217&p=37847#p37847

Reported by iliasgiarimis on 2014-06-26 07:21:22

Beep6581 commented 9 years ago
In #112 I wrote that xtrans_15.patch needs 1400 ms, but after further optimization it's
down to 1250 ms on my system :-)

Memory consumption during x-trans interpolation is reduced by

c*(176*176*18*sizeof(float) - 128*sizeof(float))

where c is the number of cores. This means that it now needs 2230144 bytes less per core
than xtrans_14.patch.

Ingo

Reported by heckflosse@i-weyrich.de on 2014-06-26 12:31:35
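
As a quick arithmetic check of the formula in the comment above, here is a minimal sketch that evaluates it for a given core count. It assumes a 4-byte float (usual on desktop platforms, but an assumption here), and the function name is made up for illustration.

```python
FLOAT_SIZE = 4  # assumed sizeof(float) in bytes


def saved_bytes(cores, float_size=FLOAT_SIZE):
    """Evaluate c*(176*176*18*sizeof(float) - 128*sizeof(float))."""
    per_core = 176 * 176 * 18 * float_size - 128 * float_size
    return cores * per_core


per_core = saved_bytes(1)
print(per_core)                    # bytes saved per core
print(round(per_core / 2**20, 2))  # the same saving in MiB
print(saved_bytes(8))              # total saving on an 8-core machine
```

Under these assumptions the formula as written evaluates to 2229760 bytes per core. The quoted 2230144 matches subtracting 128 bytes rather than 128 floats, so one of the two probably contains a small slip. Either way the saving is roughly 2.1 MiB per core.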


Beep6581 commented 9 years ago
#116 Ingo I see the speed increase, though on my Core i5 laptop it takes quite a lot
more time than on your 8-core machine!

Besides that, when zooming at 200% on the borders I see interpolation artifacts of
about 2 rows/columns, you're probably aware of that but just in case you're not I prefer
to inform you. I don't see these artifacts in Darktable (which by the way is, I think,
slightly slower than RT at processing x-trans images).

Reported by sguyader on 2014-06-26 14:11:34

Beep6581 commented 9 years ago
Sebastien, there is a 6 pixel border which is interpolated by a simple interpolation.
Then RT cuts off 4 pixels from each side, so 2 remain. Perhaps Darktable cuts
off 6 pixels, so you don't see them. We could do that too. Can you compare the width and
height of the picture in Darktable vs. RT?

Reported by heckflosse@i-weyrich.de on 2014-06-26 16:25:57

Beep6581 commented 9 years ago
For an unknown reason, today I can't export a file from Darktable, it crashes. When
the image is open in DarkTable "image information" shows a size of 4936x3296, while
Rawtherapee shows a size of 4928x3288. So, the DT image is larger, and it is visible
at the edges of the images, as if RT was cropping, or DT was adding more pixels. I
put a screenshot of the lower left corner of a test image, viewed at 200% (https://drive.google.com/file/d/0B_AvPFlUj8t5VVJwUGZqMEZQVk0/edit?usp=sharing)

Reported by sguyader on 2014-06-26 17:13:38

Beep6581 commented 9 years ago
I'll have a look where this difference in size comes from.

Reported by heckflosse@i-weyrich.de on 2014-06-26 17:45:36

Beep6581 commented 9 years ago
The difference is caused by the standard 4 pixel crop of RT. I will try to make a better
interpolation at the borders. Here's the next patch, which reduces the processing time
to 1100 ms on my system.

Reported by heckflosse@i-weyrich.de on 2014-06-28 17:23:27
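
The size difference discussed above can be checked with a few lines. This is only a sketch of the arithmetic, assuming the crop is applied symmetrically on all four sides; the function name is hypothetical and not taken from RT.

```python
def cropped_size(width, height, border):
    """Size left after cutting `border` pixels from each of the four sides."""
    return width - 2 * border, height - 2 * border


DT_SIZE = (4936, 3296)  # size Darktable reports in this thread
RT_CROP = 4             # RT's standard crop per side
SIMPLE_BORDER = 6       # border filled by the simple interpolation

rt_size = cropped_size(DT_SIZE[0], DT_SIZE[1], RT_CROP)
print(rt_size)                  # (4928, 3288), the size RT shows
print(SIMPLE_BORDER - RT_CROP)  # 2 simply interpolated rows/columns remain visible
```

A 6 pixel crop would hide the simply interpolated border completely, at the cost of a slightly smaller image.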


Beep6581 commented 9 years ago
Frankly Ingo, I don't think 4 pixels at the edges make a huge difference.
It's more a question of perfectionism.

Reported by sguyader on 2014-06-28 18:19:43

Beep6581 commented 9 years ago
Sebastien, it's the same perfectionism as with getting maximum speed ;-)

Reported by heckflosse@i-weyrich.de on 2014-06-28 18:40:32

Beep6581 commented 9 years ago
I thank you for that!

Reported by sguyader on 2014-06-28 18:59:16

Beep6581 commented 9 years ago
Opened an RT/x-trans folder in my drive :)

https://drive.google.com/?tab=wo&authuser=0#folders/0B0NqktTgc54sVXZLOHNfcnhoZVU

There is a win32 build of RT4.1.22+xtrans_16.patch together with the dcp profile and
camconst.json (I forgot to add them in the build ..)

Reported by iliasgiarimis on 2014-06-28 21:15:06

Beep6581 commented 9 years ago
Cool Ilias. What is the difference  between 4.1.22 and 4.1.21?

Reported by sguyader on 2014-06-28 21:50:31

Beep6581 commented 9 years ago
Ilias, thank you for that build! It works fine here on my Win7/64 machine, though it's
about a factor of 1.5 slower than my native Win64 release (very interesting). I want to
mention that the current patch (and for this reason also the build) has a color cast
issue with some DNG files. Please ignore this and don't write any bug reports about
it. I'm aware of that issue and will solve it for the final patch.

Ingo

Reported by heckflosse@i-weyrich.de on 2014-06-28 21:52:28

Beep6581 commented 9 years ago
Ingo, the x-trans interpolation takes 7 secs (C2D 3.0 GHz), much better than before!

As for the quality, I think it is acceptable, although there is much room for improvement.

(I have not compared with dcraw yet.)

Indeed, as sguyader says, it looks as if it has been sharpened. A lot of aliasing.

The "none" interpolation is missing.

Is it possible to have a non-interpolated output of 3x3 tiles in our case?
Just use the average of the three channels in these 3x3 tiles, like the 2x2 fast "interpolation"
of dcraw.
This would give a low-resolution but robust color rendering and could be used as a reference.
Maybe it could be mixed with the current result ;) in an additional pass.

Reported by iliasgiarimis on 2014-06-28 21:54:04

Beep6581 commented 9 years ago
Ilias, average of 3x3 you get by selecting 'fast' as demosaic method with x-trans files.
Quality with non-fast (Select amaze to get the 3-pass dcraw method) using neutral profile
is equal to dcraw in my tests.

Ingo

Reported by heckflosse@i-weyrich.de on 2014-06-28 22:01:06
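
To make the 3x3 averaging idea concrete, here is a minimal sketch in plain Python. It is not RT's 'fast' code. The 6x6 layout below is one common way to write the X-Trans pattern (the real offset depends on the camera), the tiles are assumed to be aligned with it, and all names are invented for this example.

```python
# Assumed 6x6 X-Trans layout; every aligned 3x3 tile holds 5 G, 2 R and 2 B sites.
XTRANS = [
    "GBGGRG",
    "RGRBGB",
    "GBGGRG",
    "GRGGBG",
    "BGBRGR",
    "GRGGBG",
]


def fast_tile_average(raw):
    """Average each colour inside every 3x3 tile; returns one RGB triple per tile."""
    height, width = len(raw), len(raw[0])
    out = []
    for ty in range(0, height - height % 3, 3):
        row = []
        for tx in range(0, width - width % 3, 3):
            sums = {"R": 0.0, "G": 0.0, "B": 0.0}
            counts = {"R": 0, "G": 0, "B": 0}
            for y in range(ty, ty + 3):
                for x in range(tx, tx + 3):
                    colour = XTRANS[y % 6][x % 6]
                    sums[colour] += raw[y][x]
                    counts[colour] += 1
            row.append(tuple(sums[c] / counts[c] for c in "RGB"))
        out.append(row)
    return out


# Synthetic 6x6 mosaic: red sites hold 10, green sites 20, blue sites 30.
value = {"R": 10.0, "G": 20.0, "B": 30.0}
raw = [[value[XTRANS[y][x]] for x in range(6)] for y in range(6)]
tiles = fast_tile_average(raw)
print(len(tiles), len(tiles[0]))  # 2 2
print(tiles[0][0])                # (10.0, 20.0, 30.0)
```

The output has one third of the resolution in each direction, which is why this kind of averaging only suits previews and references.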

Beep6581 commented 9 years ago
Ilias, I have to admit, that I'm a little spoiled by my 4 GHz 8 core bulldozer, but
even when I set it to use only one core for RT (which then runs with 4.2 Ghz) it needs
the same time (7 seconds) as your dual core using both cores. Maybe you should try
a native build for comparison.

Ingo

Reported by heckflosse@i-weyrich.de on 2014-06-28 22:19:54

Beep6581 commented 9 years ago
For info, my 64-bit build takes about 4 seconds for 3-pass x-trans
interpolation on my Core i5 2.50 GHz laptop

Reported by sguyader on 2014-06-28 23:27:41

Beep6581 commented 9 years ago
I forgot to say my latest build is running on windows 7 64 bits, and was
compiled with patch 16. Regarding quality I find it on par with dcraw, both
are virtually indistinguishable. I don't see the "sharpened" look anymore
since earlier patches.

Reported by sguyader on 2014-06-28 23:34:32

Beep6581 commented 9 years ago
Sebastien, just for my information: how many cores does your Core i5 have?

Reported by heckflosse@i-weyrich.de on 2014-06-29 00:14:03

Beep6581 commented 9 years ago
It has 2 cores handling 4 threads in parallel. It's clocked at 2.5 Ghz but
has a turbo boost allowing to clock the processor at up to 3 GHz when the 2
cores are active or 3.2 when only 1 core is active

Reported by sguyader on 2014-06-29 01:42:11

Beep6581 commented 9 years ago
#129

Hmm, fast demosaic is terrible, proving that my idea of 3x3 averaging does not work
as I hoped (terrible idea).

Now that I see it, it is not the same as 2x2 for Bayer CFA. To have an equivalent (but lower
resolution) effect we have to sample in the middle of the 2x2 green squares: use
a 4x4 tile at a 3-pixel pitch and average the greens, weighting them according to the
distance from the center, i.e. 1/3 for the corner greens (linear) or maybe 1/9 (inverse
square).

Reported by iliasgiarimis on 2014-06-29 08:20:01
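
The 1/3 and 1/9 weights can be checked from the geometry. This sketch assumes a 4x4 tile centred on a 2x2 green square, with further greens in the four tile corners; the weighting function is a hypothetical illustration, not code from any patch.

```python
import math

# Offsets (dy, dx) of the green sites from the tile centre, in pixels.
inner = [(dy, dx) for dy in (-0.5, 0.5) for dx in (-0.5, 0.5)]
corners = [(dy, dx) for dy in (-1.5, 1.5) for dx in (-1.5, 1.5)]

d_inner = math.hypot(*inner[0])
d_corner = math.hypot(*corners[0])
ratio = d_corner / d_inner              # corner greens are about 3x farther away

linear_weight = 1 / ratio               # about 1/3
inverse_square_weight = 1 / ratio ** 2  # about 1/9


def weighted_green(inner_vals, corner_vals, corner_weight):
    """Weighted mean of the 8 greens; the inner greens have weight 1."""
    total = sum(inner_vals) + corner_weight * sum(corner_vals)
    return total / (len(inner_vals) + corner_weight * len(corner_vals))


print(round(ratio, 6), round(linear_weight, 4), round(inverse_square_weight, 4))
print(round(weighted_green([100] * 4, [40] * 4, linear_weight), 3))
```

With the linear weight the inner greens dominate the result, and with the inverse-square weight the corner greens contribute even less.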

Beep6581 commented 9 years ago
Here's a patch which applies to head, so please clone latest revision before applying
it. I added the following things for xtrans files:

Demosaic methods 'none' and 'mono'.

Support for 'Raw White Point Correction' and 'Raw Highlight Preservation' (the last patches
crashed when 'Raw White Point Correction' was activated).

Ingo

Reported by heckflosse@i-weyrich.de on 2014-06-29 19:18:35


Beep6581 commented 9 years ago
I forgot to mention that I included a speedup for 'Raw White Point Correction' when
'Raw White Point Correction' > 1.

Ingo

Reported by heckflosse@i-weyrich.de on 2014-06-29 19:32:28

Beep6581 commented 9 years ago
Here's a short overview for the "initial Fujifilm X-Trans support" with patch 17:

1.) An optimized (OpenMP and other optimizations) version of xtrans_interpolate with
3 passes is used for zoom levels >= 100% and for full processing when 'Amaze' is selected
2.) A fast interpolation is used for zoom levels < 100% or when 'fast' is selected
3.) Support for demosaic 'none' and 'mono'
4.) When any other demosaic method is selected a 1-pass xtrans_interpolate is used.
This can be useful for 'fast export'
5.) Dark Frames and Flat Fields are supported (though Flat Field Method 'Vertical +
Horizontal' shows a color cast), but I guess (didn't test yet) the automatic selection
of Flat Field files does not work, because
6.) Lens information is not correct in the current patch.
7.) Raw White & Black Points are supported
8.) False color suppression is supported
9.) Raw CA-correction is NOT supported and disabled in code, but not in gui
10.) 'Line noise filter' and 'Green Equilibration' will be disabled in code (but not
in gui) with the next patch.
11.) Exposure Auto Levels is supported
12.) Auto WB and Spot WB are supported
13.) Raw histogram is supported
14.) Rendering of two pixels border could be better
15.) Black levels from exif are used
16.) Thumbs from raw are supported
17.) all the other things I forgot to mention above are supported or not ;-)

Ingo

Reported by heckflosse@i-weyrich.de on 2014-06-29 22:07:55
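
Items 1 to 4 of the overview above amount to a small dispatch rule. The sketch below is one possible reading of them, not the actual RT code: the function name and return strings are invented, and the precedence of 'none'/'mono' over the zoom check is an assumption.

```python
def choose_xtrans_path(method, zoom_percent=None):
    """Pick an interpolation path; zoom_percent=None means full processing."""
    method = method.lower()
    if method in ("none", "mono"):      # item 3 (assumed to win over the zoom check)
        return method
    if method == "fast" or (zoom_percent is not None and zoom_percent < 100):
        return "fast"                   # item 2
    if method == "amaze":
        return "3-pass xtrans_interpolate"  # item 1
    return "1-pass xtrans_interpolate"      # item 4: any other method


print(choose_xtrans_path("Amaze"))       # 3-pass xtrans_interpolate
print(choose_xtrans_path("Amaze", 50))   # fast
print(choose_xtrans_path("lmmse", 100))  # 1-pass xtrans_interpolate
```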

Beep6581 commented 9 years ago
Ingo, such impressive work! I wish I had an x-trans camera :)

Reported by michaelezra000 on 2014-06-29 23:49:41

Beep6581 commented 9 years ago
Michael, thank you. If you have some time to test this work, you can download some sample
images here: http://www.photographyblog.com/reviews/fujifilm_x_t1_review/sample_images/
Of course you can also use sample images of other cameras with x-trans sensor.

Ingo

Reported by heckflosse@i-weyrich.de on 2014-06-30 00:06:37

Beep6581 commented 9 years ago
Excellent work Ingo, and thanks also to Ilias. It took you less than 1
month to adapt the code, and release 17 patches with optimisation each
time. Right now you've given to RT users the best open source raw processor
for xtrans images!
One question: do you think CA correction for xtrans files will be fixed
someday?

Reported by sguyader on 2014-06-30 01:14:26

Beep6581 commented 9 years ago
Here is RT 4.1.30+x-trans17.patch for win32 users
https://drive.google.com/?tab=wo&authuser=0#folders/0B0NqktTgc54sUHZCRzQ0VGttRE0

Should we make our builds more accessible to the user base? I think testing by many
users could be useful.

It now takes 6.2 sec on my C2D 3.0Ghz, 4GB DDR2 win vista32.

Reported by iliasgiarimis on 2014-06-30 09:13:03

Beep6581 commented 9 years ago
Awesome work Ingo! I'll soon give it a spin.

ilias: yes, that's why our website supports sorting builds by "stable" and "nightly",
though the interface for creating new accounts does not work for me or for whoever
tried it last time (Ingo or Torger).

Reported by entertheyoni on 2014-06-30 09:37:43

Beep6581 commented 9 years ago
Impressive work Ingo!

Now that the code is almost completely done, could we discuss the GUI?

I don't think we should mix the Bayer and X-Trans together. There could be 2 possible
solutions:
1. Create a separate Method for each sensor type (btw, is the "mono" method dedicated
to monocolor sensor, or does it work with classical Bayer sensor, like an automatic
conversion to B&W?)

2. Given that you have more channels in X-Trans sensors or fewer channels in Foveon
sensors, I think it would be better to have a separate "master tool", which would contain
duplicated and adapted subtools like white point, etc. This could be easily and quickly
done in the PP3, and would be more future proof. You would not have to handle fallback
solutions either.

The rawimage class could also better represent the sensor type, that's another story
that could be postponed, but that should be addressed if we want to support Foveon,
Fuji EXR, or other kind of sensor (lytro? :) )

I'll contact you on IRC tonight, if you want to discuss about that.

Reported by natureh.510 on 2014-06-30 11:07:40

Beep6581 commented 9 years ago
Ingo, here are my timings:

get_colorsCoeff took 0 ms
get_colorsCoeff took 17 ms
fast xtrans_interpolate took 133 ms
get_colorsCoeff took 0 ms
get_colorsCoeff took 19 ms
xtransborder_interpolate took 8 ms
xtrans_interpolate took 1439 ms

Results look good on visual inspection, but I cannot compare to latest camera raw,
as Adobe pushed Photoshop CC 2014 update which disabled latest camera raw on my machine,
so I am back to CS6. Once I get this resolved, would be interesting to compare, I will
update here.

Reported by michaelezra000 on 2014-06-30 11:20:23

Beep6581 commented 9 years ago
Hombre, I'll be there :-)

Reported by heckflosse@i-weyrich.de on 2014-06-30 11:23:57

Beep6581 commented 9 years ago
Michael, which patch did you use? Your timings are worse than mine, which is almost
impossible.

Reported by heckflosse@i-weyrich.de on 2014-06-30 11:26:21

Beep6581 commented 9 years ago
Here's a new patch. It is almost the same as the last one, but it allows comparing the original
version (with cielab conversion) to the version darktable uses (with yuv conversion).

Here's the current demosaic map:

Amaze:  3-pass using cielab    1100 ms
Igv:    3-pass using yuv        950 ms
lmmse:  1-pass using cielab     505 ms
eahd:   1-pass using yuv        435 ms

yuv is about 14% faster than cielab, but I prefer the result from the cielab version
(fewer color artifacts). On the other hand, the 1-pass yuv could for example be a
good choice for 'fast export'.

Ingo

Reported by heckflosse@i-weyrich.de on 2014-06-30 11:35:18
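
The "about 14%" figure follows directly from the timings in the demosaic map above. A small sketch of the arithmetic, using only the numbers quoted there (the dictionary layout and function name are just for illustration):

```python
timings_ms = {
    ("3-pass", "cielab"): 1100,
    ("3-pass", "yuv"): 950,
    ("1-pass", "cielab"): 505,
    ("1-pass", "yuv"): 435,
}


def yuv_saving_percent(passes):
    """Time saved by yuv relative to cielab for the same number of passes."""
    cielab = timings_ms[(passes, "cielab")]
    yuv = timings_ms[(passes, "yuv")]
    return 100.0 * (cielab - yuv) / cielab


for passes in ("3-pass", "1-pass"):
    print(passes, round(yuv_saving_percent(passes), 1))
# 3-pass 13.6
# 1-pass 13.9
```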


Beep6581 commented 9 years ago
Re #141: Sebastien, I think I can make an x-trans CA correction, but not for the next release,
because in about 2 weeks I'll go to Spain for three weeks.

Reported by heckflosse@i-weyrich.de on 2014-06-30 11:57:12

Beep6581 commented 9 years ago
This screenshot http://www.i-weyrich.de/rt/xtrans/xtrans_cielab_yuv.png shows the difference
between 3-pass cielab (left) and 3-pass yuv (right)

Reported by heckflosse@i-weyrich.de on 2014-06-30 14:58:37

Beep6581 commented 9 years ago
#148 Ingo I tried your latest patch. Indeed for 3-pass demosaicing, the cielab version
introduces less color artifacts than the yuv-based one. So for maximum quality "Amaze"
is preferable.
However I don't see much difference between cielab and yuv-based methods for 1-pass.

There's a strange behaviour with the display zoom (or maybe I don't understand how it
works):
- when I first set the zoom to >=100%, the 4 demosaic maps (Amaze, Igv, lmmse and eahd)
seem almost equally good (Amaze still has the fewest color artifacts, but mostly at >=200%).
- when I decrease the zoom level to, say, 50%, I still don't see much difference between
the 4 maps and the displayed image is good (it looks as if reducing the zoom actually
downscales the 100%-zoom demosaiced image); but then if I change the demosaic method
at 50%, whichever I select (including Amaze), it always uses the fast algorithm.

This behaviour was already present in the earlier version (4.1.21 with patch 16), I just
had not seen it. I would have expected to be able to see the result of Amaze demosaicing
directly when I select Amaze while working at <100% zoom. Right now, to see this effect
at 50%, I have to first zoom to 100% (which switches to Amaze automatically) and then
switch back to 50%. But I can do that only if I don't plan to use other processing
filters such as the hot/dead pixel filter, because if I switch it on, it switches again
to the fast demosaicing method.

But I guess it's just a matter of getting used to do everything before, and then visualizing
the best demosaicing as the last step.

Reported by sguyader on 2014-06-30 15:03:36

Beep6581 commented 9 years ago
Sebastien, let me try to give a simple description:

1.) Independent of the demosaic method, RT always demosaics the whole image and downscales
it for zoom levels < 100%
2.) When you open an image 'fast' is used.
3.) When you zoom to >= 100% the selected method is used.
4.) When you zoom back to < 100% the selected method is used.
5.) When you are at zoom < 100% and change the method 'fast' is used again.

It's the same behaviour as for bayer sensors in RT.

About the 1-pass interpolation: I made some tries with high ISO pictures and for those
the quality from 1-pass interpolation is almost as good (perhaps even a tiny bit better)
as the quality from 3-pass interpolation. But I tested only with real world high ISO
shots. Do you have a high ISO equivalent to the XT1-iso200.RAF?

Ingo

Reported by heckflosse@i-weyrich.de on 2014-06-30 15:55:54
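
Rules 2 to 5 above describe a small state machine. The following sketch models it under stated assumptions: the class and method names are hypothetical, a method change at >= 100% is assumed to take effect directly, and side effects of other tools (such as the hot/dead pixel filter mentioned earlier) are not modelled.

```python
class PreviewDemosaic:
    """Tracks which demosaic method the preview currently shows."""

    def __init__(self, selected, zoom_percent=50):
        self.selected = selected
        self.zoom = zoom_percent
        self.active = "fast"             # rule 2: a freshly opened image uses 'fast'

    def set_zoom(self, zoom_percent):
        if zoom_percent >= 100:
            self.active = self.selected  # rule 3
        # rule 4: zooming back below 100% keeps whatever is currently active
        self.zoom = zoom_percent
        return self.active

    def set_method(self, method):
        self.selected = method
        # rule 5 below 100%; at >= 100% the new method is assumed to apply directly
        self.active = "fast" if self.zoom < 100 else method
        return self.active


preview = PreviewDemosaic("amaze")
print(preview.active)             # fast
print(preview.set_zoom(100))      # amaze
print(preview.set_zoom(50))       # amaze
print(preview.set_method("igv"))  # fast
print(preview.set_zoom(200))      # igv
```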

Beep6581 commented 9 years ago
Ok Ingo I understand.
Here's a link to the test image at iso 6400: https://drive.google.com/file/d/0B_AvPFlUj8t5VS1ucGRSOUpfMFU/edit?usp=sharing

Reported by sguyader on 2014-06-30 16:00:54

Beep6581 commented 9 years ago
Sebastien, thank you for the image. It indeed shows that 3-pass cielab is best even
for high iso shots.

Reported by heckflosse@i-weyrich.de on 2014-06-30 16:45:03

Beep6581 commented 9 years ago
Ingo

I just tested your work! In french "c'est formidable" !

Jacques

Reported by jdesmis on 2014-06-30 18:44:43

Beep6581 commented 9 years ago
I'll take a break from this issue now, as Hombre will make some changes to fit x-trans
into the GUI the right way. Meanwhile I'll have a look at other issues and will come back
to this one when the GUI stuff is completed (I already have some new, but small, xtrans
speedups on hold ;-)

Ingo

Reported by heckflosse@i-weyrich.de on 2014-07-01 00:04:31

Beep6581 commented 9 years ago
Ingo, will the GUI stuff be filed under this issue number? Otherwise, where can I follow
the progress on this?

Reported by sguyader on 2014-07-01 13:07:32

Beep6581 commented 9 years ago
Sebastien, yes, I think Hombre will use this Issue.

Reported by heckflosse@i-weyrich.de on 2014-07-01 13:28:10

Beep6581 commented 9 years ago
Short info from x-trans optimizing: the next patch (after Hombre's GUI patch) needs 1000
ms on my system (was 1100 ms with the last patch)

Reported by heckflosse@i-weyrich.de on 2014-07-01 20:31:26

Beep6581 commented 9 years ago
I just detected an error (crash) when using 'Raw White Point Correction' with values
> 1. Please ignore the error. I'll fix it with next patch.

Reported by heckflosse@i-weyrich.de on 2014-07-02 20:19:22