darktable-org / darktable

darktable is an open source photography workflow application and raw developer
https://www.darktable.org
GNU General Public License v3.0
9.91k stars 1.15k forks source link

drastic slow down of zoomable light table after GMIC integration in master #4114

Closed upegelow closed 4 years ago

upegelow commented 4 years ago

Describe the bug The zoomable light table (one of the view options in lighttable mode) has experienced a drastic slow down recently after integration of GMIC.

Panning and zooming to navigate within the lighttable used to be very fast, even with larger collections. Now it can be so slow that darktable practically freezes. Work is not possible any longer.

I bisected the issue and found to my surprise:

eb8a2bda321fa82bd92fd211ef46e1f88a10a5bc is the first bad commit
commit eb8a2bda321fa82bd92fd211ef46e1f88a10a5bc
Author: phweyland <philippe.weyland@libertysurf.fr>
Date:   Thu Dec 12 10:40:59 2019 +0100

    Squashed commit of the following:

      add compressed lut gmz files (GMIC)
      compressed lut is stored in params. sharing xmp file works without
    source lut now.
      use (default_user_folder)\gmic to store cache files (to speed
    up decompression)

:100644 100644 eef511391e60801f797eba2db2ea788586219dd9 5ada5322cf945c29d7e0a068186e12bc7d25b9c7 M      DefineOptions.cmake
:040000 040000 e62f680e64b81fab7fb3ad14173f5cf483acf5b2 6699b6b294d678233cda7f39334fd7ba0bdaabae M      cmake
:040000 040000 bd22a67945201970a7de18985994ce735474eb39 6a7a70f433fb2ebd9752ea2a3ddc28c7fd58255d M      data
:040000 040000 6c1d590bba3e494ef74e4a12362b4e18e34723ef 931e9b3c65f5d7c0b65aa675784ef947c61fbcbb M      packaging
:040000 040000 ef491e5b7e1ca1ff4eeac99de0a13f92228023bf 08eb417bfdcc03688b76973da5781e6882c91e3d M      src

To Reproduce

  1. Open a large collection within darktable (in my case about 1500 images, thumbs are already in cache)
  2. In lighttable view switch to "zoomable light table"
  3. Try to pan or zoom within the view
  4. Any action takes many seconds to complete (if darktable behaves normally, close it and start it again - the problem sometimes does not occur when visiting the collection for the first time).

Already tested: no influence from my existing library.db. I get the same behavior with a fresh setup.

Expected behavior The lighttable should be as fast as in 3.0. If GMIC stands in the way and this cannot be sorted out quickly GMIC integration should be made optional at compile time.

Platform (please complete the following information):

TurboGit commented 4 years ago

No big deal see:

4008 and #4103.

upegelow commented 4 years ago

No big deal see:

It is a big deal if your plan is to remove the zoomable light table. I am absolutely against.

phweyland commented 4 years ago
* OS: OpenSUSE Leap 15.1

I thought the gmic library was broken on opensuze. see #4067. Which version of libgmic do you have ? Have you built dt ? Any strange issue there ? Any message on the console when dt is running ?

1. (in my case about 1500 images, thumbs are already in cache)

I'm not a user of zoomable lighttable (I could become :-)), but I've tested your use case (with an 8Gb memory machine) and I don't find any speed difference with a collection of 150 images, once images are in cache.

upegelow commented 4 years ago

Which version of libgmic do you have ?

libgmic-devel-2.8.1-lp151.10.1.x86_64

Have you built dt ? Any strange issue there ? Any message on the console when dt is running ?

FWIW attached you find the output of darktable -d all. All I do within that one minute is trying to move and pan the lighttable. Everything is extremely lagging behind, reaction times in the range of several seconds on any action. Only towards the end of the exercise the lighttable reacts somewhat faster. darktable.out.txt I know that -d all is normally not of great help. But maybe the timestamp in each line might give an indication where performance is lost.

Any issues that could come from the mix of C++ code (garbage collection etc.) into the C world of darktable?

phweyland commented 4 years ago

Any issues that could come from the mix of C++ code (garbage collection etc.) into the C world of darktable?

In lighttable libgmic is not activated if you have not made use of lut3d with compressed luts. It could be an issue like that if you were on darkroom on lut3d using gmz file ...

darktable -d all

I don't see any issue with lut3d in your file. I've seen 2 images with that module, but no warning nor error message. Nothing about libgmic either (and not the issue #4067)

I see some difference between images: 37,185879 [lighttable] image expose took 0,0005 sec and 35,739927 [lighttable] image expose took 0,0236 sec The difference is huge to display thumbnails, but I don't know if that means anything.

Work is not possible any longer.

If you want a quick workaround you can uninstall libgmic and rebuild dt (after deleting build folder content). You would be free of this library. But it would be better to understand the issue you have. At least that would give some input about the role of libgmic.

pitbuster commented 4 years ago

Any issues that could come from the mix of C++ code (garbage collection etc.) into the C world of darktable?

C++ doesn't have garbage collection and usually projects doesn't automagically get slower just by introducing C++ code.

upegelow commented 4 years ago

After deleting libgmic-devel, rm -r ./buildand recompilation of current master everything is as it should, i.e. lighttable runs fast.

upegelow commented 4 years ago

Here is the output of -d all after deleting libgmic-devel. darktable.out.txt Image exposet is at 0,0005 sec which seems to be the normal on my machine.

phweyland commented 4 years ago

darktable.out.txt

I've tried to compare the 2 log files. The only difference I've found so far is about the [lighttable] image expose time (0,0005 sec -> more than 0,0250, but not always).

How much memory has your machine ? Do (did) you see significant swap running dt ?

TurboGit commented 4 years ago

@upegelow : Sorry, I sent a message on the dev mailing list (to get feedback) pointing to the discussion about removing broken storages and the zoomanble lt which is very very difficult to maintain. Everybody who responded said that they were not using the zoomable lt so yes it was decided to go ahead and removing it as part of the big rewrite drived by @AlicVB.

So it seems that we need to reconsider this decision and see how we can manage keeping you and the maintainers of this code happy :)

upegelow commented 4 years ago

How much memory has your machine ? Do (did) you see significant swap running dt ?

16GB and no swapping.

phweyland commented 4 years ago

How much memory has your machine ? Do (did) you see significant swap running dt ?

16GB and no swapping.

plenty of space then.

phweyland commented 4 years ago

Any issues that could come from the mix of C++ code (garbage collection etc.) into the C world of darktable?

C++ doesn't have garbage collection and usually projects doesn't automagically get slower just by introducing C++ code.

libgmic library has one (but I don't know the detail).

upegelow commented 4 years ago

Some further insights: there is no general slowdown of darktable. Exporting an image on CPU takes the same time regardless if gmic is loaded or not.

Some other thought. On my system the libgmic-devel install includes a file /usr/lib64/cmake/gmic/GmicTargets.cmake that seems to pull a ton of further libraries:

set_target_properties(libgmic PROPERTIES
  INTERFACE_LINK_LIBRARIES "/usr/lib64/libSM.so;/usr/lib64/libICE.so;/usr/lib64/libX11.so;/usr/lib64/libXext.so;/usr/lib64/libtiff.so;/usr/lib64/libpng.so;/usr/lib64/libz.so;/usr/lib64/libjpeg.so;GraphicsMagick++;GraphicsMagick;IlmImf-2_4;Imath-2_4;Half-2_4;Iex-2_4;IexMath-2_4;IlmThread-2_4;opencv_dnn;opencv_face;opencv_ml;opencv_objdetect;opencv_shape;opencv_stitching;opencv_superres;opencv_videostab;opencv_calib3d;opencv_features2d;opencv_highgui;opencv_videoio;opencv_imgcodecs;opencv_video;opencv_photo;opencv_imgproc;opencv_flann;opencv_core;/usr/lib64/libz.so;/usr/lib64/libcurl.so;fftw3;-lgomp;-lfftw3_threads;-lpthread"
)
AlicVB commented 4 years ago

@upegelow : I've tried to reproduce with no luck atm. What I've done to try to find where we spend the time is to put code like this one :

end = dt_get_wtime();
if(darktable.unmuted & DT_DEBUG_PERF)
    dt_print(DT_DEBUG_LIGHTTABLE, "[lighttable] IMAGE 1 took %0.04f sec\n", end - start);
start = dt_get_wtime();

everywhere in dt_view_image_expose and run dt with -d perf -d lighttable

So it will be easier to find the problematic part. I personally suspect it's somewhere after https://github.com/darktable-org/darktable/blob/1befa7d540b88db7e947f17dfe51501f9151b3b5/src/views/view.c#L1250 (and yes, there's certainly better ways for debugging ;))

phweyland commented 4 years ago

Some other thought. On my system the libgmic-devel install includes a file /usr/lib64/cmake/gmic/GmicTargets.cmake that seems to pull a ton of further libraries:

set_target_properties(libgmic PROPERTIES
  INTERFACE_LINK_LIBRARIES "/usr/lib64/libSM.so;/usr/lib64/libICE.so;/usr/lib64/libX11.so;/usr/lib64/libXext.so;/usr/lib64/libtiff.so;/usr/lib64/libpng.so;/usr/lib64/libz.so;/usr/lib64/libjpeg.so;GraphicsMagick++;GraphicsMagick;IlmImf-2_4;Imath-2_4;Half-2_4;Iex-2_4;IexMath-2_4;IlmThread-2_4;opencv_dnn;opencv_face;opencv_ml;opencv_objdetect;opencv_shape;opencv_stitching;opencv_superres;opencv_videostab;opencv_calib3d;opencv_features2d;opencv_highgui;opencv_videoio;opencv_imgcodecs;opencv_video;opencv_photo;opencv_imgproc;opencv_flann;opencv_core;/usr/lib64/libz.so;/usr/lib64/libcurl.so;fftw3;-lgomp;-lfftw3_threads;-lpthread"
)

This list is really the list of dependencies of gmic. This side seems ok. But some of them are common with dt. Would not there be here some library conflicts ? With one that view_image_expose needs in particular ?

upegelow commented 4 years ago

I think I need to do a session with gperftools in order to get closer to the issue. I hope to find time on Friday.

upegelow commented 4 years ago

All time is lost in the following section:

https://github.com/darktable-org/darktable/blob/e56b0e720c23644e8609044f3b75c57be404e96a/src/views/view.c#L1291-L1312

This is regardless of color management yes or no. However, it's closely linked to using OPENMP here: if I skip OPENMP in that part all runs normal.

For comparison:

This is for thumbnails of about 160 x 110 pixels.

Now anybody can explain what role gmic plays here and if this could happen in other code places as well?

TurboGit commented 4 years ago

@upegelow : As this indeed makes no sense let's explore one possibility.

Do you have a Linux kernel 5.3 ? It has been reported to be buggy and could produce large slowdown. Maybe the key in this story ?

upegelow commented 4 years ago

Linux version 5.1.14-lp151.5-default (geeko@buildhost) (gcc version 7.4.0 (SUSE Linux)) #1 SMP Sat Jun 22 11:55:38 UTC 2019 (f291042)

AlicVB commented 4 years ago

if this could happen in other code places as well?

If you have some time, can you by any chance do timing tests with an iop where openmp is used (with and without gmic). With opencl deactivated of course ;) So we can see if gmic break somehow all OPENMP uses or just this one...

upegelow commented 4 years ago

Here is a test in the tonecurve module:

OPENMP on:

pegelow@zaphod:~> /usr/local/test/bin/darktable --disable-opencl
[tonecurve] image processing took 0.0005 sec
[tonecurve] image processing took 0.0133 sec
[tonecurve] image processing took 0.0193 sec
[tonecurve] image processing took 0.0202 sec
[tonecurve] image processing took 0.0158 sec
[tonecurve] image processing took 0.0207 sec
[tonecurve] image processing took 0.0229 sec
[tonecurve] image processing took 0.0308 sec
[tonecurve] image processing took 0.0181 sec
[tonecurve] image processing took 0.0309 sec
[tonecurve] image processing took 0.0165 sec
[tonecurve] image processing took 0.0292 sec
[tonecurve] image processing took 0.0287 sec

OPENMP off:

pegelow@zaphod:~> /usr/local/test/bin/darktable --disable-opencl
[tonecurve] image processing took 0.0016 sec
[tonecurve] image processing took 0.0699 sec
[tonecurve] image processing took 0.0802 sec
[tonecurve] image processing took 0.0706 sec
[tonecurve] image processing took 0.0766 sec
[tonecurve] image processing took 0.0730 sec
[tonecurve] image processing took 0.0851 sec
[tonecurve] image processing took 0.0726 sec
[tonecurve] image processing took 0.0833 sec
[tonecurve] image processing took 0.0802 sec
[tonecurve] image processing took 0.0891 sec
[tonecurve] image processing took 0.0716 sec
[tonecurve] image processing took 0.0823 sec

As it should be.

TurboGit commented 4 years ago

That's probably one of the strangest issue I have seen :( Sorry no more idea at this point.

One more thing I would check is the shared libraries used in both cases. Maybe libgmic compiled in your distrib is using a different OpenMP library? Mixing would cause this slow down ? Or maybe it is compiled with another version of GCC ?

upegelow commented 4 years ago

Maybe an issue with libjpeg:

ldd-comparison.pdf

upegelow commented 4 years ago

Maybe an issue with libjpeg:

No, even if I use the same jpeg library in darktable as in gmic the slow-down remains.

parafin commented 4 years ago

Out of the interest - how much time does expose take with openmp and without gmic? For such small buffer sizes openmp probably adds more overhead than speed gain.

gmic uses openmp internally too, but it should matter only if gmic code runs at the same time as thumbnail expose is happening. Do you have any pictures developed with compressed LUTs in the displayed collection?

upegelow commented 4 years ago

Do you have any pictures developed with compressed LUTs in the displayed collection?

I don't think so. At least not willingly. FWIW on my system I have a range of LUTs installed, most of them with extension .png and a few with extension .3dl.

upegelow commented 4 years ago

Out of the interest - how much time does expose take with openmp and without gmic?

In the range of 0.0005 seconds.

parafin commented 4 years ago

So removing openmp pragma there doesn't hurt in this test case, but this is quite general code, so at different zoom level, I guess, openmp can make things faster. And anyway, as you mentioned, this effect can appear in some place else, so better debug it where it's reproducible.

My only theory is that someone from the gmic library dependencies screws up openmp somehow, there's quite a lot of them in your list. Looking over the code of gmic itself I couldn't find anything interesting. I would suggest to try to compile gmic by yourself and try to disable its dependencies one-by-one. Also you can try disabling openmp in gmic.

upegelow commented 4 years ago

I'm not sure I want to wade through that thicket. I have no experience compiling gmic and it would require tons of devel libraries. In the end I can escape the problem by just compiling darktable without libgmic-devel. Not having support for compressed LUTs is acceptable for me.

Let's then hope that this issue is specific to my system. Other users will probably not start deeper investigations and just assume that darktable sucks :(

I am not too much into this 3Dlut topic. But as far as I see we integrate gmic just to have support for compressed LUTs. On the downside this brings additional 60+ dependencies which darktable does not have by its own. To me this looks off-balance. In cases like that I'd challenge the (ususally good) decision to use the external package as it is. If this would be my work area I would at least check how to extract the needed code parts from gmic and how to implement them in darktable directly.

parafin commented 4 years ago

To be fair to gmic, almost all of those dependencies are optional, that's why I suggested to turn them off. That's just how it is packaged for your OS.

phweyland commented 4 years ago

But as far as I see we integrate gmic just to have support for compressed LUTs. On the downside this brings additional 60+ dependencies which darktable does not have by its own. To me this looks off-balance. In cases like that I'd challenge the (ususally good) decision to use the external package as it is. If this would be my work area I would at least check how to extract the needed code parts from gmic and how to implement them in darktable directly.

Makes a lot of sense. Just few highlights:

So, in fact we use 3 gmic features: compressed lut file format (gmz), decompression and 3d image format (cimgz). Beside the difficulty of extracting and translating the code, the hard point here will be to maintain future compatibility.

TurboGit commented 4 years ago

If this would be my work area I would at least check how to extract the needed code parts from gmic and how to implement them in darktable directly.

I don't agree with that as this will mean that all the maintenance will be on the darktable side. The code on the GMIC project will be changed, enhanced, tuned... and we don't want to reintegrate this all the time.

A more promising solution would be to add GMIC as a new sub-module and compile it with the limited support just needed by darktable so probably avoiding lot of dependencies.

phweyland commented 4 years ago

A more promising solution would be to add GMIC as a new sub-module and compile it with the limited support just needed by darktable so probably avoiding lot of dependencies.

Let's make a try.

upegelow commented 4 years ago
  • the goal of using compress luts is to be able to keep it inside image parameters (then the xmp file is enough to share your developed image).

But then what documentation do we have on that data structure (gmz file)? If we integrate it inside image parameters we give de facto an eternity guarantee: whatever will come, future versions of darktable will be able to fully reproduce your current results.

I don't agree with that as this will mean that all the maintenance will be on the darktable side. The code on the GMIC project will be changed, enhanced, tuned... and we don't want to reintegrate this all the time.

I hope this is not true for the gmz format. Else see above.

phweyland commented 4 years ago

But then what documentation do we have on that data structure (gmz file)? If we integrate it inside image parameters we give de facto an eternity guarantee: whatever will come, future versions of darktable will be able to fully reproduce your current results

Not exactly... In parameters we keep the compressed lut data. gmz file is only there to read the first time ready to use compressed lut. If gmz format disappears one won't be able to read new compressed luts, but the compressed lut used for an image will stay within params.

TurboGit commented 4 years ago

I hope this is not true for the gmz format. Else see above.

I'm speaking of maintenance, speed improvement, support for new format... etc. No one in the GMIC side will break the current .gpz for fun :) And anyway, having a Git sub-module does not mean we will have to use the latest master version. A sub-module is properly managed by Git and we will of course use a specific version and will upgrade only when it makes sense and only after testing.

TurboGit commented 4 years ago

BTW, this is exactly what we are doing for rawspeed for example.

parafin commented 4 years ago

distro maintainers won't be happy. So we will end up where we started - they will rip out gmic git submodule and link to the system library (with the same dependencies) instead.

TurboGit commented 4 years ago

@parafin : as we've done with the OpenCL headers. It will be optional.

phweyland commented 4 years ago

When I started to work on this, I added gmic.h in dt code. I don't remember I needed any other library to compile and use it. With my current first test the build is ok (including gmic.h). But I've a run time crash with that message when I try to use a gmz file:

$ /opt/darktable/bin/darktable
/opt/darktable/bin/darktable: symbol lookup error: /opt/darktable/lib/darktable/plugins/liblut3d.so: undefined symbol: _ZN4gmicC1Ev

Why I don't see this at build time ?

parafin commented 4 years ago

Because DT is built without -Wl,--no-undefined flag. iop plugin is a shared library, by default it is assumed, that it's OK to have unresolved symbols in it. gmic is not a header only library.

parafin commented 4 years ago

@parafin : as we've done with the OpenCL headers. It will be optional.

And how will this idea help then, if most users use packages from distributions which will use system library and therefore this bug will be present for them?

TurboGit commented 4 years ago

And how will this idea help then, if most users use packages from distributions which will use system library and therefore this bug will be present for them?

I don't now and we don't know as we don't have a single idea about the issue at the moment. It is happening for a single user at the moment. At least we will be able to provide a version with a very minimal version of GMIC built and so with less dependencies for all people building from sources. And I hope this will solve @upegelow issue and maybe it will give us an idea about what is going on.

I have nothing better to propose at the moment.

phweyland commented 4 years ago

Because DT is built without -Wl,--no-undefined flag.

Just removed them. But yet no warning / error at build time (and still crash :-)). Something is missing ...

About dependencies, looking at the Makefile (if I understand correctly) only zlib library is mandatory.

Makefile.txt

parafin commented 4 years ago

Because DT is built without -Wl,--no-undefined flag.

Just removed them. But yet no warning / error at build time (and still crash :-)). Something is missing ...

Not sure what have you removed, but that flag is missing, so you should have added it, not removed. You probably changed if(WIN32) part.

parafin commented 4 years ago

And how will this idea help then, if most users use packages from distributions which will use system library and therefore this bug will be present for them?

I don't now and we don't know as we don't have a single idea about the issue at the moment. It is happening for a single user at the moment. At least we will be able to provide a version with a very minimal version of GMIC built and so with less dependencies for all people building from sources. And I hope this will solve @upegelow issue and maybe it will give us an idea about what is going on.

I have nothing better to propose at the moment.

If the idea is to have a minimal build of gmic, then you don't need to include it in dt sources, just do a minimal build of gmic and install it instead of system version;)

phweyland commented 4 years ago

Not sure what have you removed, but that flag is missing,

My bad! So I have to find out where to put it ...

phweyland commented 4 years ago

If the idea is to have a minimal build of gmic, then you don't need to include it in dt sources, just do a minimal build of gmic and install it instead of system version;)

But in that case if the user needs to use gmic as an application we get back the compatibility issue.

My idea currently is to add the code (normally with no or very few dependencies) to dt. So it should run whatever the context. And to keep the possibility to use DefineOptions.make to remove USE_GMIC if we don't want it. I can let the conditional code to let lut3d works without GMIC.

parafin commented 4 years ago

Which problem are we trying to fix here?

If it's the bug with lighttable slowdown - then we should debug it first, not make some decisions already. @upegelow is unwilling to help right now, so I would suggest to wait for someone to reproduce the issue who is able to run the tests I proposed. Custom build of gmic was suggested as a test, not as a solution. At best it's a workaround.

If we are talking about the amount of dependencies gmic pulls and we want to reduce them, then I guess it's a question for packagers of libgmic. In Gentoo it is solved by USE-flags, and in binary distributions it's a foundational limitation. Copying code doesn't sound like a good idea, and will be actively opposed by various distributions due to their policy on bundled libraries.