darktable-org / darktable

darktable is an open source photography workflow application and raw developer
https://www.darktable.org
GNU General Public License v3.0

Caching HLR output is not working #13184

Closed darkbasic closed 1 year ago

darkbasic commented 1 year ago

NOTE: this was originally an FR, which I converted to a bug report. I didn't want to complain about the slowness, because I'm using a very demanding algorithm with very slow parameters on a non-optimized architecture: it's OK to be slow. I wanted to ask if it was possible to compute only once and cache the results, and I honestly didn't know caching was already implemented. Apparently something is wrong with caching on my machine.

To Reproduce

  1. Open a photo in darkroom
  2. Enable HLR
  3. Set method to guided laplacians
  4. Set crazy slow parameters like 64 iterations and 2048px diameter
  5. Wait for it to compute
  6. Create a snapshot
  7. Activate the snapshot
  8. Watch darktable recompute everything (all cores to 100% plus lots of noise from the fans)

Expected behavior

Once computed, it should reuse the cached data and not recompute everything from scratch. Snapshots are just an example; I can provide other examples where it doesn't use the cached content and recomputes everything from scratch. With slow algorithms like GL it's easy to notice.
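For illustration only (hypothetical names, not darktable's actual code), a cache like this is typically keyed by a hash of everything that influenced a stage's output: the image, the parameters of every module up to that stage, and the requested region/zoom. On a hash match the stored buffer is returned instead of reprocessing:

/* Hypothetical sketch of a hash-keyed stage cache -- not darktable's
 * code. A buffer is reusable only if the hash of everything upstream
 * (image id, module parameters, region of interest) is unchanged;
 * otherwise the stage (e.g. HLR) must run again. */
#include <stddef.h>
#include <stdint.h>

typedef struct cache_line_t
{
  uint64_t hash; /* hash of image + upstream params + ROI/zoom */
  void *buf;     /* cached module output, NULL if the slot is free */
  size_t size;   /* buffer size in bytes */
} cache_line_t;

/* Return the cached buffer on a hash match, or NULL on a miss. */
static void *cache_get(cache_line_t *lines, int n, uint64_t hash)
{
  for(int i = 0; i < n; i++)
    if(lines[i].buf && lines[i].hash == hash)
      return lines[i].buf; /* hit: skip recomputing this stage */
  return NULL; /* miss: recompute and store the result */
}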

Platform

darktable version: 4.3.0+12~g36a296a1b-dirty
OS: Linux - kernel 6.0.14
Linux - Distro: Gentoo ppc64le (4K page size)
Memory: 64 GB
Graphics card: AMD RX 570
Graphics driver: mesa git karolherbst/mesa:rusticl/si
OpenCL installed: rusticl
OpenCL activated: no
Xorg: Wayland
Desktop: Gnome 43
GTK+: 3.24.35
gcc: 11.3.1_p20221209
cflags: -O2 -pipe -mcpu=power9 -mtune=power9
CMAKE_BUILD_TYPE: RELWITHDEBINFO

Old feature request:

Is your feature request related to a problem? Please describe.

I like the new scene-referred workflow, but sometimes highlight reconstruction can be troublesome. With sunsets I don't like inpaint opposed because it flattens the colors: the warm sky gets replaced by pure white. Filmic RGB's highlight reconstruction isn't enough because of the magenta highlights, even with full bloom and gray balances. At that point the only option is guided laplacians, but to get rid of the magenta highlights you sometimes end up with crazy settings like 64 iterations + 512 px diameter + 100% inpaint a flat color. Don't get me wrong: the result is perfect, but it takes several minutes to compute. What's even worse, every time you switch between photos, take a snapshot, or make another modification, it freezes for several minutes.

Describe the solution you'd like

Guided laplacians take time to compute and I'm fine with that, but since HLR sits at the very beginning of the pixelpipe it makes more sense to cache its output so that subsequent modules can work on top of the cached data.

Alternatives

The only viable alternative for me is to disable the HLR module entirely and re-enable it before exporting (and then re-disable it), which is quite annoying.

Additional context

I'm on a ppc64le workstation and I had to temporarily disable OpenCL due to a bug in Rusticl for radeonsi, so my code paths are probably less than optimal. Nevertheless it's a 64-thread machine and probably faster than most laptops despite the not-so-optimized code paths. Guided laplacians are simply too slow not to be cached.

Nilvus commented 1 year ago

Please provide the requested information. You are missing one of the most important items: the darktable release you use. Guided laplacians are nearly unusable for me because they need too many resources (which is what you mean by your issue). Caching could be an alternative, but since 4.2.0 was released yesterday there are now two new HLR modes that are simply better than the old ones and faster than guided laplacians. They also work really well and are effective on nearly all images. Only a few images should still need something like guided laplacians now.

darkbasic commented 1 year ago

I'm on git master. I'm not 100% sure the new HLR modes are effective enough to replace guided laplacians, but I'll try them again and let you know.

jenshannoschwalm commented 1 year ago

It is my understanding that darktable already uses a cache for the subsequent modules.

It does. The size of the cache depends on the available memory though.

darkbasic commented 1 year ago

It is my understanding that darktable already uses a cache for the subsequent modules.

It does. The size of the cache depends on the available memory though.

Apparently it doesn't work on my machine. I have 64 GB of RAM and I've set resources to large.

To Reproduce

  1. Open a photo in darkroom
  2. Enable HLR
  3. Set method to guided laplacians
  4. Set crazy slow parameters like 64 iterations and 2048px diameter
  5. Wait for it to compute
  6. Create a snapshot
  7. Activate the snapshot
  8. Watch darktable recompute everything (all cores to 100% plus lots of noise from the fans)

I didn't change the zoom level.

TurboGit commented 1 year ago

@darkbasic : That's not a cache issue in pixelpipe but in internal snapshot module. See #13179.

darkbasic commented 1 year ago

@TurboGit unfortunately, as soon as I activate the snapshot it still recomputes the whole image:

159.871904 [histogram] took 0.003 secs (0.003 CPU) scope draw
160.011877 [histogram] took 0.003 secs (0.003 CPU) scope draw
160.056364 [dev] took 0.000 secs (0.000 CPU) to load the image.
160.087893 [dev] took 0.000 secs (0.000 CPU) to load the image.
160.248756 [dev_pixelpipe] took 0.003 secs (0.000 CPU) initing base buffer [full]
160.256918 [dev_pixelpipe] took 0.008 secs (0.115 CPU) [full] processed `raw black/white point' on CPU, blended on CPU
160.262386 [dev_pixelpipe] took 0.005 secs (0.037 CPU) [full] processed `white balance' on CPU, blended on CPU
194.820453 [dev_pixelpipe] took 34.558 secs (1005.603 CPU) [full] processed `highlight reconstruction' on CPU, blended on CPU
195.298744 [dev_pixelpipe] took 0.478 secs (13.209 CPU) [full] processed `demosaic' on CPU, blended on CPU
195.821629 [dev_pixelpipe] took 0.523 secs (14.285 CPU) [full] processed `denoise (profiled)' on CPU, blended on CPU
195.935992 [dev_pixelpipe] took 0.114 secs (2.718 CPU) [full] processed `lens correction' on CPU, blended on CPU
196.082756 [dev_pixelpipe] took 0.147 secs (2.423 CPU) [full] processed `chromatic aberrations' on CPU, blended on CPU
196.085427 [dev_pixelpipe] took 0.003 secs (0.000 CPU) [full] processed `exposure' on CPU, blended on CPU
196.134734 [dev_pixelpipe] took 0.049 secs (0.498 CPU) [full] processed `tone equalizer' on CPU, blended on CPU
196.143219 [dev_pixelpipe] took 0.008 secs (0.213 CPU) [full] processed `input color profile' on CPU, blended on CPU
image colorspace transform Lab-->RGB took 0.007 secs (0.136 CPU) [channelmixerrgb ]
196.208869 [dev_pixelpipe] took 0.066 secs (1.523 CPU) [full] processed `color calibration' on CPU, blended on CPU
197.248485 [dev_pixelpipe] took 1.040 secs (26.016 CPU) [full] processed `diffuse or sharpen out of focus subject' on CPU, blended on CPU
197.836136 [dev_pixelpipe] took 0.588 secs (12.924 CPU) [full] processed `diffuse or sharpen everything else' on CPU, blended on CPU
198.055327 [dev_pixelpipe] took 0.219 secs (5.801 CPU) [full] processed `color balance rgb' on CPU, blended on CPU
198.164259 [dev_pixelpipe] took 0.109 secs (2.928 CPU) [full] processed `filmic rgb' on CPU, blended on CPU
image colorspace transform RGB-->Lab took 0.007 secs (0.220 CPU) [colorout ]
198.184923 [dev_pixelpipe] took 0.021 secs (0.529 CPU) [full] processed `output color profile' on CPU, blended on CPU
198.196136 [dev_pixelpipe] took 0.011 secs (0.247 CPU) [full] processed `display encoding' on CPU, blended on CPU
198.200091 [dev_process_image] pixel pipeline processing took 37.957 secs (1089.080 CPU)
198.794743 [histogram] took 0.003 secs (0.003 CPU) scope draw
198.931605 [histogram] took 0.003 secs (0.003 CPU) scope draw

I'm on 4.3.0+18~gf45cb8e6d-dirty + your patch.

darkbasic commented 1 year ago

Even switching between two duplicates of the same shot in darkroom triggers recomputation. I have 'enable disk backend for full preview cache' enabled.

TurboGit commented 1 year ago

@TurboGit unfortunately, as soon as I activate the snapshot it still recomputes the whole image:

That's certainly not what I see on my side, strange.

TurboGit commented 1 year ago

I have 'enable disk backend for full preview cache' enabled.

That's not in the equation... The cache is internal to the snapshot module, where the actual cairo image surface is kept and used to display the snapshot without needing to recompute anything. Note that the computation is done at least once, and again each time the zoom is changed or some panning is done. But after that, selecting/un-selecting the snapshot just displays it and no recomputation is done.
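A minimal sketch of that kind of cache (illustrative only; these are not darktable's actual names): the snapshot keeps the rendered cairo surface together with the view parameters it was rendered for, and re-renders only when zoom or pan has changed.

/* Illustrative sketch, not darktable's code; all names hypothetical.
 * The snapshot keeps its rendered cairo surface plus the view
 * parameters it matches; a re-render happens only on zoom/pan. */
#include <cairo.h>
#include <stdbool.h>

typedef struct snapshot_cache_t
{
  cairo_surface_t *surface;  /* last rendered snapshot image */
  double zoom, pan_x, pan_y; /* view parameters the surface matches */
  bool valid;
} snapshot_cache_t;

static cairo_surface_t *snapshot_get(snapshot_cache_t *c, double zoom,
                                     double pan_x, double pan_y,
                                     cairo_surface_t *(*render)(void))
{
  if(c->valid && c->zoom == zoom && c->pan_x == pan_x && c->pan_y == pan_y)
    return c->surface; /* cache hit: just display, nothing to recompute */

  if(c->surface) cairo_surface_destroy(c->surface);
  c->surface = render(); /* the expensive pixelpipe run happens only here */
  c->zoom = zoom;
  c->pan_x = pan_x;
  c->pan_y = pan_y;
  c->valid = true;
  return c->surface;
}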

darkbasic commented 1 year ago

Note that the computation is done at least once, and again each time the zoom is changed or some panning is done.

I didn't zoom or pan. I simply clicked on 'take snapshot' (no recomputation happens) and then I activated the snapshot (which recomputes everything).

That's not in the equation...

Shouldn't it pertain to the case where I switch between different shots in darkroom? If it keeps the full preview cache, it shouldn't recompute the next time I reopen the same shot.

TurboGit commented 1 year ago

I didn't zoom or pan. I simply clicked on 'take snapshot' (no recomputation happens) and then I activated the snapshot (which recomputes everything).

Yes, the first time of course. This is needed. But the next time you won't have this recomputation.

TurboGit commented 1 year ago

Shouldn't it pertain to the case where I switch between different shots in darkroom?

Again no. As I said (now multiple times), the cache is in the module itself (in memory) and has nothing to do with thumb caching.

darkbasic commented 1 year ago

Yes, the first time of course. This is needed. But the next time you won't have this recomputation.

In that case your patch does indeed do what it's supposed to. I thought the first computation wasn't needed because that is exactly the current pixelpipe in darkroom, but apparently the snapshot module isn't able to take advantage of that.

parafin commented 1 year ago

I think a totally matching snapshot is not a useful use case to optimize for ;)

darkbasic commented 1 year ago

I think a totally matching snapshot is not a useful use case to optimize for ;)

Isn't it always supposed to be totally matching? In order to create a snapshot I first have to click on some point in the history, which recomputes the image at that point -> the snapshot is totally matching.

parafin commented 1 year ago

I mean displaying the snapshot over the same image (in the same state) as was snapshotted. Usually you compare with some different state, at which point you can't use the existing pipeline for the snapshot.

darkbasic commented 1 year ago

I mean displaying the snapshot over the same image (in the same state) as was snapshotted. Usually you compare with some different state, at which point you can't use the existing pipeline for the snapshot.

Sure, what I meant was copying the current pipeline cache into the snapshot module (so that any further modification won't overwrite it), but I have no idea how easy/feasible that would be.
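A rough sketch of that idea (hypothetical names, not a real darktable patch): at snapshot creation, deep-copy the pixels already present in the darkroom display surface into a surface the snapshot owns, so that later pipeline runs cannot overwrite it.

/* Illustrative only; not darktable's API. At snapshot creation,
 * deep-copy the currently displayed buffer so the snapshot owns its
 * pixels and later pipeline runs cannot overwrite them. */
#include <cairo.h>

static cairo_surface_t *snapshot_copy_current(cairo_surface_t *display)
{
  const int w = cairo_image_surface_get_width(display);
  const int h = cairo_image_surface_get_height(display);
  const cairo_format_t fmt = cairo_image_surface_get_format(display);

  cairo_surface_t *copy = cairo_image_surface_create(fmt, w, h);

  /* Blit the display surface into the private copy. */
  cairo_t *cr = cairo_create(copy);
  cairo_set_source_surface(cr, display, 0, 0);
  cairo_paint(cr);
  cairo_destroy(cr);

  cairo_surface_flush(copy);
  return copy; /* caller owns it; free with cairo_surface_destroy() */
}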