darktable-org / darktable

darktable is an open source photography workflow application and raw developer
https://www.darktable.org
GNU General Public License v3.0

Darktable 3.4: OpenCL+Local Contrast corrupts image preview #7544

Closed hp48gx closed 1 year ago

hp48gx commented 3 years ago

Describe the bug

Activating OpenCL together with the "local contrast" module corrupts the displayed image.

To Reproduce

  1. fresh installation of Darktable 3.4 on macOS/Mojave.
  2. configure processing to "scene-referred" (not sure this is relevant)
  3. configure openCL as described below
  4. import a couple of images
  5. go to darkroom, activate "Local Contrast" module and play with the detail level (in laplacian mode).

If OpenCL is off, everything works fine.

  • OpenCL on + scheduling profile "default": the image in the center of the screen gets corrupted.
  • OpenCL on + scheduling profile "very fast GPU": the main image is generally fine, but the small preview on the left and the thumbnail in the filmstrip become corrupted.

note that to see a corrupt image, you may need to move the detail slider a few times and/or to click on a different picture and back. Closing darktable and reopening it has no effect. I am seeing this on some ARW images.

Screenshots

corrupt1 corrupt2 ok

Platform (please complete the following information):

parafin commented 3 years ago

This is probably a duplicate of #7488 and so "won't fix" (or rather "can't fix").

MStraeten commented 3 years ago

Did you ever try this with darktable 3.2.1? Please run /Applications/darktable.app/Contents/MacOS/darktable-cltest | grep '\[opencl_init\] device ' in a terminal to see the GPU configuration.

hp48gx commented 3 years ago

it's a dual-gpu

$ /Applications/darktable.app/Contents/MacOS/darktable-cltest | grep 'opencl_init] device'

0.048061 [opencl_init] device 1 `Iris Pro' supports image sizes of 16384 x 16384
0.048063 [opencl_init] device 1 `Iris Pro' allows GPU memory allocations of up to 384MB
[opencl_init] device 1: Iris Pro 
0.061064 [opencl_init] device 2 `GeForce GT 750M' doesn't have sm_20 support.
0.061070 [opencl_init] device 2 `GeForce GT 750M' supports image sizes of 16384 x 16384
0.061073 [opencl_init] device 2 `GeForce GT 750M' allows GPU memory allocations of up to 512MB
[opencl_init] device 2: GeForce GT 750M 

MStraeten commented 3 years ago

Seems to be at least a late 2013 Retina MacBook Pro (I had one with the same discrete GPU and it was fine with OpenCL and dt 2.4 and 2.6). Do you have current NVIDIA CUDA drivers installed? I'm pretty sure the GPU should be usable with OpenCL.

hp48gx commented 3 years ago

it's a standard (but fully up-to-date) macos installation. no additional nvidia driver was installed. do you think I should install something else?

MStraeten commented 3 years ago

Did you have darktable 3.2.1 installed? Was it OK there? Maybe even install an older 3.0 or 2.6 release (you need to rename your ~/.config/darktable directory first to start with a clean config). I'm not sure if the CUDA driver is needed - at least I had it installed. Maybe you need to tweak the OpenCL settings in ~/.config/darktable/darktablerc - I found an old issue of mine (#2384) with a crashing OpenCL setting: changing "opencl_number_event_handles" from 25 to 100 helped me.
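
The darktablerc change suggested here can also be scripted. What follows is only a minimal sketch in Python, assuming darktablerc is a flat list of key=value lines (as the grep output further down in this thread suggests) and that darktable is closed while the file is edited. The helper name set_rc_key is made up for illustration and is not part of darktable.

```python
def set_rc_key(text, key, value):
    """Return rc-file text with `key` set to `value`; append the key if absent."""
    prefix = key + "="
    lines = text.splitlines()
    found = False
    for i, line in enumerate(lines):
        # Match the full key name, so "opencl=" does not hit "opencl_micro_nap=".
        if line.startswith(prefix):
            lines[i] = prefix + value
            found = True
    if not found:
        lines.append(prefix + value)
    return "\n".join(lines) + "\n"


# Demonstration on an in-memory snippet rather than the real config file.
before = "opencl=TRUE\nopencl_number_event_handles=25\n"
after = set_rc_key(before, "opencl_number_event_handles", "100")
print(after)
```

To apply it for real, one would read ~/.config/darktable/darktablerc into a string, keep a backup copy, and write the returned text back.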

parafin commented 3 years ago

There are no CUDA or Nvidia drivers on macOS nowadays as far as I know... I suppose these are Apple drivers, no?

MStraeten commented 3 years ago

i had mine from https://www.nvidia.com/de-de/drivers/cuda/mac-driver-archive/

MStraeten commented 3 years ago

unfortunately Apple is no longer focused on OpenCL support (quite old https://support.apple.com/en-us/HT202823 doesn’t contain MBP 16 ...)

hp48gx commented 3 years ago

I think I can test 3.2.1 on a (decently powerful) windows machine. I'll let you know how it goes.

MStraeten commented 3 years ago

a windows machine doesn't help to give an indication how it works on osx ;). In general Local Contrast is fine with OpenCL - at least it works fine on my OSX configuration - so it’s important to get as much infos as possible if it doesn't work on one machine. Even different used build tools can be the cause. So first is to exclude a general problem with openCL on the machine: So if 3.2.1 or an older version was fine with OpenCL then it’s a hint that there's something in the libraries that breaks things for an older OSX version...

MStraeten commented 3 years ago

I was able to run dt 3.4 on a Late 2013 MacBook Pro 15 with Intel Iris Pro Graphics and NVIDIA GeForce GT 750M with 2 GB GDDR5 running Catalina (no CUDA drivers installed). OpenCL is fine and there are no issues with Local Contrast. So I think the issue might be specific to an older OSX version. @parafin: Did you use a different SDK to build 3.4 than for former revisions?

parafin commented 3 years ago

No, and it can't have any effect on OpenCL, I think. OpenCL kernels are compiled at runtime, so either the OpenCL drivers changed in the OS, or the OpenCL kernels were changed, or it's some interference from something else.

hp48gx commented 3 years ago

some more datapoints:

1) manually editing darktablerc and setting opencl_number_event_handles=100 made the issue harder to reproduce, but it still happens at some point. you have to insist a bit more moving the sliders

2) I noticed that just switching OpenCL off is not enough to "fix" a corrupt image. e.g.:

hp48gx commented 3 years ago

3) when an image gets corrupted, just turning off "local contrast" has no effect; however, if I force the buffer to be recomputed (e.g. I change the zoom level), the image is shown OK. Additionally, if I just change the zoom level, I can briefly see the correct picture, which is ruined shortly after.

so it seems that the bug is really in the "local contrast" module.

hp48gx commented 3 years ago

I'm not really in a position to comment on this, but my proposal for a workaround would be: add to preferences a list of "blacklisted" modules, which should not use OpenCL, even if it's enabled.

hp48gx commented 3 years ago

feel free to suggest different configuration parameters

$ grep "opencl" ~/.config/darktable/darktablerc
opencl=FALSE
opencl_async_pixelpipe=false
opencl_avoid_atomics=false
opencl_checksum=1305037760
opencl_device_priority=*/!0,*/*/*/!0,*
opencl_disable_drivers_blacklist=false
opencl_library=
opencl_mandatory_timeout=200
opencl_memory_headroom=400
opencl_memory_requirement=768
opencl_micro_nap=1000
opencl_number_event_handles=100
opencl_scheduling_profile=default
opencl_size_roundup=16
opencl_synch_cache=active module
opencl_use_cpu_devices=false
opencl_use_pinned_memory=false

MStraeten commented 3 years ago

opencl_scheduling_profile should be default - that's OK. Then you can force the system to prioritize the dedicated GPU for the full image edit: opencl_device_priority=1,0,*/!0,*/1,0,*/0,*/*

But you need to spend some time reading https://darktable-org.github.io/dtdocs/special-topics/opencl/ and play around with the parameters to find a setting that is OK for you.

You also can run darktable from terminal with /Applications/darktable.app/Contents/MacOS/darktable -d opencl to see which gpu is used:

17,925845 [pixelpipe_process] [full] using device 1
19,657808 [pixelpipe_process] [preview] using device -1
39,677829 [pixelpipe_process] [thumbnail] using device 0

so if it fails for only one of the gpu's then you can force darktable just to use the other ...
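
For longer sessions, the `-d opencl` output can be tallied instead of read by eye. This is a rough Python sketch, assuming the log lines keep the `[pixelpipe_process] [<pipe>] using device <n>` shape shown above. The function name devices_per_pipe is made up for illustration, and device -1 is read in this thread as "no OpenCL device, i.e. CPU".

```python
import re
from collections import Counter, defaultdict

# Matches lines such as "17,925845 [pixelpipe_process] [full] using device 1".
LINE_RE = re.compile(r"\[pixelpipe_process\] \[(\w+)\] using device (-?\d+)")


def devices_per_pipe(log_text):
    """Count how often each pixelpipe was run on each device number."""
    usage = defaultdict(Counter)
    for pipe, device in LINE_RE.findall(log_text):
        usage[pipe][int(device)] += 1
    return {pipe: dict(counts) for pipe, counts in usage.items()}


sample = """\
17,925845 [pixelpipe_process] [full] using device 1
19,657808 [pixelpipe_process] [preview] using device -1
39,677829 [pixelpipe_process] [thumbnail] using device 0
"""
usage = devices_per_pipe(sample)
print(usage)
```

If corruption only shows up in the pipes that ran on one particular device number, that device is the one to exclude via opencl_device_priority.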

hp48gx commented 3 years ago

Before:

0.500944 [opencl_init] here are the internal numbers and names of OpenCL devices available to darktable:
0.500946 [opencl_init]      0   'Iris Pro'
0.500948 [opencl_init]      1   'GeForce GT 750M'
81.259336 [opencl_update_enabled] enabled flag set to ON
81.259394 [pixelpipe_process] [full] using device 0
81.259394 [pixelpipe_process] [preview] using device 1

so it was using the Nvidia gpu just for the preview.

Apparently excluding the intel iris fixed the issue:

opencl_device_priority=!0,*/!0,*/!0,*/!0,*/!0,*

I now see

92.971018 [pixelpipe_process] [full] using device 1
93.568446 [pixelpipe_process] [preview] using device 1
94.668875 [pixelpipe_process] [full] using device 1
95.847380 [pixelpipe_process] [preview] using device 1
99.622465 [pixelpipe_process] [preview] using device 1
99.650736 [pixelpipe_process] [full] using device -1
100.687369 [pixelpipe_process] [full] using device 1
100.687393 [pixelpipe_process] [preview] using device -1

so, if I read correctly, it's using either the cpu or the nvidia card.

MStraeten commented 3 years ago

You might also reduce the preview size in preferences dialog - this speeds up things if the cpu is used for preview calculation.

hp48gx commented 3 years ago

One more question: from my understanding, the pixelpipe is scheduled on the first free gpu, and on the cpu if there's nothing else. so that's why in the log above sometimes I see full=1 and sometimes full=-1. Is this correct? Can this be changed so that it waits for the gpu to be free?

MStraeten commented 3 years ago

that doesn’t make sense and will cause extreme lagging since the preview is used for histogram, masks etc. waiting will slow down the system. You can force the preview to be calculated with cpu and the full pipe with your gpu: opencl_device_priority=1,!0,*/!1,!0,*/!0,*/!0,*/!0,*

hp48gx commented 3 years ago

Yes, that's expected. But speed is not an issue: my goal was simply to test (empirically) if the gpu is corrupting images. I think, according to the manual, it would be: opencl_device_priority=+1//!0,*/!0,*

I understand that probably a more realistic setting would be (as you mention) =1,!0,*//!0,*/!0,*

I'm not totally sure an empty field means "only the cpu". the manual says "no opencl device"
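
Since these priority strings are easy to mistype, it can help to split one into its fields before pasting it into darktablerc. The Python sketch below only splits the string and does not interpret it. It assumes five slash-separated fields in the order image, preview, export, thumbs, preview2 (the column order of the [opencl_priorities] log output quoted later in this thread). The name split_priority is made up for illustration, and the meaning of `!`, `*` and `+` should be taken from the manual, not from this snippet.

```python
# Field order assumed from the "[opencl_priorities]" log header in this thread.
PIPES = ("image", "preview", "export", "thumbs", "preview2")


def split_priority(value):
    """Split an opencl_device_priority string into per-pixelpipe entry lists."""
    fields = value.split("/")
    if len(fields) != len(PIPES):
        raise ValueError(f"expected {len(PIPES)} fields, got {len(fields)}")
    # An empty field becomes an empty list (no device named for that pipe).
    return {
        pipe: [entry for entry in field.split(",") if entry]
        for pipe, field in zip(PIPES, fields)
    }


parsed = split_priority("1,!0,*/!1,!0,*/!0,*/!0,*/!0,*")
for pipe, entries in parsed.items():
    print(pipe, entries)
```

A string with the wrong number of fields raises ValueError, which catches a dropped or doubled slash early.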

dim162 commented 3 years ago

Just a tip... To enable the opencl_device_priority setting, you need to use the default profile

MStraeten commented 3 years ago

i had some time to play with a late 2013 Macbook Pro 15 with Intel Iris Pro Graphics and NVIDIA GeForce GT 750M with Catalina.

  • using dt 3.4 local contrast is defect when processed with Intel GPU and fine with NVIDIA GPU
  • using dt 3.2.1 local contrast is fine with both GPU.

i tried this by forcing the full pixelpipe to use just one of these. So it seems to be no general opencl driver stuff since the setup of the machine didn't change between both tests. It’s not reproducible on my mbp16 with both GPUs

So my conclusion: something must be different between dt 3.2.1 and dt 3.4 build that affects local contrast processing.

rauno commented 3 years ago

This is also connected to the "filmic rgb"-module. Photos often get corrupted when setting the "hilights clippings" in the "reconstruct"-tab. It also gets fixed by disabling OpenCL and never happened with 3.2.1.

hp48gx commented 3 years ago

This is also connected to the "filmic rgb"-module.

Just curious: @rauno, is your hardware setup similar to mine? e.g. do you have the same GPUs (Intel+Nvidia)? If so, I'm not too surprised that the bug shows up also in a different context.

rauno commented 3 years ago

@hp48gx Yes, it's similar. Yet the selected "OpenCL scheduling profile" doesn't seem to have any effect on the bug.

rauno commented 3 years ago

Here https://1drv.ms/u/s!AnTq7iL1Zc1Zg_IOxvlT9XPLXKmYAg?e=jhNLaA is one photo that can be used for testing this bug. The corruption occurs when the "threshold"-setting (in "reconstruct"-tab in the "filmic rgb"-module) is set to -1,30 or less. Having any higher value there seem to work ok.

Corruption

And yes, for me the corruption (mostly) affects only the thumbnails.

The "local contrast"-module seems to have no effect here - no matter if it is activated or not.

github-actions[bot] commented 3 years ago

This issue did not get any activity in the past 30 days and will be closed in 365 days if no update occurs. Please check if the master branch has fixed it and report again or close the issue.

tbtalbottjr commented 3 years ago

I've been having similar issues with previews crashing in DT 3.2.1 with images using local contrast. I am using a MacBook Pro (15-inch, 2017). Identified GPUs are:

0.808922 [opencl_init] kernel loading time: 0.0131
0.808931 [opencl_init] OpenCL successfully initialized.
0.808933 [opencl_init] here are the internal numbers and names of OpenCL devices available to darktable:
0.808935 [opencl_init] 0 'Intel(R) HD Graphics 630'
0.808937 [opencl_init] 1 'AMD Radeon Pro 560 Compute Engine'
0.808939 [opencl_init] FINALLY: opencl is AVAILABLE on this system.

Based on above comments, I used the darktablerc setting:

opencl_device_priority=1,!0,*/!1,!0,*/!0,*/!0,*/!0,*

and that appears to have solved the problem. I get the output:

140.957447 [pixelpipe_process] [thumbnail] using device 1
141.164245 [pixelpipe_process] [thumbnail] using device -1

as opposed to:

212.093241 [pixelpipe_process] [thumbnail] using device 0
212.450576 [pixelpipe_process] [thumbnail] using device 1
213.344834 [default_process_tiling_cl_ptp] use tiling on module 'denoiseprofile' for image with full size 5345 x 3574
213.344858 [default_process_tiling_cl_ptp] (2 x 1) tiles with max dimensions 4532 x 3574 and overlap 14
213.344862 [default_process_tiling_cl_ptp] tile (0, 0) with 4532 x 3574 at origin [0, 0]
Magick: abort due to signal 6 (SIGABRT) "Abort"...
Abort trap: 6

Hopefully, this will work when I update to 3.4.x in a couple of months.

MStraeten commented 3 years ago

Unless someone with a similar system configuration is able to analyze this with a local build of 3.4.x, it's not very likely that anything will be different ;) The OpenCL kernels are compiled at run time, so they depend on Apple's OpenCL implementation (which is no longer in their scope since they focused on Metal).

dschneiderch commented 3 years ago

i had some time to play with a late 2013 Macbook Pro 15 with Intel Iris Pro Graphics and NVIDIA GeForce GT 750M with Catalina.

  • using dt 3.4 local contrast is defect when processed with Intel GPU and fine with NVIDIA GPU
  • using dt 3.2.1 local contrast is fine with both GPU.

i tried this by forcing the full pixelpipe to use just one of these. So it seems to be no general opencl driver stuff since the setup of the machine didn't change between both tests. It’s not reproducible on my mbp16 with both GPUs

So my conclusion: something must be different between dt 3.2.1 and dt 3.4 build that affects local contrast processing.

Just upgraded to dt 3.4 and am seeing this behavior with filmic rgb. i do not have local contrast enabled. the image was fine with 3.2.1. the export is also affected with 3.4. I turned off opencl support and the image goes back to what it should be. i'm on a late-2013 macbookpro w retina

MStraeten commented 3 years ago

Did you check https://github.com/darktable-org/darktable/issues/7544#issuecomment-752671777 to tweak gpu prioritization?

dschneiderch commented 3 years ago

Thanks, I hadn't, but changing the device priority to opencl_device_priority=1,!0,*/!1,!0,*/!0,*/!0,*/!0,* in the config file doesn't change the opencl priorities at startup. previously I had opencl_device_priority=*/!0,*/*/*/!0,* . the startup always shows the output below. as soon as I activate opencl in the settings I get

Screen Shot 2021-04-25 at 18 42 39
0.217695 [opencl_init] compiling program `negadoctor.cl' ..
0.218074 [opencl_load_program] loaded cached binary program from file '/Users/dominik/.cache/darktable/cached_kernels_for_Iris_12Feb52021214537/negadoctor.cl.bin' MD5: 'ca71adf3ce2f8814bd540260a12b8483'
0.218095 [opencl_load_program] successfully loaded program from '/Applications/darktable.app/Contents/Resources/share/darktable/kernels/negadoctor.cl' MD5: 'ca71adf3ce2f8814bd540260a12b8483'
0.218231 [opencl_build_program] successfully built program
0.218244 [opencl_build_program] BUILD STATUS: 0
0.218249 BUILD LOG:
0.218252
0.218294 [opencl_init] kernel loading time: 0.0216
0.218313 [opencl_init] OpenCL successfully initialized.
0.218316 [opencl_init] here are the internal numbers and names of OpenCL devices available to darktable:
0.218319 [opencl_init]      0   'Iris'
0.218322 [opencl_init] FINALLY: opencl is AVAILABLE on this system.

...

0.226421 [opencl_priorities] these are your device priorities:
0.226433 [opencl_priorities]        image   preview export  thumbs  preview2
0.226437 [opencl_priorities]        0   -1  0   0   0
0.226441 [opencl_priorities] show if opencl use is mandatory for a given pixelpipe:
0.226444 [opencl_priorities]        image   preview export  thumbs  preview2
0.226447 [opencl_priorities]        0   0   0   0   0

MStraeten commented 3 years ago

OK, the log indicates you don't have a dedicated GPU, so prioritization doesn't help since there's nothing to prioritize. Unless someone is able to debug and see what exactly is buggy in the OpenCL implementation, there won't be a solution from the developers - and unfortunately the OpenCL code is compiled at runtime and dependent on the drivers.
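
Whether device prioritization can help at all is visible from the device summary in the startup log. As a hedged sketch in Python (not a darktable tool): the snippet assumes the summary lines look like the `[opencl_init]      0   'Iris'` lines quoted in this thread. The name list_devices is made up for illustration.

```python
import re

# Matches summary lines such as "0.218319 [opencl_init]      0   'Iris'".
# Lines like "[opencl_init] device 1 ..." do not match: no number follows the tag.
DEVICE_RE = re.compile(r"\[opencl_init\]\s+(\d+)\s+'([^']*)'")


def list_devices(log_text):
    """Return {device_number: name} from the device summary of a darktable log."""
    return {int(num): name for num, name in DEVICE_RE.findall(log_text)}


single = """\
0.218316 [opencl_init] here are the internal numbers and names of OpenCL devices available to darktable:
0.218319 [opencl_init]      0   'Iris'
0.218322 [opencl_init] FINALLY: opencl is AVAILABLE on this system.
"""
devices = list_devices(single)
print(devices)
if len(devices) < 2:
    print("only one OpenCL device: nothing to prioritize")
```

With a single device in the result, the only remaining choice is between that device and the CPU.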

github-actions[bot] commented 2 years ago

This issue did not get any activity in the past 60 days and will be closed in 365 days if no update occurs. Please check if the master branch has fixed it and report again or close the issue.

github-actions[bot] commented 1 year ago

This issue was closed because it has been stalled for 300 days with no activity. Please check if the newest release or nightly build has it fixed. Please, create a new issue if the issue is not fixed.