darktable-org / darktable

darktable is an open source photography workflow application and raw developer
https://www.darktable.org
GNU General Public License v3.0
9.54k stars 1.13k forks source link

Black screen with some combination of modules and OpenCL #8000

Closed tomaszg7 closed 3 years ago

tomaszg7 commented 3 years ago

Describe the bug

Some combinations of modules cause rendered image to be black. From what I can tell, usually enabling haze removal, local contrast and changing saturation slider in basic adjustments does the trick. It happens only with OpenCL enabled. I was working on RAWs from Canon 6d mk2. It causes both display and exported image to be black.

I think I saw it also in some other combination of modules, but don't remember details.

It doesn't seem to be related to https://github.com/darktable-org/darktable/issues/7488 or https://github.com/darktable-org/darktable/issues/7544 as I didn't use filmic.

To Reproduce

  1. open a fresh RAW file
  2. enable OpenCL
  3. switch to "modern" chromatic adaptation mode (i.e. color calibration module)
  4. enable haze removal, local contrast
  5. change value of saturation in basic adjustments to any value other than 0 (either + or -)
  6. image turns black

Screenshots Screenshot_2021-01-26_11-52-39

Platform:

johnny-bit commented 3 years ago

Works fine on nVidia proprietary drivers for me...

tomaszg7 commented 3 years ago

I'll see if I can reproduce it on nVidia but that will take me a few days.

I just tried to reproduce it with clean install of DT and the bug also is present. I also noticed that changing vibrance slider in basic adjustments makes it go away.

Debug output of -d opencl doesn't show anything when I trigger this bug. Here's part of the full debug output for the processing run when the bug is present (I skipped parts related to SQL as I doubt they are relevant):

18.189742 [undo] record for type 2 (length 2)
18.194199 [add_job_res] 0 | 18.194204 develop process image | queue: 0 | priority: 018.194205 
18.194222 [add_job_res] 1 | 18.194224 develop process preview | queue: 0 | priority: 018.194226 
18.194254 [run_job+] 00 321053654.167644 18.194270 develop process image | queue: 0 | priority: 018.194279 
18.194259 [run_job+] 01 321053654.167649 18.194301 develop process preview | queue: 0 | priority: 018.194310 18.194302 [dev] took 0.000 secs (0.000 CPU) to load the image.
18.194327 [pixelpipe] pipeline state changing for pipe 2, flag 1

18.194337 [pixelpipe] synch top history module `bilat' for pipe 2
18.194361 [params] commit for bilat in pipe 2 with hash 601518379932840596
18.194368 [pixelpipe] pipeline state changing for pipe 4, flag 1
18.194375 [pixelpipe] synch top history module `bilat' for pipe 4
18.194390 [params] commit for bilat in pipe 4 with hash 601518379932840596
18.194395 [pixelpipe_process] [full] using device 0
[memory] before pixelpipe process
18.194412 [pixelpipe_process] [preview] using device -1
[memory] before pixelpipe process
[memory] max address space (vmpeak):   137920400 kB
[memory] cur address space (vmsize):   137535092 kB
[memory] max used memory   (vmhwm ):     1079524 kB
[memory] cur used memory   (vmrss ):      891312 kB
pixelpipe cacheline 0 used -6 by 10217969095731998712 (3884660724964329051)
pixelpipe cacheline 1 used 7 by 18446744073709551615 (18446744073709551615)
[memory] max address space (vmpeak):   137920400 kB
[memory] cur address space (vmsize):   137535092 kB
[memory] max used memory   (vmhwm ):     1079524 kB
[memory] cur used memory   (vmrss ):      891312 kB
pixelpipe cacheline 2 used 6 by 18446744073709551615 (18446744073709551615)
pixelpipe cacheline 3 used 5 by 1919920020093766188 (13496660251053497487)pixelpipe cacheline 0 
pixelpipe cacheline 4 used -4 by 13326790894302083564 (3024950956010155087)
pixelpipe cacheline 5 used 2 by 18446744073709551615 (18446744073709551615)
pixelpipe cacheline 6 used 1 by 2666001810995030712 (7228103408373517339)
pixelpipe cacheline 7 used -8 by 8918846329176418552 (799894523080248667)
cache hit rate so far: 0.059
used 2 by 3030579752522496603 (15235567581761393006)
pixelpipe cacheline 1 used 5 by 17934489101864419279 (12419110733386260474)
pixelpipe cacheline 2 used -4 by 6166992465835771578 (13496660251053497487)
pixelpipe cacheline 3 used 0 by 0 (0)18.194505 [pixelpipe] dt_dev_pixelpipe_process_rec, cache available for pipe 2 with hash 13326790894302083564

pixelpipe cacheline 4 used -7 by 8934755448759610158 (7228103408373517339)
pixelpipe cacheline 5 used 9 by 4670462743806812442 (10459335127001716783)
pixelpipe cacheline 6 used 8 by 7121084142377108671 (9051213736501569162)
pixelpipe cacheline 7 used -2 by 10017327644583579630 (3884660724964329051)
cache hit rate so far: 0.133
18.194542 [pixelpipe] dt_dev_pixelpipe_process_rec, cache available for pipe 4 with hash 6166992465835771578
18.199269 [dev_process_image] pixel pipeline processing took 0.005 secs (0.047 CPU)
18.199328 [run_job-] 00 321053654.172719 18.199335 develop process image | queue: 0 | priority: 018.199339 
18.202444 [dev_pixelpipe] took 0.008 secs (0.053 CPU) processed `display encoding' on CPU, blended on CPU [preview]
image colorspace transform RGB-->RGB took 0.010 secs (0.094 CPU) [final histogram]
final histogram took 0.001 secs (0.009 CPU)
18.214824 [dev_process_preview] pixel pipeline processing took 0.020 secs (0.158 CPU)
18.214863 [run_job-] 01 321053654.188254 18.214867 develop process preview | queue: 0 | priority: 018.214869 

Here's my OpenCL related data from Darktable startup:

0.025929 [opencl_init] opencl related configuration options:
0.025936 [opencl_init] 
0.025938 [opencl_init] opencl: 1
0.025939 [opencl_init] opencl_scheduling_profile: 'default'
0.025941 [opencl_init] opencl_library: ''
0.025943 [opencl_init] opencl_memory_requirement: 768
0.025946 [opencl_init] opencl_memory_headroom: 400
0.025948 [opencl_init] opencl_device_priority: '*/!0,*/*/*/!0,*'
0.025950 [opencl_init] opencl_mandatory_timeout: 200
0.025953 [opencl_init] opencl_size_roundup: 16
0.025955 [opencl_init] opencl_async_pixelpipe: 0
0.025956 [opencl_init] opencl_synch_cache: active module
0.025959 [opencl_init] opencl_number_event_handles: 25
0.025961 [opencl_init] opencl_micro_nap: 1000
0.025963 [opencl_init] opencl_use_pinned_memory: 0
0.025968 [opencl_init] opencl_use_cpu_devices: 0
0.025970 [opencl_init] opencl_avoid_atomics: 0
0.025971 [opencl_init] 
0.026058 [opencl_init] found opencl runtime library 'libOpenCL'
0.026076 [opencl_init] opencl library 'libOpenCL' found on your system and loaded
LoadLib(libhsa-amd-aqlprofile64.so) failed: libhsa-amd-aqlprofile64.so: cannot open shared object file: No such file or directory
0.067907 [opencl_init] found 1 platform
0.067914 [opencl_init] found 1 device
0.067937 [opencl_init] device 0 `gfx803' supports image sizes of 16384 x 16384
0.067940 [opencl_init] device 0 `gfx803' allows GPU memory allocations of up to 3481MB
[opencl_init] device 0: gfx803 
     GLOBAL_MEM_SIZE:          4096MB
     MAX_WORK_GROUP_SIZE:      256
     MAX_WORK_ITEM_DIMENSIONS: 3
     MAX_WORK_ITEM_SIZES:      [ 1024 1024 1024 ]
     DRIVER_VERSION:           3212.0 (HSA1.1,LC)
     DEVICE_VERSION:           OpenCL 1.2 
0.321026 [opencl_init] options for OpenCL compiler: -w  -DAMD=1 -I"/usr/local/stow/darktable/share/darktable/kernels"
johnny-bit commented 3 years ago

I think 2 things:

  1. The bug should show itself with -d nan (probably)
  2. the bug is in basic adjustments opencl code as optimized by amd opencl compiler
tomaszg7 commented 3 years ago

Right, I see

[dev_pixelpipe] module `basic adjustments' outputs NaNs! [full]

(the same shows also for other modules, but that's probably because they happen after basic adj.

johnny-bit commented 3 years ago

So the problem is pis... i mean basic adjustments ;)

Mark-64 commented 3 years ago

Confirmed here with nVidia driver. Funny enough, image turns back on by enabling filmic...

tomaszg7 commented 3 years ago

I also just reproduced it with nVidia driver 455.45.01 and darktable 3.4.0.

johnny-bit commented 3 years ago

So... It's on OpenCL path... Is it just in basic adj combined with local contrast & haze removal (and poossibly filmic?)? Is it with specific type of image when values get pushed into NaNs?

tomaszg7 commented 3 years ago

No, I tried it on a few rather different images, but all were from 6d mk2.

I just attempted to reproduce it on old shots from 7d and initially failed - until I changed basecurve to "canon alternate" and enabled Amaze.

I also reproduced it on sample raw from RPU: https://raw.pixls.us/getfile.php/1625/nice/Canon%20-%20EOS%206D%20Mark%20II%20-%20RAW%20(3:2).CR2 Attaching xmp file with the problem evident for me:

Canon - EOS 6D Mark II - RAW (3 2).cr2.xmp.txt

wombalton commented 3 years ago

I'm runnung a pretty old intel i5-4670S with integrated GPU and with darktable 3.4.0 i also get a black screen with opencl enabled. But there aren't any NaNs.

[...]
9,514750 [pixelpipe_process] [full] using device 0
[dev_pixelpipe] module `raw black/white point' min: (-0,003737) max: (0,396529) [full]
[dev_pixelpipe] module `white balance' min: (-0,007888) max: (0,836702) [full]
[dev_pixelpipe] module `demosaic' min: (0,000000; 0,000000; 0,000000) max: (0,000000; 0,000000; 0,000000) [full]
[dev_pixelpipe] module `denoise (profiled)' min: (0,000001; 0,000000; 0,000000) max: (0,000001; 0,000000; 0,000000) [full]
[dev_pixelpipe] module `exposure' min: (0,000411; 0,000000; 0,000000) max: (0,000411; 0,000411; 0,000411) [full]
[dev_pixelpipe] module `input color profile' min: (0,370888; 0,000000; -0,000498) max: (0,370888; 0,001565; 0,000000) [full]
[dev_pixelpipe] module `color calibration' min: (0,000376; 0,000000; 0,000000) max: (0,000376; 0,000419; 0,000550) [full]
[dev_pixelpipe] module `filmic rgb' min: (0,000152; 0,000000; 0,000000) max: (0,000152; 0,000152; 0,000152) [full]
[dev_pixelpipe] module `output color profile' min: (0,021363; 0,000000; 0,000000) max: (0,021363; 0,014374; 0,026963) [full]
16,515811 [pixelpipe_process] [preview] using device -1
[dev_pixelpipe] module `raw black/white point' min: (0,000801) max: (0,239252) [preview]
[dev_pixelpipe] module `white balance' min: (0,001274) max: (0,338625) [preview]
[dev_pixelpipe] module `demosaic' min: (0,002120; 0,000000; 0,000000) max: (0,338625; 0,239252; 0,211840) [preview]
[dev_pixelpipe] module `denoise (profiled)' min: (0,002336; 0,000000; 0,000000) max: (0,338626; 0,238220; 0,211841) [preview]
[dev_pixelpipe] module `exposure' min: (0,004338; 0,000000; 0,000000) max: (0,569675; 0,400883; 0,356537) [preview]
[dev_pixelpipe] module `input color profile' min: (3,140490; -14,317409; -38,203259) max: (71,432297; 53,728775; 42,805729) [preview]
[dev_pixelpipe] module `color calibration' min: (0,003842; 0,000000; 0,000000) max: (0,524876; 0,420461; 0,471140) [preview]
[dev_pixelpipe] module `filmic rgb' min: (0,000496; 0,000000; 0,000000) max: (0,476702; 0,376504; 0,427114) [preview]
[dev_pixelpipe] module `output color profile' min: (0,000000; 0,000000; 0,000000) max: (0,793103; 0,647944; 0,690059) [preview]

Anyway my OpenCL pipe is really slown, so I will keep it turned off.

github-actions[bot] commented 3 years ago

This issue did not get any activity in the past 30 days and will be closed in 365 days if no update occurs. Please check if the master branch has fixed it and report again or close the issue.

tomaszg7 commented 3 years ago

It's still there but with #8168 it probably is not relevant anymore. However it could still indicate a bug elsewhere in OpenCL pipeline which might pop out in other modules combinations.

koiuo commented 3 years ago

I wanted to report a new issue before I found this one.

I experience similar issue with OpenCL enabled on linux with amdgpu with darktable 3.4.

The symptoms are:

Please let me know if I should report this in a dedicated issue.

Nilvus commented 3 years ago

Please let me know if I should report this in a dedicated issue.

As you don't precise which modules you use to have such issue, it's hard to answer. If it happens with basic adjustments included in modules, it's not necessary to create a dedicated issue as we can close this one because basic adjustments will be deprecated in next release and replace by new quick access panel feature which will be released to in next release (3.6 in june).

@tomaszg7: I close this issue for this reason, as darktable team will not fix a deprecated module. The new quick access panel should solve that and will be more powerful and clean for image processing.

Nilvus commented 3 years ago

@edio: of course, if your issue is produce without basic adjustments module, it's better to create a new issue, BUT with an always reproducible step by step and all asked information to help understanding why and how it happens. Explain to if it happens with and/or without opencl and which card graphics, os and drivers used is also useful. Thanks.

cosinus90 commented 3 years ago

I face this issue with OpenCL unavailable in my system. Darktable 3.4.1.1 Windows 8.1 Actually, this version produces not fully black image but very dark one, but before updating DarkTable it was black completely. There's the image resulted https://disk.yandex.ru/i/W6PjYh4IhM6Tcw source one https://disk.yandex.ru/i/jj9dwtbFms1A5Q And this is xmp file https://disk.yandex.ru/d/m6t5vTtkupvwxg

johnny-bit commented 3 years ago

I can't replicate the problem on current master on either opencl or cpu-codepaths

Are you sure you posted the correct XMP? the file you say has problem is named IMG_20210210_171154.jpg while xmp is IMG_20210210_171140.jpg.xmp (so for different file)

cosinus90 commented 3 years ago

Sorry, I took next xmp file. Correct one is here https://disk.yandex.ru/d/cShDx1Ejmv2SrQ

johnny-bit commented 3 years ago

With and without opencl on 3.5.0+2512~g6751588ef I see this:

Zrzut ekranu z 2021-06-10 14-35-54

cosinus90 commented 3 years ago

Yes, inside DarkTable everything's fine, but try to export it.

johnny-bit commented 3 years ago

Hmmm...

OpenCL off: IMG_20210210_171154

OpenCL on: IMG_20210210_171154_01

the only thing "off" i notice is that on commandline darktable complains:

In openCL on:

[dt_ioppr_transform_image_colorspace_cl] invalid conversion from 2 to 0
[_transform_matrix] invalid conversion from 2 to 0
[dt_ioppr_transform_image_colorspace] invalid conversion from 2 to 0

with opencl off:

[_transform_matrix] invalid conversion from 2 to 0
[dt_ioppr_transform_image_colorspace] invalid conversion from 2 to 0
cosinus90 commented 3 years ago

I figured out that there was a custom preset activated in export section of the sidebar. Here I reset it and got black image https://disk.yandex.ru/d/zGKAApeoCAiRlA jpeg 8 bit all export parameters are by default.

Nilvus commented 3 years ago

I face this issue with OpenCL unavailable in my system. Darktable 3.4.1.1 Windows 8.1 Actually, this version produces not fully black image but very dark one, but before updating DarkTable it was black completely.

@cosinus90: did you try on actual master? You talk about 3.4.1.1 here and @johnny-bit tested this on actual master (3.5) so future 3.6 and can't reproduce. If you have the possibility to try if your reproduce or not in actual master, it would be better I think. Or wait about 3 weeks to test with 3.6 which should be released on early july.