darktable-org / darktable

darktable is an open source photography workflow application and raw developer
https://www.darktable.org
GNU General Public License v3.0

Longitudinal vertical stripes appear in the right half of the exported images. #16984

Open RafaelLinux opened 5 months ago

RafaelLinux commented 5 months ago

Describe the bug

After making changes to an image with DT and proceeding to export, the exported image has vertical stripes only on the right side of the image, regardless of the output format (either uncompressed TIFF or uncompressed JXL).

In order to test what could be the problem, I deactivated modules in that image, and I discovered that by deactivating the "Contrast Equalizer" module, the problem disappears.

Steps to reproduce

  1. Go to "Export"
  2. Select any export format
  3. Export image

Expected behavior

The exported image should look the same as on screen, without vertical stripes.

Logfile | Screenshot | Screencast

[Screenshots attached: %T_%Y%M%D_%H%m%S, Thumbnails]

Commit

No response

Where did you obtain darktable from?

downloaded from www.darktable.org

darktable version

4.6.1

What OS are you using?

Linux

What is the version of your OS?

openSUSE Tumbleweed

Describe your system?

Operating System: openSUSE Tumbleweed 20240611 KDE Plasma Version: 6.0.5 KDE Frameworks Version: 6.3.0 Qt Version: 6.7.1 Kernel Version: 6.9.3-1-default (64-bit) Graphics Platform: Wayland Processors: 24 × AMD Ryzen 9 3900X 12-Core Processor Memory: 31.3 GiB of RAM Graphics Processor: AMD Radeon RX 7600 Product Name: B550M Phantom Gaming 4

Are you using OpenCL GPU in darktable?

I don't know

If yes, what is the GPU card and driver?

No response

Please provide additional context if applicable. You can attach files too, but might need to rename to .txt or .zip

<?xml version="1.0" encoding="UTF-8"?>

RafaelLinux commented 5 months ago

The last time I did a test of these as you suggested, I could no longer use the version of DT I was using because the development version I tested made changes to the database, and it was no longer compatible with my version of DT.

So, as always, I have no problem testing the development version, as long as you can assure me that I will be able to continue using the database with my current version of DT.

kmilos commented 5 months ago

> So, as always, I have no problem testing the development version, as long as you can assure me that I will be able to continue using the database with my current version of DT.

You can start the development version w/ the separate .config folder and separate database (and disable sidecars as well).
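
A minimal sketch of such a separated start-up, assuming darktable's `--configdir` and `--library` options are used to keep the test instance away from the production configuration and database. The binary path, the config folder and the helper name `build_test_instance_command` are illustrative only; the example just builds and prints the command line.

```python
import shlex
from pathlib import Path


def build_test_instance_command(binary, config_dir):
    """Build a darktable command line that uses its own config folder and library.

    Both arguments are illustrative; adjust them to the actual installation.
    """
    config_dir = Path(config_dir).expanduser()
    return [
        str(binary),
        "--configdir", str(config_dir),
        # Keep the test database inside the separate config folder as well.
        "--library", str(config_dir / "library.db"),
    ]


cmd = build_test_instance_command("/opt/darktable-test/bin/darktable",
                                  "~/.config/darktable-test")
print(shlex.join(cmd))
```

The printed line can be pasted into a terminal, or the list can be handed to `subprocess.run`. Expanding `~` in Python avoids relying on the shell, which does not expand it inside quotes.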

RafaelLinux commented 5 months ago

I was unaware of this possibility. Thank you very much for providing the link on how to do it. I will try it tonight and let you know the result.

I will also attach the output file you asked for.

Thanks

kmilos commented 5 months ago

> I was unaware of this possibility. Thank you very much for providing the link on how to do it.

Always good to have more testers, hope you'll keep checking out future development versions as well 😉

RafaelLinux commented 5 months ago

That link talks about a compiled version, which should be started as /opt/darktable-test/bin/darktable --configdir "~/.config/darktable-test". However, I'll be testing the AppImage version, so will the same parameters work?

jlfrucot commented 4 months ago

Hello, using the DT 4.8.0 AppImage on Ubuntu 24.04, I'm experiencing the same issue with JPEG export, but only a few images are affected. How can I help?

RafaelLinux commented 4 months ago

> Hello, using the DT 4.8.0 AppImage on Ubuntu 24.04, I'm experiencing the same issue with JPEG export, but only a few images are affected. How can I help?

Since opening this thread I had been waiting for an "official" package from my distro; now I can say that v4.8.0 has not fixed the issue.

jlfrucot, could you try disabling filters one by one until you find out whether it's the same filter ("Contrast Equalizer") or maybe some other one?

jlfrucot commented 4 months ago

Multiple exports of the same image produce the same issue. Testing with that image in Ansel works well.

jenshannoschwalm commented 4 months ago

There was an issue in 4.8 for jpeg (in general all 8bit) exports that has been fixed in 4.8.1

You might try to compile yourself to check ...

jenshannoschwalm commented 4 months ago

Ok, this seems to be the same issue related to tiled scaling at the end of the processing pipe.

RafaelLinux commented 4 months ago

I don't know if when you mention rescaling you are referring to this case or to the referenced case when exporting to JPG, that's why I want to clarify that in this case, I haven't applied any rescaling module. The export is at the original RAW size.

RafaelLinux commented 4 months ago

When I opened this thread about the problem, I incorporated all the information I thought necessary. I have not been asked for more information.

But I was right to reply to your comment above because from what you say, something incorrect has been inferred. The word "upscayl" is the name of the original file, which, before I started processing it with Darktable, is indeed rescaled to a larger size.

In the information I attached from the beginning and in the screenshot, you can clearly see ALL the modules I have applied (none referred to scaling) and I have also specified which module is the one that seems to produce the problem. I don't know if I have not been understood, but I can explain it in another way if this is the case.

I repeat, it is only from the application of the "contrast equaliser" module that the export generates the problem. If more information is required, please let me know.

jenshannoschwalm commented 4 months ago

I have understood that you can only reproduce with that specific module.

There is another issue that would explain the garbled output; that issue has been understood and will be fixed first, as it's a major issue requiring a fast fix.

About your issue: I can't reproduce it yet. What is definitely required from you is a log file from dt 4.8 started with the options '-d pipe -d opencl'. That would help to get an idea of what's happening. By the way, there have been no other issues reported here or on pixls for that module, afaik.
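
One possible way to produce such a log file, as a minimal sketch: it assumes `darktable` is on the PATH and uses a hypothetical helper `run_and_capture` that writes everything the program prints to a text file. Only the command line is built and printed here; nothing is started.

```python
import subprocess


def debug_log_command(binary="darktable"):
    """Command line for the requested debug output ('-d pipe -d opencl')."""
    return [binary, "-d", "pipe", "-d", "opencl"]


def run_and_capture(cmd, log_path):
    """Run cmd and write its stdout and stderr to log_path (hypothetical helper)."""
    with open(log_path, "w", encoding="utf-8") as log:
        return subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT).returncode


print(" ".join(debug_log_command()))
```

Calling `run_and_capture(debug_log_command(), "dt_output.txt")`, reproducing the export and then closing darktable would leave a file that can be attached to the issue.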

RafaelLinux commented 4 months ago

In 10 hours I will be able to attach the output with this command.

In any case, it is important to keep in mind that not being able to reproduce a problem or that nobody else has reported it, does not mean that it does not exist. I say this from experience. It happened to me very recently with another application that, after an update, started to crash. I identified that it was related to non-ASCII characters in the path, but after a week of the developers not being able to reproduce the problem (despite making it clear what it was) they closed the thread. Two days later, a Portuguese user reported the same problem, for the same reason. Finally, they managed to solve it (I don't remember what changes in the code produced the problem, nor are they relevant).

And I tell this because I have only been able to discover that it is from the application of this module that the problem occurs. From there, it may be the parameters used in the module (exposed in my presentation) or a combination of modules in a particular order (I attached the XMP of the trace) but the problem persists. I will do more tests to see how far I can go to find out what is going on. I think I can clone the image with the operations performed on it and experiment from there without risk of losing subsequent changes to that module, which I want to keep.

Thank you

jenshannoschwalm commented 4 months ago

You really don't have to beg for my understanding :-)

  1. Not having earlier issues just means I did not get my hands on any log file.
  2. Your darktable OpenCL settings are missing (I can read them from your log if provided).
  3. No idea where exactly your system fails.
  4. No idea if your OpenCL is ready and good or if you are using a bad driver (there have been a number of issues with AMD rusticl drivers, and while working on dt we have found at least two bugs there ...).

So please don't insist on "it's that module" .. let's see what the logs show to investigate/proceed further.

I would also need to know: does this happen only with OpenCL or also on the CPU?

EDIT: could you provide the complete original xmp file you have been using?

RafaelLinux commented 4 months ago

Thanks for giving me an interesting hint, which I've been digging into for several hours. You mentioned that it could be related to "bad drivers", so I decided to do some more tests.

First, I checked that I had (I don't know why) all drivers enabled in DT, and only ROCm should be enabled. Then, I checked that I couldn't use openCL, for reasons unknown to me. After some research, I managed to get openCL working on my operating system (clinfo.txt) and darktable-cltest finally was able to use it (dt_cltest_output.txt).

Then, I enabled its use in DT and restarted it. I did the export again, and got the same wrong result when exporting. Finally, I tried disabling ROCm, restarting DT and exporting again. The error disappears, just like it disappeared when I disabled the module I mentioned.
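
The same comparison can be repeated outside the GUI. A minimal sketch, assuming `darktable-cli` is installed and that the core option `--disable-opencl` (passed after `--core`) forces the CPU path; the file names are placeholders, and the example only prints the two command lines.

```python
import shlex


def export_command(image, xmp, output, use_opencl):
    """Build a darktable-cli export command, optionally with OpenCL disabled."""
    cmd = ["darktable-cli", image, xmp, output]
    if not use_opencl:
        # Options after --core are passed on to the darktable core.
        cmd += ["--core", "--disable-opencl"]
    return cmd


for use_opencl in (True, False):
    out = "export_opencl.tif" if use_opencl else "export_cpu.tif"
    print(shlex.join(export_command("photo.tif", "photo.tif.xmp", out, use_opencl)))
```

If only the OpenCL export shows the stripes, that points at the GPU code path or the driver rather than at the edit itself.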

The output of darktable -d pipe -d opencl, which you asked for to get more information, is attached: dt_output.txt

I should add that I had hoped that in mathematical export processes in complex formats such as JXL, more intensive use would be made of the GPU with openCL, but it is barely used in 10% of the process. The CPU is at practically 100% load for about 80% of the export time. I don't know if this is normal or depends on other factors.

Any additional information needed, please ask me for it.

parafin commented 4 months ago

> I should add that I had hoped that in mathematical export processes in complex formats such as JXL, more intensive use would be made of the GPU with openCL, but it is barely used in 10% of the process.

darktable doesn’t reimplement any output formats that use compression, but re-uses existing implementations (because if one would have wanted to implement JXL compression on GPU, it would have made more sense to create a separate library for that, and then use it in darktable). Do you know of any open-source (or even any at all) JXL encoder libraries that run on GPU?

RafaelLinux commented 4 months ago

> darktable doesn’t reimplement any output formats that use compression, but re-uses existing implementations (because if one would have wanted to implement JXL compression on GPU, it would have made more sense to create a separate library for that, and then use it in darktable). Do you know of any open-source (or even any at all) JXL encoder libraries that run on GPU?

Well, I have taken the trouble to check the current status of this possibility with the official libraries and it only exists as a "feature request", but nothing else, and I really don't understand why, since JXL is one of the formats that requires more calculations to achieve compression. We will surely have to wait several years ...

jenshannoschwalm commented 4 months ago

The provided log shows two CL platforms, which is probably not correct and might be the reason for the issue. I don't know how you did that, but it is very likely an installation problem of the CL system.

RafaelLinux commented 4 months ago

I don't know if you mean that it shows "OpenCL 2.5" and "OpenCL 3.0" in two different places in the output of "clinfo". However, it is true that other applications like Blender or "Davinci Resolve", are making use of OpenCL on my computer without apparent problems.

jenshannoschwalm commented 4 months ago

It shows two drivers for one card in the darktable log. That's clearly not ok.

da-phil commented 3 months ago

@RafaelLinux I reported a similar issue here https://github.com/darktable-org/darktable/issues/17239 In my case the issue was solved after updating to the latest AMDGPU driver (from 19/06/2024). Did you try that as well? Can you also specify which AMDGPU driver you're using? Did you use the amdgpu-install script to set up your AMD driver setup?

RafaelLinux commented 3 months ago

Sorry it took me so long to respond. I swear I started writing the reply many days ago and something happened that I didn't get around to sending it. My version of the driver (for SLE) is not the official one for my distribution (Tumbleweed), but it works in all tests for OpenCL (like LuxMark), also with Blender and DaVinci Resolve.

rocminfo output is:

==========               
HSA Agents               
==========               
*******                  
Agent 1                  
*******                  
  Name:                    AMD Ryzen 9 3900X 12-Core Processor
  Uuid:                    CPU-XX                             
  Marketing Name:          AMD Ryzen 9 3900X 12-Core Processor
  Vendor Name:             CPU                                
  Feature:                 None specified                     
  Profile:                 FULL_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        0(0x0)                             
  Queue Min Size:          0(0x0)                             
  Queue Max Size:          0(0x0)                             
  Queue Type:              MULTI                              
  Node:                    0                                  
  Device Type:             CPU                                
  Cache Info:              
    L1:                      32768(0x8000) KB                   
  Chip ID:                 0(0x0)                             
  ASIC Revision:           0(0x0)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   3800                               
  BDFID:                   0                                  
  Internal Node ID:        0                                  
  Compute Unit:            24                                 
  SIMDs per CU:            0                                  
  Shader Engines:          0                                  
  Shader Arrs. per Eng.:   0                                  
  WatchPts on Addr. Ranges:1                                  
  Memory Properties:       
  Features:                None
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: FINE GRAINED        
      Size:                    32784212(0x1f43f54) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    32784212(0x1f43f54) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 3                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    32784212(0x1f43f54) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
  ISA Info:                
*******                  
Agent 2                  
*******                  
  Name:                    gfx1102                            
  Uuid:                    GPU-XX                             
  Marketing Name:          AMD Radeon RX 7600                 
  Vendor Name:             AMD                                
  Feature:                 KERNEL_DISPATCH                    
  Profile:                 BASE_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        128(0x80)                          
  Queue Min Size:          64(0x40)                           
  Queue Max Size:          131072(0x20000)                    
  Queue Type:              MULTI                              
  Node:                    1                                  
  Device Type:             GPU                                
  Cache Info:              
    L1:                      32(0x20) KB                        
    L2:                      2048(0x800) KB                     
  Chip ID:                 29824(0x7480)                      
  ASIC Revision:           0(0x0)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   2356                               
  BDFID:                   1792                               
  Internal Node ID:        1                                  
  Compute Unit:            32                                 
  SIMDs per CU:            2                                  
  Shader Engines:          2                                  
  Shader Arrs. per Eng.:   2                                  
  WatchPts on Addr. Ranges:4                                  
  Coherent Host Access:    FALSE                              
  Memory Properties:       
  Features:                KERNEL_DISPATCH 
  Fast F16 Operation:      TRUE                               
  Wavefront Size:          32(0x20)                           
  Workgroup Max Size:      1024(0x400)                        
  Workgroup Max Size per Dimension:
    x                        1024(0x400)                        
    y                        1024(0x400)                        
    z                        1024(0x400)                        
  Max Waves Per CU:        32(0x20)                           
  Max Work-item Per CU:    1024(0x400)                        
  Grid Max Size:           4294967295(0xffffffff)             
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)             
    y                        4294967295(0xffffffff)             
    z                        4294967295(0xffffffff)             
  Max fbarriers/Workgrp:   32                                 
  Packet Processor uCode:: 262                                
  SDMA engine uCode::      21                                 
  IOMMU Support::          None                               
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    8372224(0x7fc000) KB               
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:2048KB                             
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
      Size:                    8372224(0x7fc000) KB               
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:2048KB                             
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 3                   
      Segment:                 GROUP                              
      Size:                    64(0x40) KB                        
      Allocatable:             FALSE                              
      Alloc Granule:           0KB                                
      Alloc Recommended Granule:0KB                                
      Alloc Alignment:         0KB                                
      Accessible by all:       FALSE                              
  ISA Info:                
    ISA 1                    
      Name:                    amdgcn-amd-amdhsa--gfx1102         
      Machine Models:          HSA_MACHINE_MODEL_LARGE            
      Profiles:                HSA_PROFILE_BASE                   
      Default Rounding Mode:   NEAR                               
      Default Rounding Mode:   NEAR                               
      Fast f16:                TRUE                               
      Workgroup Max Size:      1024(0x400)                        
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                        
        y                        1024(0x400)                        
        z                        1024(0x400)                        
      Grid Max Size:           4294967295(0xffffffff)             
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)             
        y                        4294967295(0xffffffff)             
        z                        4294967295(0xffffffff)             
      FBarrier Max Size:       32                                 
*** Done ***

github-actions[bot] commented 1 month ago

This issue has been marked as stale due to inactivity for the last 60 days. It will be automatically closed in 300 days if no update occurs. Please check if the master branch has fixed it and report again or close the issue.

RafaelLinux commented 1 month ago

Even after upgrading to a more recent ROCm version, the issue is the same.

>clinfo -a | grep -i 'name\|vendor\|version\|profile'
  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.1 AMD-APP (3625.0)
  Platform Profile                                FULL_PROFILE
  Platform Name                                   AMD Accelerated Parallel Processing
  Device Name                                     gfx1102
  Device Vendor                                   Advanced Micro Devices, Inc.
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 2.0 
  Driver Version                                  3625.0 (HSA1.1,LC)
  Device OpenCL C Version                         OpenCL C 2.0 
  Device Board Name (AMD)                         AMD Radeon RX 7600
  Device Profile                                  FULL_PROFILE
    IL version                                    (n/a)
    SPIR versions                                 (n/a)
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   gfx1102
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   gfx1102
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   gfx1102
  ICD loader Name                                 Khronos OpenCL ICD Loader
  ICD loader Vendor                               Khronos Group
  ICD loader Version                              3.0.6
  ICD loader Profile                              OpenCL 3.0

da-phil commented 3 weeks ago

I'm using a slightly more outdated AMD GPU (RX 5700 XT) and have no (more) problems with stripes in images. It once happened when I was using an outdated AMD GPU driver as mentioned above already (https://github.com/darktable-org/darktable/issues/17239).

Here is my clinfo output:

  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.1 AMD-APP (3625.0)
  Platform Profile                                FULL_PROFILE
  Platform Name                                   AMD Accelerated Parallel Processing
  Device Name                                     gfx1010:xnack-
  Device Vendor                                   Advanced Micro Devices, Inc.
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 2.0 
  Driver Version                                  3625.0 (HSA1.1,LC)
  Device OpenCL C Version                         OpenCL C 2.0 
  Device Board Name (AMD)                         AMD Radeon RX 5700 XT
  Device Profile                                  FULL_PROFILE
    IL version                                    (n/a)
    SPIR versions                                 (n/a)
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   gfx1010:xnack-
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   gfx1010:xnack-
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   gfx1010:xnack-
  ICD loader Name                                 Khronos OpenCL ICD Loader
  ICD loader Vendor                               Khronos Group
  ICD loader Version                              3.0.6
  ICD loader Profile                              OpenCL 3.0

which looks a lot like yours. I assume you're also using the latest AMD driver version 24.20.3? I'm also using ROCm, although it's not supposed to work for my GPU according to the supported GPU list 😅

Here is my rocminfo: rocminfo.txt

You omitted the first part, "ROCk module version" and "HSA System Attributes".

Did you make sure that darktable is only allowed to use the ROCm OpenCL engine instead of rusticl and others?

RafaelLinux commented 3 weeks ago

This is my rocminfo output, really similar to yours: borrar-rocminfo.txt

Not sure about my AMD Driver

~> zypper se -si AMD
Loading repository data...
Reading installed packages...

S  | Name                   | Type    | Version                 | Arch   | Repository
---+------------------------+---------+-------------------------+--------+----------------------
i  | amdgpu-core            | package | 1:6.2.60200-2009582     | noarch | (System Packages)
i+ | amdgpu-install         | package | 6.1.60103-1787201       | noarch | RPMs
i  | kernel-firmware-amdgpu | package | 20241001-1.1            | noarch | (System Packages)
i  | libamd3                | package | 7.5.1-50.3              | x86_64 | Main Repository (OSS)
i  | libcamd3               | package | 7.5.1-50.3              | x86_64 | Main Repository (OSS)
i  | libccolamd3            | package | 7.5.1-50.3              | x86_64 | Main Repository (OSS)
i  | libcolamd3             | package | 7.5.1-50.3              | x86_64 | Main Repository (OSS)
i  | libdrm-amdgpu          | package | 1:2.4.120.60200-2009582 | x86_64 | (System Packages)
i  | libdrm-amdgpu-common   | package | 1.0.0.60200-2009582     | noarch | (System Packages)
i  | libdrm-amdgpu-devel    | package | 1:2.4.120.60200-2009582 | x86_64 | (System Packages)
i  | libdrm_amdgpu1         | package | 2.4.123-1.1             | x86_64 | Main Repository (OSS)
i  | libdrm_amdgpu1-32bit   | package | 2.4.123-1.1             | x86_64 | Main Repository (OSS)
i+ | libvdpau-amdgpu        | package | 6.2-2009582             | x86_64 | (System Packages)
i  | ucode-amd              | package | 20241001-1.1            | noarch | (System Packages)

And you can see I have only enabled ROCm in Darktable. [screenshot]

And I'm using SLES drivers in Tumbleweed. And, as I commented, Davinci Resolve works w/o issues with this config.

jenshannoschwalm commented 2 weeks ago

Not sure about your config still. From your earlier log:

     0.0838 [dt_get_sysresource_level] switched to 2 as `large'
     0.0838   total mem:       32015MB
     0.0838   mipmap cache:    4001MB
     0.0838   available mem:   21885MB
     0.0838   singlebuff:      500MB
     0.0914 [dt_dlopencl_init] could not find default opencl runtime library 'libOpenCL'
     0.0915 [dt_dlopencl_init] could not find default opencl runtime library 'libOpenCL.so'
     0.0917 [opencl_init] opencl library 'libOpenCL.so.1' found on your system and loaded, preference 'default path'
     0.1248 [opencl_init] found 2 platforms
     0.1248 [opencl_init] possibly a multiple platform problem for `AMD Accelerated Parallel Processing'
[opencl_init] found 2 devices

[dt_opencl_device_init]
   DEVICE:                   0: 'gfx1102'
   CONF KEY:                 cldevice_v5_amdacceleratedparallelprocessinggfx1102
   PLATFORM, VENDOR & ID:    AMD Accelerated Parallel Processing, Advanced Micro Devices, Inc., ID=4098
   CANONICAL NAME:           amdacceleratedparallelprocessinggfx1102
   DRIVER VERSION:           3614.0 (HSA1.1,LC)
   DEVICE VERSION:           OpenCL 2.0 
   DEVICE_TYPE:              GPU, dedicated mem
   GLOBAL MEM SIZE:          8176 MB
   MAX MEM ALLOC:            6950 MB
   MAX IMAGE SIZE:           16384 x 16384
   MAX WORK GROUP SIZE:      256
   MAX WORK ITEM DIMENSIONS: 3
   MAX WORK ITEM SIZES:      [ 1024 1024 1024 ]
   ASYNC PIXELPIPE:          NO
   PINNED MEMORY TRANSFER:   NO
   USE HEADROOM:             600Mb
   AVOID ATOMICS:            NO
   MICRO NAP:                250
   ROUNDUP WIDTH & HEIGHT    16x16
   CHECK EVENT HANDLES:      128
   TILING ADVANTAGE:         0.000
   DEFAULT DEVICE:           NO
   KERNEL BUILD DIRECTORY:   /usr/share/darktable/kernels
   KERNEL DIRECTORY:         /home/myuser/.cache/darktable/cached_v3_kernels_for_AMDAcceleratedParallelProcessinggfx1102_36140HSA11LC
   CL COMPILER OPTION:       -cl-fast-relaxed-math
   CL COMPILER COMMAND:      -w -cl-fast-relaxed-math  -DAMD=1 -I"/usr/share/darktable/kernels"
   KERNEL LOADING TIME:       0.0294 sec

[dt_opencl_device_init]
   DEVICE:                   1: 'gfx1102'
   CONF KEY:                 cldevice_v5_amdacceleratedparallelprocessinggfx1102
   PLATFORM, VENDOR & ID:    AMD Accelerated Parallel Processing, Advanced Micro Devices, Inc., ID=4098
   CANONICAL NAME:           amdacceleratedparallelprocessinggfx1102
   DRIVER VERSION:           3614.0 (HSA1.1,LC)
   DEVICE VERSION:           OpenCL 2.0 
   DEVICE_TYPE:              GPU, dedicated mem
   GLOBAL MEM SIZE:          8176 MB
   MAX MEM ALLOC:            6950 MB
   MAX IMAGE SIZE:           16384 x 16384
   MAX WORK GROUP SIZE:      256
   MAX WORK ITEM DIMENSIONS: 3
   MAX WORK ITEM SIZES:      [ 1024 1024 1024 ]
   ASYNC PIXELPIPE:          NO
   PINNED MEMORY TRANSFER:   NO
   USE HEADROOM:             600Mb
   AVOID ATOMICS:            NO
   MICRO NAP:                250
   ROUNDUP WIDTH & HEIGHT    16x16
   CHECK EVENT HANDLES:      128
   TILING ADVANTAGE:         0.000
   DEFAULT DEVICE:           NO
   KERNEL BUILD DIRECTORY:   /usr/share/darktable/kernels
   KERNEL DIRECTORY:         /home/myuser/.cache/darktable/cached_v3_kernels_for_AMDAcceleratedParallelProcessinggfx1102_36140HSA11LC
   CL COMPILER OPTION:       -cl-fast-relaxed-math
   CL COMPILER COMMAND:      -w -cl-fast-relaxed-math  -DAMD=1 -I"/usr/share/darktable/kernels"
   KERNEL LOADING TIME:       0.0216 sec

This part looks clearly wrong

     0.1248 [opencl_init] found 2 platforms
     0.1248 [opencl_init] possibly a multiple platform problem for `AMD Accelerated Parallel Processing'

Also your clinfo log (you only provided the first part) should show only one platform.

I don't know about DaVinci and how it checks for CL devices. Maybe they do better, and dt might do better too, but having multiple platforms for a single graphics card (likely because you have multiple driver versions installed) is not correct afaik.
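
The check described here can be automated for future logs. A minimal sketch, assuming the log keeps the `[opencl_init] found N platforms` and `CANONICAL NAME:` lines in the form quoted above; `summarize_opencl_log` is a hypothetical helper, not part of darktable.

```python
import re
from collections import Counter


def summarize_opencl_log(log_text):
    """Return the platform count and any device names initialised more than once."""
    match = re.search(r"\[opencl_init\] found (\d+) platforms?", log_text)
    platforms = int(match.group(1)) if match else None
    names = re.findall(r"CANONICAL NAME:\s*(\S+)", log_text)
    duplicates = sorted(name for name, n in Counter(names).items() if n > 1)
    return platforms, duplicates


# Shortened excerpt of the log quoted above.
sample = """\
     0.1248 [opencl_init] found 2 platforms
[dt_opencl_device_init]
   DEVICE:                   0: 'gfx1102'
   CANONICAL NAME:           amdacceleratedparallelprocessinggfx1102
[dt_opencl_device_init]
   DEVICE:                   1: 'gfx1102'
   CANONICAL NAME:           amdacceleratedparallelprocessinggfx1102
"""
platforms, duplicates = summarize_opencl_log(sample)
print(platforms, duplicates)
```

More than one platform together with a repeated canonical name is the pattern flagged as wrong in this log: one physical card exposed twice.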

da-phil commented 2 weeks ago

> This is my rocminfo output, really similar to yours: borrar-rocminfo.txt
>
> Not sure about my AMD Driver
>
> ~> zypper se -si AMD
> Loading repository data...
> Reading installed packages...
>
> S  | Name                   | Type    | Version                 | Arch   | Repository
> ---+------------------------+---------+-------------------------+--------+----------------------
> i  | amdgpu-core            | package | 1:6.2.60200-2009582     | noarch | (System Packages)
> i+ | amdgpu-install         | package | 6.1.60103-1787201       | noarch | RPMs
> i  | kernel-firmware-amdgpu | package | 20241001-1.1            | noarch | (System Packages)
> i  | libamd3                | package | 7.5.1-50.3              | x86_64 | Main Repository (OSS)
> i  | libcamd3               | package | 7.5.1-50.3              | x86_64 | Main Repository (OSS)
> i  | libccolamd3            | package | 7.5.1-50.3              | x86_64 | Main Repository (OSS)
> i  | libcolamd3             | package | 7.5.1-50.3              | x86_64 | Main Repository (OSS)
> i  | libdrm-amdgpu          | package | 1:2.4.120.60200-2009582 | x86_64 | (System Packages)
> i  | libdrm-amdgpu-common   | package | 1.0.0.60200-2009582     | noarch | (System Packages)
> i  | libdrm-amdgpu-devel    | package | 1:2.4.120.60200-2009582 | x86_64 | (System Packages)
> i  | libdrm_amdgpu1         | package | 2.4.123-1.1             | x86_64 | Main Repository (OSS)
> i  | libdrm_amdgpu1-32bit   | package | 2.4.123-1.1             | x86_64 | Main Repository (OSS)
> i+ | libvdpau-amdgpu        | package | 6.2-2009582             | x86_64 | (System Packages)
> i  | ucode-amd              | package | 20241001-1.1            | noarch | (System Packages)
>
> And you can see I have only enabled ROCm in Darktable. [screenshot]
>
> And I'm using SLES drivers in Tumbleweed. And, as I commented, Davinci Resolve works w/o issues with this config.

In my case I'm using amdgpu-install version 6.2.60203, but this shouldn't be the issue, as you reported flawless operation with other tools utilizing OpenCL, such as DaVinci Resolve.

When I execute darktable-cltest, I get the following output with only one OpenCL GPU device: darktable-cltest.txt

I find it strange that darktable recognizes two devices in your output, although clinfo and rocminfo both did not report two devices. Did you enable several OpenCL engines in darktable when you posted the darktable-cltest output?

jenshannoschwalm commented 2 weeks ago

> I find it strange that darktable recognizes two devices in your output

It's not strange, as his log shows there are two platforms, as mentioned above. So the one hardware device is initialized on two platforms, resulting in two available devices, which is clearly wrong and leads to issues.

This almost always is the result of a bad installation or having used "dirty" tricks to make "it running". We have seen this before especially on "arch" based distributions ...

So what to do?

  1. Properly configuring OpenCL would be the best option before suspecting a defective driver.
  2. You might also simply disable the second found device, that would probably work. See https://darktable-org.github.io/dtdocs/en/special-topics/mem-performance/ about device specific settings.