Open tomaz-suller opened 8 months ago
Forgot to add that the log resulted from me not only opening the picture, but also zooming it in until the glitch disappeared at 62% zoom.
To be entirely honest I'm not sure, and I'm not sure about how to check either; if you give any instructions I can follow them.
What I do know, as the logs and my GPU usage according to nvtop show, is that the GPU is detected and that darktable is using it.
Seems to be precisely what is going on here. The drivers load and the device is detected by both darktable and clinfo, which is not the case with Rusticl (since apparently it doesn't support APUs at all).
@tomaz-suller would you be able to check with current master? I think the issue should be gone right now; if not, please provide a fresh log with -d opencl -d pipe to investigate further.
Just tested, same behaviour still. Just to be sure I didn't mess up during the installation, here's what I installed:
darktable d496f37
Copyright (C) 2012-2024 Johannes Hanika and other contributors.
Compile options:
Bit depth -> 64 bit
Debug -> DISABLED
SSE2 optimizations -> ENABLED
OpenMP -> ENABLED
OpenCL -> ENABLED
Lua -> ENABLED - API version 9.3.0
Colord -> ENABLED
gPhoto2 -> ENABLED
GMIC -> ENABLED - Compressed LUTs are supported
GraphicsMagick -> ENABLED
ImageMagick -> DISABLED
libavif -> ENABLED
libheif -> ENABLED
libjxl -> ENABLED
OpenJPEG -> ENABLED
OpenEXR -> ENABLED
WebP -> ENABLED
And here are the logs.
The output of the first command is just /opt/darktable-test/bin/darktable --version.
To produce the logs I ran darktable with /opt/darktable-test/bin/darktable --configdir "~/.config/darktable-test" -d pipe -d opencl, imported 3 NEF files, opened two of them and zoomed in and out.
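For anyone who wants to repeat this kind of test run, here is a minimal Python sketch of the same invocation. It only uses the flags quoted above (--configdir, -d pipe, -d opencl); the helper names build_command and run_and_log and the log file name are made up for illustration. One detail worth noting: a quoted "~" is not expanded by the shell, so the sketch expands it explicitly.

```python
import os
import subprocess


def build_command(binary, configdir, debug_flags=("pipe", "opencl")):
    """Return the argv list for a darktable debug run (hypothetical helper)."""
    # Expand "~" ourselves; a quoted "~" is passed through literally by the shell.
    argv = [binary, "--configdir", os.path.expanduser(configdir)]
    for flag in debug_flags:
        argv += ["-d", flag]
    return argv


def run_and_log(argv, log_path="darktable-debug.log"):
    """Run darktable and write stdout and stderr to one log file."""
    with open(log_path, "w") as log:
        return subprocess.run(argv, stdout=log, stderr=subprocess.STDOUT, check=False)


# Example (not executed here):
# run_and_log(build_command("/opt/darktable-test/bin/darktable",
#                           "~/.config/darktable-test"))
```

Using a separate config directory keeps the test build away from the settings and library of the regular installation.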
The device offers only 1 GB of RAM, so there is a huge amount of tiling. Difficult to track down from here.
Can you somehow control the size of the dedicated RAM, maybe via BIOS settings?
Could you check with the resources=small setting, also with logs as above?
Are there any modules you can switch off and the issue goes away?
Can you please confirm that issue goes away while zooming in?
Didn't understand what you mean. I've never tried controlling it, but for sure I can't increase it since I'm running darktable on my laptop, which is the only computer I have, if that's what you're asking; frankly I don't know if it's possible to reduce it.
Still same problem. Logs are here. Just to be 100% sure, this is what you mean by resources=small, right?
I'm a beginner in darktable, and I have the default install from master, so I'm a bit clueless about what the "modules" would be. How could I go about disabling them?
Yes, the issue goes away when zooming in, at roughly the same level as before (around 38%)
In previous versions of darktable, it seemed to pick the correct device with ROCm.
At some point over the past several, either ROCm and/or darktable changed and I'd also see these glitches.
I worked around this issue by adding this snippet to /etc/environment (and logged out and back in) and now darktable works very well again with my 7900 XTX:
HSA_OVERRIDE_GFX_VERSION=11.0.0
(It would be different for different video cards; this is for the 7000 series. For example, I think HSA_OVERRIDE_GFX_VERSION=10.3.0 would be needed for the 6000 series.)
Previously, setting this was unnecessary and darktable picked the correct GPU. In other words, I'm not suggesting this as a solution, only as a temporary workaround and perhaps as a hint as to what the problem might be.
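To make the pattern in those values explicit, here is a small Python sketch. It assumes the override is just the gfx target name (as printed by rocminfo, e.g. gfx1100) with the last digit, the stepping, set to zero. That rule is my reading of the examples in this thread, not something taken from ROCm documentation, and override_for and env_with_override are hypothetical helper names.

```python
import os
import re


def override_for(gfx_target):
    """Guess an HSA_OVERRIDE_GFX_VERSION value from a gfx target name.

    Assumption: the name is "gfx" + major + one minor digit + one stepping
    digit, and the override keeps major and minor but zeroes the stepping.
    """
    m = re.fullmatch(r"gfx(\d+)([0-9a-f])([0-9a-f])", gfx_target)
    if m is None:
        raise ValueError(f"unrecognised gfx target: {gfx_target!r}")
    major, minor, _stepping = m.groups()
    return f"{int(major)}.{int(minor, 16)}.0"


def env_with_override(gfx_target, base_env=None):
    """Return a copy of the environment with the override variable set."""
    env = dict(os.environ if base_env is None else base_env)
    env["HSA_OVERRIDE_GFX_VERSION"] = override_for(gfx_target)
    return env
```

The returned dictionary can be passed as env= to subprocess.run to try the override for a single darktable session before putting anything into /etc/environment.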
I did disable the iGPU on my AMD Ryzen 9 7950X3D, and rocminfo still shows the CPU as well as my discrete GPU. So I'm guessing that darktable isn't picking the correct GPU or is trying to use them all.
After disabling the iGPU, I see this:
ROCk module is loaded
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
Mwaitx: DISABLED
DMAbuf Support: YES
==========
HSA Agents
==========
*******
Agent 1
*******
Name: AMD Ryzen 9 7950X3D 16-Core Processor
Uuid: CPU-XX
Marketing Name: AMD Ryzen 9 7950X3D 16-Core Processor
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 5759
BDFID: 0
Internal Node ID: 0
Compute Unit: 32
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: FINE GRAINED
Size: 65467276(0x3e6f38c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 65467276(0x3e6f38c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 3
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 65467276(0x3e6f38c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
*******
Agent 2
*******
Name: gfx1100
Uuid: GPU-16f2a3584821508f
Marketing Name: AMD Radeon RX 7900 XTX
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 32(0x20) KB
L2: 6144(0x1800) KB
L3: 98304(0x18000) KB
Chip ID: 29772(0x744c)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 2371
BDFID: 768
Internal Node ID: 1
Compute Unit: 96
SIMDs per CU: 2
Shader Engines: 6
Shader Arrs. per Eng.: 2
WatchPts on Addr. Ranges:4
Coherent Host Access: FALSE
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Packet Processor uCode:: 550
SDMA engine uCode:: 19
IOMMU Support:: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 25149440(0x17fc000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
Size: 25149440(0x17fc000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 3
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1100
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***
Note the 2 "agents" for ROCm; first is CPU, second is GPU. If I re-enable my iGPU, I'd probably have 3 listed.
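If you want to check quickly which agents ROCm reports on your own machine, something like the following Python sketch can summarise the rocminfo output. It relies only on the "Agent N", "Name:", "Marketing Name:" and "Device Type:" lines visible in the dump above; list_agents and SAMPLE are illustrative names, and the sample is trimmed from that dump.

```python
import re

# Trimmed from the rocminfo output above.
SAMPLE = """\
*******
Agent 1
*******
  Name:                    AMD Ryzen 9 7950X3D 16-Core Processor
  Marketing Name:          AMD Ryzen 9 7950X3D 16-Core Processor
  Device Type:             CPU
*******
Agent 2
*******
  Name:                    gfx1100
  Marketing Name:          AMD Radeon RX 7900 XTX
  Device Type:             GPU
  ISA Info:
    ISA 1
      Name:                    amdgcn-amd-amdhsa--gfx1100
"""

WANTED = ("Name", "Marketing Name", "Device Type")


def list_agents(rocminfo_text):
    """Return one dict per HSA agent with its name and device type."""
    agents = []
    current = None
    for raw in rocminfo_text.splitlines():
        line = raw.strip()
        m = re.fullmatch(r"Agent (\d+)", line)
        if m:
            current = {"agent": int(m.group(1))}
            agents.append(current)
            continue
        if current is None:
            continue
        key, sep, value = line.partition(":")
        key = key.strip()
        # Keep the first occurrence only, so the ISA "Name:" does not overwrite it.
        if sep and key in WANTED and key not in current:
            current[key] = value.strip()
    return agents


gpus = [a for a in list_agents(SAMPLE) if a.get("Device Type") == "GPU"]
```

On a real system the text would come from running rocminfo, for example with subprocess.run(["rocminfo"], capture_output=True, text=True).stdout.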
@tomaz-suller what you describe points to a memory allocation problem. Not sure yet if it's a rusticl bug or dt doing something wrong, like overallocating memory. That it still happens with the small resources preference makes a rusticl problem more likely.
Got it. Just to clarify, I'm using ROCm since rusticl doesn't support my iGPU, but the message is the same of course.
@tomaz-suller the question about modules was: which dt modules are you using? They are listed on the right side in the darkroom. Could you set demosaic to LMMSE and check for the issue? Could you disable highlights and check? This would be the test on your side to find where the issue might be coming from. The log shows that OpenCL is running fine and didn't report an issue...
OK, the AMD driver is notorious for problems :-) Maybe you can identify the bad module and we can fix it in dt. :-) It might be worth testing whether OpenCL on such a small device helps performance at all...
I have this problem as well with an integrated Vega 7, with lossy ARWs. If the interpolator is changed, the stripes have a different pattern.
Darktable: 4.6.1
Log from darktable -d pipe -d opencl:
darktable 4.6.1
Copyright (C) 2012-2024 Johannes Hanika and other contributors.
Compile options:
Bit depth -> 64 bit
Debug -> DISABLED
SSE2 optimizations -> ENABLED
OpenMP -> ENABLED
OpenCL -> ENABLED
Lua -> ENABLED - API version 9.2.0
Colord -> ENABLED
gPhoto2 -> ENABLED
GMIC -> ENABLED - Compressed LUTs are supported
GraphicsMagick -> ENABLED
ImageMagick -> DISABLED
libavif -> ENABLED
libheif -> ENABLED
libjxl -> ENABLED
OpenJPEG -> ENABLED
OpenEXR -> ENABLED
WebP -> ENABLED
See https://www.darktable.org/resources/ for detailed documentation.
See https://github.com/darktable-org/darktable/issues/new/choose to report bugs.
0.2291 [dt_get_sysresource_level] switched to 2 as `large'
0.2291 total mem: 29803MB
0.2291 mipmap cache: 3725MB
0.2292 available mem: 20373MB
0.2292 singlebuff: 465MB
0.2573 [opencl_init] opencl library 'libOpenCL' found on your system and loaded, preference 'default path'
0.4118 [opencl_init] found 3 platforms
0.4119 [check platform] platform 'rusticl' with key 'clplatform_rusticl' is NOT active
0.4257 [check platform] platform 'Portable Computing Language' with key 'clplatform_portablecomputinglanguage' is NOT active
[opencl_init] found 1 device
[dt_opencl_device_init]
DEVICE: 0: 'gfx900:xnack-'
PLATFORM, VENDOR & ID: AMD Accelerated Parallel Processing, Advanced Micro Devices, Inc., ID=4098
CANONICAL NAME: amdacceleratedparallelprocessinggfx900xnack
DRIVER VERSION: 3590.0 (HSA1.1,LC)
DEVICE VERSION: OpenCL 2.0
DEVICE_TYPE: GPU, dedicated mem
GLOBAL MEM SIZE: 2048 MB
MAX MEM ALLOC: 1741 MB
MAX IMAGE SIZE: 16384 x 16384
MAX WORK GROUP SIZE: 256
MAX WORK ITEM DIMENSIONS: 3
MAX WORK ITEM SIZES: [ 1024 1024 1024 ]
ASYNC PIXELPIPE: NO
PINNED MEMORY TRANSFER: NO
USE HEADROOM: 400Mb
AVOID ATOMICS: NO
MICRO NAP: 250
ROUNDUP WIDTH & HEIGHT 16x16
CHECK EVENT HANDLES: 128
TILING ADVANTAGE: 0.000
DEFAULT DEVICE: NO
KERNEL BUILD DIRECTORY: /usr/share/darktable/kernels
KERNEL DIRECTORY: /home/<redacted>/.cache/darktable/cached_v3_kernels_for_AMDAcceleratedParallelProcessinggfx900xnack_35900HSA11LC
CL COMPILER OPTION: -cl-fast-relaxed-math
CL COMPILER COMMAND: -w -cl-fast-relaxed-math -DAMD=1 -I"/usr/share/darktable/kernels"
KERNEL LOADING TIME: 0.0634 sec
[opencl_init] OpenCL successfully initialized. internal numbers and names of available devices:
[opencl_init] 0 'AMD Accelerated Parallel Processing gfx900:xnack-'
0.9185 [opencl_init] FINALLY: opencl is AVAILABLE and ENABLED.
[opencl_init] opencl_scheduling_profile: 'default'
[opencl_init] opencl_device_priority: '*/!0,*/*/*/!0,*'
[opencl_init] opencl_mandatory_timeout: 400
[dt_opencl_update_priorities] these are your device priorities:
[dt_opencl_update_priorities] image preview export thumbs preview2
[dt_opencl_update_priorities] 0 -1 0 0 -1
[dt_opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[dt_opencl_update_priorities] image preview export thumbs preview2
[dt_opencl_update_priorities] 0 0 0 0 0
[opencl_synchronization_timeout] synchronization timeout set to 200
[dt_opencl_update_priorities] these are your device priorities:
[dt_opencl_update_priorities] image preview export thumbs preview2
[dt_opencl_update_priorities] 0 -1 0 0 -1
[dt_opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[dt_opencl_update_priorities] image preview export thumbs preview2
[dt_opencl_update_priorities] 0 0 0 0 0
[opencl_synchronization_timeout] synchronization timeout set to 200
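Since the amount of device memory and the pinned memory setting keep coming up in this thread, here is a rough Python sketch that pulls those fields out of a -d opencl log like the one above. It assumes the "KEY: value" layout shown in this log; summarize_devices, FIELDS and SAMPLE_LOG are names invented for the example.

```python
import re

# A few lines copied from the log above.
SAMPLE_LOG = """\
[dt_opencl_device_init]
DEVICE: 0: 'gfx900:xnack-'
DEVICE VERSION: OpenCL 2.0
GLOBAL MEM SIZE: 2048 MB
MAX MEM ALLOC: 1741 MB
PINNED MEMORY TRANSFER: NO
USE HEADROOM: 400Mb
"""

FIELDS = ("GLOBAL MEM SIZE", "MAX MEM ALLOC", "USE HEADROOM", "PINNED MEMORY TRANSFER")


def summarize_devices(log_text):
    """Collect the memory-related fields for each OpenCL device in the log."""
    devices = []
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        m = re.match(r"DEVICE:\s*(\d+):\s*'([^']*)'", line)
        if m:
            current = {"id": int(m.group(1)), "name": m.group(2)}
            devices.append(current)
            continue
        if current is None:
            continue
        key, sep, value = line.partition(":")
        if sep and key.strip() in FIELDS:
            current[key.strip()] = value.strip()
    return devices
```

Comparing this summary between an affected and an unaffected machine might make it easier to see whether the glitch correlates with small device memory or with the pinned memory setting.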
This is how it looks:
With LMMSE demosaic, the fit view works.
Got it. Just to clarify, I'm using ROCm since rusticl doesn't support my iGPU, but the message is the same of course.
@tomaz-suller
Did you try out what @garrett proposed in order to make ROCm recognize your iGPU as a supported GPU, setting the HSA_OVERRIDE_GFX_VERSION environment variable?
Another guess: does the "classic" OpenCL driver work for iGPUs?
E.g. installing it by sudo amdgpu-install --usecase=graphics,opencl --opencl=rocr
I'm interested in this issue as I'm currently planning to buy a laptop (TUXEDO Pulse 14 Gen 4) featuring a Radeon 780M iGPU.
In this forum they claimed that ROCm should work for this iGPU:
Note that we use HSA_OVERRIDE_GFX_VERSION=11.0.0 because the 780m iGPU is gfx1103 (version 11.0.3) which ROCm does not support, but in my experience using the override to tell ROCm to pretend it is gfx1100 seems to work without issue.
This issue has been marked as stale due to inactivity for the last 60 days. It will be automatically closed in 300 days if no update occurs. Please check if the master branch has fixed it and report again or close the issue.
@tomaz-suller did you figure out the issue in the meantime? maybe a new driver release?
A rusticl driver seems to work properly. Try your luck if you are on Linux.
I found one way to get rid of the stripes when using AMD ROCm (at least it worked for me): https://github.com/darktable-org/darktable/issues/17210#issuecomment-2343126616 (summary: enforce pinned memory).
Anybody with the same issue, can you test if this works for you as well?
Cannot reproduce on
Describe the bug
Glitches show up in the bottom right hand corner when images are opened using darkroom with OpenCL on an AMD GPU, just as reported in #15589 (which is why I'm skipping some details, including screencast).
As requested in the original issue, I'm attaching the output of running darktable with darktable -d pipe -d opencl.
The steps I followed after opening the app to generate the log were:
Immediately after opening the image, the glitch in the bottom right hand corner appears, as is visible in the following screenshot:
Important to note that before I installed OpenCL, I didn't have any similar issue. Also, opening images on separate viewers showed no glitch, which made me exclude the possibility of the images themselves being corrupted.
Steps to reproduce
Image glitch immediately shows up on the bottom right corner of the image
Expected behavior
Show the picture with no glitches
Logfile | Screenshot | Screencast
Log after following steps to reproduce with darktable -d pipe -d opencl.
Some information about my installation from pacman:
Commit
No response
Where did you obtain darktable from?
distro packaging
darktable version
4.6.1
What OS are you using?
Linux
What is the version of your OS?
EndeavourOS Linux x86_64, Kernel 6.7.6-arch1-1 (all packages up to date)
Describe your system?
No response
Are you using OpenCL GPU in darktable?
Yes
If yes, what is the GPU card and driver?
Radeon 780M 1 GiB (integrated graphics with AMD Ryzen 7 PRO 7840U), mesa 24.0.1, rocm-opencl-runtime 6.0.0
Please provide additional context if applicable. You can attach files too, but might need to rename to .txt or .zip