darktable-org / darktable

darktable is an open source photography workflow application and raw developer
https://www.darktable.org
GNU General Public License v3.0

openCL, rusticl: Square grid artifacts show up on raw photos from phones #16717

Closed garrett closed 3 months ago

garrett commented 4 months ago

Describe the bug

I decided to try Rusticl with darktable, as ROCm OpenCL is not currently working on Fedora and I wanted to still use GPU acceleration with my AMD 7900 XTX.

Everything looked fine with Fuji and Ricoh files. When I looked at raw files from my Google Pixel 6 Pro, I noticed a tile grid that shows up on zoomed-out previews, at 1:1, and on exports too. Further investigation through decades of raw files shows that it also affects raw files from my old OnePlus 6T. Dedicated cameras, including my own older ones through the years and play raws from discuss.pixls.us, all seem to be fine, however.

Steps to reproduce

  1. Enable Rusticl.
  2. Load a raw file from a phone

Expected behavior

darktable should render the photo the same whether Rusticl OpenCL is on or off

Logfile | Screenshot | Screencast

Affected image (a snapshot I used to share a food pic with family; it has a solid color plate which shows the bug well):

image

It should look like this (this is a screenshot of the same photo with OpenCL off in darktable):

image

Screenshot crop of the affected image in darktable:

image

It affects exports too (this was exported at 1000px max wide and lower quality JPEG settings, but it affects higher quality and 1:1 exports too):

e6b91950f8883f3ef2b26f7a7e50b15430f39c3a

Commit

No response

Where did you obtain darktable from?

distro packaging

darktable version

darktable-4.6.1-5.fc40.x86_64

What OS are you using?

Linux

What is the version of your OS?

Fedora 40, Silverblue 40.20240430.1

Describe your system?

GPU: AMD 7900 XTX
CPU: AMD Ryzen 9 7950X3D × 32
RAM: 64 GB
X11, GNOME

Are you using OpenCL GPU in darktable?

Yes

If yes, what is the GPU card and driver?

Rusticl with AMD 7900 XTX

Please provide additional context if applicable. You can attach files too, but might need to rename to .txt or .zip

Dedicated camera files appear to be unaffected (Fuji, Ricoh, Canon, Nikon, Leica, Olympus). This seems to only affect phone cameras so far (Google Pixel 6 Pro, OnePlus 6T with a sideloaded Google Camera app — but not the default OnePlus camera app), in my testing.

So this might be specifically related to raws from the Google Camera app with Rusticl OpenCL. I've edited photos from this phone and camera app before with darktable and ROCm turned on using the same exact hardware and didn't see this problem.

Here's the raw file as featured in the screenshot above, compressed in a ZIP so GitHub would accept it: PXL_20240421_095931520.RAW-02.ORIGINAL.zip

jenshannoschwalm commented 4 months ago

@karolherbst another one for you. It's in the rawprepare module, in the 1f_gainmaps kernel.

karolherbst commented 4 months ago

what's the mesa version? Maybe it doesn't have the fix? Does it even have the workaround you added to darktable?

garrett commented 4 months ago

The version of Mesa installed is 24.0.6.

All specific Mesa packages in Fedora that are installed on my system are:

mesa-filesystem-24.0.6-2.fc40.x86_64
mesa-libxatracker-24.0.6-2.fc40.x86_64
mesa-va-drivers-24.0.6-2.fc40.x86_64
mesa-vulkan-drivers-24.0.6-2.fc40.x86_64
mesa-libglapi-24.0.6-2.fc40.x86_64
mesa-dri-drivers-24.0.6-2.fc40.x86_64
mesa-libgbm-24.0.6-2.fc40.x86_64
mesa-libEGL-24.0.6-2.fc40.x86_64
mesa-libGL-24.0.6-2.fc40.x86_64
mesa-libOpenCL-24.0.6-2.fc40.x86_64

jenshannoschwalm commented 4 months ago

@karolherbst I think it's something different here. I got some debugging pfm files from the rawprepare module; the issue is only evident when the special kernel handling the gain maps is in use. See 'data/kernels/basic.cl'.

karolherbst commented 4 months ago

okay, yeah and the fix I've written for the last issue is part of 24.0.6, just wanted to make sure it's a new one. Will take a look next week or so, because technically I'm off this week.

karolherbst commented 3 months ago

okay, I can reproduce the issue with rusticl on my AMD card, but not on my Intel one. Hopefully I'll be able to figure out quickly what's going on here.

jenshannoschwalm commented 3 months ago

I suspect the interpolator; in dt we only use it in this kernel.

karolherbst commented 3 months ago

looks like the output of rawprepare_1f_gainmap is the first thing that differs, and significantly enough to explain the wrong output. Will need to dig deeper into what's happening there.
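
The narrowing-down step described above (finding the first pipeline stage whose output differs between the CPU and OpenCL paths) can be sketched roughly as follows. This is a minimal illustration, not darktable's debugging tooling: the helper names, the tolerance, and the sample buffers are all made up, and it assumes each stage's output has already been dumped and loaded as a flat list of floats.

```python
def max_abs_diff(a, b):
    """Largest absolute per-sample difference between two equally sized buffers."""
    if len(a) != len(b):
        raise ValueError("buffers must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def first_divergent_stage(reference, candidate, tolerance=1e-4):
    """Walk the stages in pipeline order and return (name, diff) for the first
    one whose candidate output differs from the reference by more than
    `tolerance`. Returns None when every stage agrees."""
    for name, ref_buf in reference:
        diff = max_abs_diff(ref_buf, candidate[name])
        if diff > tolerance:
            return name, diff
    return None


# Hypothetical per-stage dumps in pipeline order (values invented for the example).
cpu_dumps = [
    ("rawprepare", [0.10, 0.20, 0.30, 0.40]),
    ("demosaic", [0.11, 0.21, 0.31, 0.41]),
]
opencl_dumps = {
    "rawprepare": [0.10, 0.25, 0.30, 0.45],
    "demosaic": [0.11, 0.27, 0.31, 0.47],
}

result = first_divergent_stage(cpu_dumps, opencl_dumps)
print(result)
```

Later stages differ as well in this made-up data, but only because they inherit the error; reporting the earliest divergent stage is what points at the kernel worth inspecting.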

karolherbst commented 3 months ago

okay figured it out. There seems to be something going wrong with samplers and on radeonsi specifically it ends up using the sampleri instead, so it ends up doing nearest filtering instead of linear.
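
To see why nearest filtering instead of linear would produce a square grid, here is a small sketch of sampling a coarse gain map at full image resolution. It only illustrates the symptom described above, not the Mesa bug or its fix, and it is not the code from basic.cl. The 2x2 gain map, the function names, and the clamp-to-edge addressing with normalized coordinates are assumptions made for the example.

```python
import math


def sample_nearest(grid, u, v):
    """Sample `grid` at normalized coordinates (u, v) in [0, 1], nearest texel."""
    rows, cols = len(grid), len(grid[0])
    x = min(cols - 1, max(0, int(u * cols)))
    y = min(rows - 1, max(0, int(v * rows)))
    return grid[y][x]


def sample_linear(grid, u, v):
    """Sample `grid` at normalized coordinates with bilinear filtering.
    Texel centers sit at (i + 0.5) / size; out-of-range reads clamp to the edge."""
    rows, cols = len(grid), len(grid[0])
    fx = u * cols - 0.5
    fy = v * rows - 0.5
    x0, y0 = math.floor(fx), math.floor(fy)
    tx, ty = fx - x0, fy - y0

    def texel(x, y):
        return grid[min(rows - 1, max(0, y))][min(cols - 1, max(0, x))]

    top = texel(x0, y0) * (1 - tx) + texel(x0 + 1, y0) * tx
    bottom = texel(x0, y0 + 1) * (1 - tx) + texel(x0 + 1, y0 + 1) * tx
    return top * (1 - ty) + bottom * ty


# Hypothetical 2x2 gain map stretched over a row of 8 image pixels.
gain_map = [[1.0, 2.0],
            [1.0, 2.0]]
width = 8
coords = [(i + 0.5) / width for i in range(width)]

nearest_row = [sample_nearest(gain_map, u, 0.5) for u in coords]
linear_row = [sample_linear(gain_map, u, 0.5) for u in coords]
print(nearest_row)
print(linear_row)
```

With nearest filtering every pixel takes the gain of whichever map cell it falls in, so the gain jumps abruptly at cell boundaries and the cell outlines show up as a grid on flat areas. With linear filtering the gain ramps between cell centers and no edges are visible.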

I think I already know why it happens, but will have to play around a bit and might come up with a fix tomorrow or so

karolherbst commented 3 months ago

Actually it was something else than I thought and kinda simple. In any case, upstream MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29230 and marked as a backport candidate.

jenshannoschwalm commented 3 months ago

Closing this as fixed upstream.