darktable-org / darktable

darktable is an open source photography workflow application and raw developer
https://www.darktable.org
GNU General Public License v3.0

Image preview glitched in lighttable culling layout at 100% #15589

Closed davidak closed 6 months ago

davidak commented 8 months ago

Describe the bug

See demo: https://www.youtube.com/watch?v=oKUt91cf7dI

Screenshot from 2023-11-06 20-26-54

I am not running the latest version of darktable or of my OS. I can check whether an update fixes the issue. You don't have to look into it now, but maybe it is a known issue. For now, this report is for reference.

Steps to reproduce

  1. I have two 32-bit float TIFF images of about 96.5 MP (12032x8024), 1.5 GB each, opened in darktable with the same edit applied.
  2. I select both in lighttable mode.
  3. I enter the culling layout and zoom in to 100%.
  4. In the left image, the entire lower-right quarter is displayed only as horizontal stripes.
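
For scale, the figures in step 1 can be checked with a little shell arithmetic. This is only a rough sketch: it assumes 4 bytes per 32-bit float sample, and the channel count (3 or 4) is a guess that the report does not state.

```shell
# Rough size estimate for one 12032x8024 image stored as 32-bit float.
# Assumptions (not from the report): 4 bytes per sample, 3 or 4 channels.
width=12032
height=8024
pixels=$((width * height))
bytes_rgb=$((pixels * 3 * 4))    # three channels, 4 bytes each
bytes_rgba=$((pixels * 4 * 4))   # four channels, 4 bytes each
echo "pixels:     $pixels"
echo "3 channels: $bytes_rgb bytes"
echo "4 channels: $bytes_rgba bytes"
```

That gives 96544768 pixels, and 1544716288 bytes (about 1.5 GB) under the four-channel assumption, which is in line with the reported file size. Two such buffers held at full resolution would already need roughly 3 GB.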

I can zoom out a bit without the glitch disappearing, but when I zoom out further, it renders correctly, as if the rendering for that zoom level were cached and corrupted.

It is still reproducible after I close and reopen the program.


When I delete the cache at ~/.cache/darktable and open darktable again, the images have to be rendered again. The glitch is gone, but now the other image is mostly black.
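
A possible variation on the cache deletion above is to move the directory aside rather than delete it, so the corrupted renders stay available for inspection. The helper below is a hypothetical sketch (the name `set_aside_cache` and the `.bak.<timestamp>` suffix are made up for illustration), and it assumes darktable is not running while the directory is moved.

```shell
# Move a cache directory aside instead of deleting it, so it can be
# inspected or restored later. Prints the backup path on success.
# Hypothetical helper; the backup naming scheme is an assumption.
set_aside_cache() {
  cache_dir=$1
  backup="${cache_dir}.bak.$(date +%Y%m%d%H%M%S)"
  if [ -d "$cache_dir" ]; then
    mv "$cache_dir" "$backup" && echo "$backup"
  else
    echo "no such directory: $cache_dir" >&2
    return 1
  fi
}
# Usage: set_aside_cache "$HOME/.cache/darktable"
```

darktable should then rebuild the cache on the next start, as it did after the deletion described above.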

At the time of error, dmesg shows this:

[Mon Nov 6 21:14:05 2023] amdgpu: init_user_pages: Failed to get user pages: -14
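
For reference, -14 in a kernel message like this one is the negated Linux errno EFAULT ("Bad address"). To pull only the driver's messages out of a long kernel log, a small filter along these lines may help; the function name `filter_amdgpu` is made up for this sketch.

```shell
# Keep only lines that mention amdgpu from a kernel log read on stdin.
# Hypothetical helper name; it simply wraps a case-insensitive grep.
filter_amdgpu() {
  grep -i 'amdgpu'
}
# Usage (reading the kernel log may require root on some systems):
#   dmesg -T | filter_amdgpu
```

Lines of a warning that do not mention the driver by name, such as register dumps, are dropped by this filter, so the full log is still worth attaching to a report.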

The view at 30% zoom is fine, but beyond 32% it is corrupted.

Demo: https://www.youtube.com/watch?v=TgbdpDnU_ic

That's always reproducible.

Expected behavior

No glitches in the displayed images.

Logfile | Screenshot | Screencast

No helpful logs from darktable:

[davidak@gaming:~]$ darktable -d verbose
wait time 0.367537s
try- wait time 0.249821s
wait time 0.476149s
try+ wait time 0.346600s mode r 

The amdgpu driver seems to have crashed when the glitched render was created. There are no new occurrences when opening darktable and zooming in again.

dmesg -T:

[Mon Nov  6 20:11:15 2023] Invalid BO not marked invalid
[Mon Nov  6 20:11:15 2023] WARNING: CPU: 4 PID: 725464 at drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c:2533 amdgpu_amdkfd_restore_userptr_worker+0x520/0x5e0 [amdgpu]
[Mon Nov  6 20:11:15 2023] Modules linked in: snd_seq_dummy snd_seq tls rfcomm cmac algif_hash algif_skcipher af_alg bnep btusb btrtl btbcm btintel btmtk bluetooth ecdh_generic ecc exfat uas usb_storage qrtr nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype br_netfilter overlay af_packet xt_CHECKSUM xt_MASQUERADE ipt_REJECT nf_reject_ipv4 nft_chain_nat nf_nat rfkill msr snd_sof_pci_intel_cnl snd_sof_intel_hda_common snd_soc_hdac_hda soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils soundwire_bus snd_soc_skl snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi snd_hda_codec_realtek snd_soc_core snd_hda_codec_generic intel_rapl_msr intel_rapl_common ledtrig_audio snd_compress intel_tcc_cooling snd_hda_codec_hdmi ac97_bus snd_usb_audio snd_pcm_dmaengine snd_hda_intel x86_pkg_temp_thermal intel_powerclamp coretemp snd_intel_dspcfg crc32_pclmul polyval_clmulni snd_intel_sdw_acpi polyval_generic ip6_tables gf128mul
[Mon Nov  6 20:11:15 2023]  snd_hda_codec snd_usbmidi_lib ghash_clmulni_intel sha512_ssse3 sha512_generic snd_rawmidi snd_seq_device aesni_intel snd_hda_core mc libaes cmdlinepart crypto_simd snd_hwdep spi_nor cryptd r8169 iTCO_wdt snd_pcm intel_pmc_bxt watchdog mei_hdcp mei_pxp xt_conntrack rapl ee1004 mtd mfd_core realtek snd_timer mdio_devres nf_conntrack mei_me intel_cstate input_leds evdev snd mousedev led_class joydev nf_defrag_ipv6 i2c_i801 spi_intel_pci mac_hid nf_defrag_ipv4 libphy intel_uncore soundcore gigabyte_wmi i2c_smbus intel_wmi_thunderbolt spi_intel wmi_bmof mei intel_pch_thermal edac_core thermal fan ip6t_rpfilter ipt_rpfilter tiny_power_button pinctrl_cannonlake intel_pmc_core acpi_pad button xt_pkttype xt_LOG nf_log_syslog xt_tcpudp nft_compat nf_tables libcrc32c nfnetlink sch_fq_codel uinput ctr atkbd libps2 serio vivaldi_fmap loop cpufreq_powersave tap macvlan veth bridge stp llc tun kvm_intel kvm irqbypass fuse pstore configfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 hid_generic
[Mon Nov  6 20:11:15 2023]  usbhid hid sd_mod ahci libahci xhci_pci xhci_pci_renesas xhci_hcd libata nvme usbcore nvme_core scsi_mod t10_pi crc32c_intel crc64_rocksoft crc64 crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common usb_common scsi_common rtc_cmos dm_mod dax amdgpu i2c_algo_bit drm_ttm_helper ttm agpgart video wmi iommu_v2 drm_buddy gpu_sched drm_display_helper drm_kms_helper syscopyarea sysfillrect sysimgblt drm i2c_core backlight
[Mon Nov  6 20:11:15 2023] CPU: 4 PID: 725464 Comm: kworker/4:1 Tainted: G        W          6.3.8 #1-NixOS
[Mon Nov  6 20:11:15 2023] Hardware name: Gigabyte Technology Co., Ltd. Z390 UD/Z390 UD, BIOS F10 11/05/2021
[Mon Nov  6 20:11:15 2023] Workqueue: events amdgpu_amdkfd_restore_userptr_worker [amdgpu]
[Mon Nov  6 20:11:15 2023] RIP: 0010:amdgpu_amdkfd_restore_userptr_worker+0x520/0x5e0 [amdgpu]
[Mon Nov  6 20:11:15 2023] Code: 00 00 00 00 84 c0 0f 85 74 ff ff ff 8b 95 9c 00 00 00 c7 44 24 08 f5 ff ff ff 85 d2 75 b2 48 c7 c7 1b c9 aa c0 e8 a0 b5 0c eb <0f> 0b eb a2 44 8b 74 24 08 48 8b 6c 24 10 45 85 f6 75 5e 41 c7 44
[Mon Nov  6 20:11:15 2023] RSP: 0000:ffffae6e0445fdf8 EFLAGS: 00010282
[Mon Nov  6 20:11:15 2023] RAX: 0000000000000000 RBX: ffff9d0c892b0fe8 RCX: 0000000000000027
[Mon Nov  6 20:11:15 2023] RDX: ffff9d121db1d5c8 RSI: 0000000000000001 RDI: ffff9d121db1d5c0
[Mon Nov  6 20:11:15 2023] RBP: ffff9d10bad37800 R08: 0000000000000000 R09: 0000000100006c9b
[Mon Nov  6 20:11:15 2023] R10: ffffae6e0445fca0 R11: ffffffffacf1b2c8 R12: ffff9d0c892b1098
[Mon Nov  6 20:11:15 2023] R13: ffff9d0c892b1020 R14: ffff9d10bad37848 R15: ffff9d10bae8a800
[Mon Nov  6 20:11:15 2023] FS:  0000000000000000(0000) GS:ffff9d121db00000(0000) knlGS:0000000000000000
[Mon Nov  6 20:11:15 2023] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Mon Nov  6 20:11:15 2023] CR2: 00007ff1b6b97000 CR3: 0000000105d42002 CR4: 00000000003706e0
[Mon Nov  6 20:11:15 2023] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[Mon Nov  6 20:11:15 2023] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[Mon Nov  6 20:11:15 2023] Call Trace:
[Mon Nov  6 20:11:15 2023]  <TASK>
[Mon Nov  6 20:11:15 2023]  ? __warn+0x84/0x140
[Mon Nov  6 20:11:15 2023]  ? amdgpu_amdkfd_restore_userptr_worker+0x520/0x5e0 [amdgpu]
[Mon Nov  6 20:11:15 2023]  ? report_bug+0x199/0x1b0
[Mon Nov  6 20:11:15 2023]  ? handle_bug+0x42/0x70
[Mon Nov  6 20:11:15 2023]  ? exc_invalid_op+0x18/0x70
[Mon Nov  6 20:11:15 2023]  ? asm_exc_invalid_op+0x1a/0x20
[Mon Nov  6 20:11:15 2023]  ? amdgpu_amdkfd_restore_userptr_worker+0x520/0x5e0 [amdgpu]
[Mon Nov  6 20:11:15 2023]  ? amdgpu_amdkfd_restore_userptr_worker+0x520/0x5e0 [amdgpu]
[Mon Nov  6 20:11:15 2023]  process_one_work+0x1e2/0x3f0
[Mon Nov  6 20:11:15 2023]  ? __pfx_worker_thread+0x10/0x10
[Mon Nov  6 20:11:15 2023]  worker_thread+0x54/0x3a0
[Mon Nov  6 20:11:15 2023]  ? __pfx_worker_thread+0x10/0x10
[Mon Nov  6 20:11:15 2023]  kthread+0xda/0x110
[Mon Nov  6 20:11:15 2023]  ? __pfx_kthread+0x10/0x10
[Mon Nov  6 20:11:15 2023]  ret_from_fork+0x29/0x50
[Mon Nov  6 20:11:15 2023]  </TASK>
[Mon Nov  6 20:11:15 2023] ---[ end trace 0000000000000000 ]---
[Mon Nov  6 20:11:15 2023] ------------[ cut here ]------------
[Mon Nov  6 20:11:15 2023] User pages unexpectedly invalid
[Mon Nov  6 20:11:15 2023] WARNING: CPU: 4 PID: 725464 at drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c:2600 amdgpu_amdkfd_restore_userptr_worker+0x59d/0x5e0 [amdgpu]
[Mon Nov  6 20:11:15 2023] Modules linked in: snd_seq_dummy snd_seq tls rfcomm cmac algif_hash algif_skcipher af_alg bnep btusb btrtl btbcm btintel btmtk bluetooth ecdh_generic ecc exfat uas usb_storage qrtr nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype br_netfilter overlay af_packet xt_CHECKSUM xt_MASQUERADE ipt_REJECT nf_reject_ipv4 nft_chain_nat nf_nat rfkill msr snd_sof_pci_intel_cnl snd_sof_intel_hda_common snd_soc_hdac_hda soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils soundwire_bus snd_soc_skl snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi snd_hda_codec_realtek snd_soc_core snd_hda_codec_generic intel_rapl_msr intel_rapl_common ledtrig_audio snd_compress intel_tcc_cooling snd_hda_codec_hdmi ac97_bus snd_usb_audio snd_pcm_dmaengine snd_hda_intel x86_pkg_temp_thermal intel_powerclamp coretemp snd_intel_dspcfg crc32_pclmul polyval_clmulni snd_intel_sdw_acpi polyval_generic ip6_tables gf128mul
[Mon Nov  6 20:11:15 2023]  snd_hda_codec snd_usbmidi_lib ghash_clmulni_intel sha512_ssse3 sha512_generic snd_rawmidi snd_seq_device aesni_intel snd_hda_core mc libaes cmdlinepart crypto_simd snd_hwdep spi_nor cryptd r8169 iTCO_wdt snd_pcm intel_pmc_bxt watchdog mei_hdcp mei_pxp xt_conntrack rapl ee1004 mtd mfd_core realtek snd_timer mdio_devres nf_conntrack mei_me intel_cstate input_leds evdev snd mousedev led_class joydev nf_defrag_ipv6 i2c_i801 spi_intel_pci mac_hid nf_defrag_ipv4 libphy intel_uncore soundcore gigabyte_wmi i2c_smbus intel_wmi_thunderbolt spi_intel wmi_bmof mei intel_pch_thermal edac_core thermal fan ip6t_rpfilter ipt_rpfilter tiny_power_button pinctrl_cannonlake intel_pmc_core acpi_pad button xt_pkttype xt_LOG nf_log_syslog xt_tcpudp nft_compat nf_tables libcrc32c nfnetlink sch_fq_codel uinput ctr atkbd libps2 serio vivaldi_fmap loop cpufreq_powersave tap macvlan veth bridge stp llc tun kvm_intel kvm irqbypass fuse pstore configfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 hid_generic
[Mon Nov  6 20:11:15 2023]  usbhid hid sd_mod ahci libahci xhci_pci xhci_pci_renesas xhci_hcd libata nvme usbcore nvme_core scsi_mod t10_pi crc32c_intel crc64_rocksoft crc64 crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common usb_common scsi_common rtc_cmos dm_mod dax amdgpu i2c_algo_bit drm_ttm_helper ttm agpgart video wmi iommu_v2 drm_buddy gpu_sched drm_display_helper drm_kms_helper syscopyarea sysfillrect sysimgblt drm i2c_core backlight
[Mon Nov  6 20:11:15 2023] CPU: 4 PID: 725464 Comm: kworker/4:1 Tainted: G        W          6.3.8 #1-NixOS
[Mon Nov  6 20:11:15 2023] Hardware name: Gigabyte Technology Co., Ltd. Z390 UD/Z390 UD, BIOS F10 11/05/2021
[Mon Nov  6 20:11:15 2023] Workqueue: events amdgpu_amdkfd_restore_userptr_worker [amdgpu]
[Mon Nov  6 20:11:15 2023] RIP: 0010:amdgpu_amdkfd_restore_userptr_worker+0x59d/0x5e0 [amdgpu]
[Mon Nov  6 20:11:15 2023] Code: 28 e8 87 d9 14 eb e9 8d fd ff ff 48 c7 c7 39 c9 aa c0 e8 36 b5 0c eb 0f 0b e9 f3 fe ff ff 48 c7 c7 b0 6f a5 c0 e8 23 b5 0c eb <0f> 0b e9 49 fe ff ff 48 c7 c6 80 84 92 c0 48 c7 c7 d0 6f a5 c0 e8
[Mon Nov  6 20:11:15 2023] RSP: 0000:ffffae6e0445fdf8 EFLAGS: 00010282
[Mon Nov  6 20:11:15 2023] RAX: 0000000000000000 RBX: ffff9d0c892b0fe8 RCX: 0000000000000027
[Mon Nov  6 20:11:15 2023] RDX: ffff9d121db1d5c8 RSI: 0000000000000001 RDI: ffff9d121db1d5c0
[Mon Nov  6 20:11:15 2023] RBP: ffff9d0c892b1070 R08: 0000000000000000 R09: 0000000100006cc4
[Mon Nov  6 20:11:15 2023] R10: ffffae6e0445fca0 R11: ffffffffacf1b6a0 R12: ffff9d0c892b1098
[Mon Nov  6 20:11:15 2023] R13: ffff9d0c892b1020 R14: 00000000fffffff5 R15: ffff9d10bae8a800
[Mon Nov  6 20:11:15 2023] FS:  0000000000000000(0000) GS:ffff9d121db00000(0000) knlGS:0000000000000000
[Mon Nov  6 20:11:15 2023] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Mon Nov  6 20:11:15 2023] CR2: 00007ff1b6b97000 CR3: 0000000105d42002 CR4: 00000000003706e0
[Mon Nov  6 20:11:15 2023] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[Mon Nov  6 20:11:15 2023] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[Mon Nov  6 20:11:15 2023] Call Trace:
[Mon Nov  6 20:11:15 2023]  <TASK>
[Mon Nov  6 20:11:15 2023]  ? __warn+0x84/0x140
[Mon Nov  6 20:11:15 2023]  ? amdgpu_amdkfd_restore_userptr_worker+0x59d/0x5e0 [amdgpu]
[Mon Nov  6 20:11:15 2023]  ? report_bug+0x199/0x1b0
[Mon Nov  6 20:11:15 2023]  ? handle_bug+0x42/0x70
[Mon Nov  6 20:11:15 2023]  ? exc_invalid_op+0x18/0x70
[Mon Nov  6 20:11:15 2023]  ? asm_exc_invalid_op+0x1a/0x20
[Mon Nov  6 20:11:15 2023]  ? amdgpu_amdkfd_restore_userptr_worker+0x59d/0x5e0 [amdgpu]
[Mon Nov  6 20:11:15 2023]  ? amdgpu_amdkfd_restore_userptr_worker+0x59d/0x5e0 [amdgpu]
[Mon Nov  6 20:11:15 2023]  process_one_work+0x1e2/0x3f0
[Mon Nov  6 20:11:15 2023]  ? __pfx_worker_thread+0x10/0x10
[Mon Nov  6 20:11:15 2023]  worker_thread+0x54/0x3a0
[Mon Nov  6 20:11:15 2023]  ? __pfx_worker_thread+0x10/0x10
[Mon Nov  6 20:11:15 2023]  kthread+0xda/0x110
[Mon Nov  6 20:11:15 2023]  ? __pfx_kthread+0x10/0x10
[Mon Nov  6 20:11:15 2023]  ret_from_fork+0x29/0x50
[Mon Nov  6 20:11:15 2023]  </TASK>
[Mon Nov  6 20:11:15 2023] ---[ end trace 0000000000000000 ]---
[Mon Nov  6 20:11:15 2023] ------------[ cut here ]------------

Commit

No response

Where did you install darktable from?

distro packaging

darktable version

4.0.0

What OS are you using?

Linux

What is the version of your OS?

NixOS 22.11.4588.93fddcf640c

Describe your system?

Hardware

Software

Are you using OpenCL GPU in darktable?

Yes

If yes, what is the GPU card and driver?

AMD Radeon RX 6600 XT (gfx1032) with the amdgpu driver from Linux kernel 6.3.8

Please provide additional context if applicable. You can attach files too, but might need to rename to .txt or .zip

No response

jenshannoschwalm commented 8 months ago

3 comments:

1) There have been quite a number of fixes for dt 4.4; issues from older versions will not be tested unless there is a hint of a very old persisting bug.
2) Your AMD driver is certainly a problem :-) BTW it often is, and there have been a number of fixes for AMD in the dt code.
3) Please upgrade to 4.4, or even better to git master, then test and possibly report again. Use 'darktable -d pipe -d opencl' and share the log.
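
One way to capture that output in a file that can be attached to the issue is sketched below. The wrapper name `capture_log` and the log file name are made up for illustration; only the darktable command itself comes from the comment above.

```shell
# Run a command and save its combined stdout and stderr to a log file,
# while still showing the output in the terminal.
# Hypothetical helper; first argument is the log file, the rest is the command.
capture_log() {
  logfile=$1
  shift
  "$@" 2>&1 | tee "$logfile"
}
# Usage: capture_log darktable.log darktable -d pipe -d opencl
```

Redirecting stderr as well means messages are caught regardless of which stream they are written to.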

jenshannoschwalm commented 6 months ago

Closing, as there has been no response and the bug was reported against an old version.