hanatos / vkdt

raw photography workflow that sucks less
https://jo.dreggn.org/vkdt
BSD 2-Clause "Simplified" License

Notable latency when dragging sliders #133

Closed by Eemilp 1 month ago

Eemilp commented 4 months ago

I'm experiencing notable latency when dragging sliders in vkdt 0.8.99-67-g6a75ba87 as well as in previous versions. See this video. This is with a 16 Mpix image and the level of detail setting at 1, on an AMD Ryzen 5 PRO 4650U and its iGPU.

I'm aware that this is a bit of a case of "can i speed up rendering on my 2012 on-board GPU?", but darktable and RawTherapee manage to feel much smoother despite their processing being significantly slower.

hanatos commented 4 months ago

> AMD Ryzen 5 PRO 4650U and its iGPU.

ouch.

LOD<=1 is the highest possible, i.e. it will always render full res. did you try to set to at least 2?

also you're probably aware that these sliders are nuklear properties, i.e. the mouse doesn't directly correspond to the grey bar but has extended precision by using the whole screen as range.
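The extended precision described above can be pictured as spreading the slider's value range over the width of the screen instead of the width of the grey bar, so the same mouse movement changes the value by a smaller amount. The following is a minimal sketch under that assumption; `slider_t`, `slider_drag` and `range_px` are hypothetical names for illustration and are not taken from nuklear or vkdt.

```c
/* Hypothetical sketch, not nuklear's or vkdt's actual code. */
typedef struct {
  float min, max;   /* value range of the slider */
  float value;      /* current value */
} slider_t;

static float clampf(float x, float lo, float hi)
{
  return x < lo ? lo : (x > hi ? hi : x);
}

/* Update the slider from a horizontal mouse movement of dx pixels.
 * range_px is the pixel distance that spans min..max: the width of the
 * grey bar for a direct mapping, or the screen width for extended precision. */
static void slider_drag(slider_t *s, float dx, float range_px)
{
  float per_pixel = (s->max - s->min) / range_px;
  s->value = clampf(s->value + dx * per_pixel, s->min, s->max);
}
```

With a 256 pixel wide bar, a 128 pixel drag moves a 0..1 slider by half its range. With a 2048 pixel wide screen as the range, the same drag moves it by one sixteenth, which is why the mouse pointer and the grey bar do not line up.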

i don't really feel like optimising for devices below a certain level (other than the already existing LOD switch). especially i don't want to implement a complicated monster as for darktable again (cropped and downsized pixel pipelines running at the same time and asynchronously, sync nightmares included). seems it's not very hard to jump over a threshold of hardware capability to make this unnecessary (bringing only bloat, code complexity, and in fact slowdown).

Eemilp commented 3 months ago

> LOD<=1 is the highest possible, i.e. it will always render full res. did you try to set to at least 2?

Yes, at 2 it's quite usable already, but I do like zooming into my images.

> also you're probably aware that these sliders are nuklear properties, i.e. the mouse doesn't directly correspond to the grey bar but has extended precision by using the whole screen as range.

Somehow I managed to miss this, thanks!

> i don't really feel like optimising for devices below a certain level (other than the already existing LOD switch). especially i don't want to implement a complicated monster as for darktable again (cropped and downsized pixel pipelines running at the same time and asynchronously, sync nightmares included). seems it's not very hard to jump over a threshold of hardware capability to make this unnecessary (bringing only bloat, code complexity, and in fact slowdown).

Makes sense. I guess it would be nice for the pixel pipeline to not block the GUI, but that is added complexity of course. This GPU is just too weak for processing in real time despite plenty of VRAM.

At least I now have another excuse to get a new PC :)

hanatos commented 3 months ago

i think i may change my mind wrt older or weaker devices. new gpus have ridiculous prices. might look into async processing, could be not too much trouble.

hanatos commented 3 months ago

pushed some async compute in the background, redrawing ui at different frequency. let me know if that works for you. maybe you want to also set the frame_limiter in ~/.config/vkdt/config.rc so it doesn't refresh at idiotically high frame rates.
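The idea of computing in the background while the UI redraws at its own frequency can be illustrated with a small deterministic model. This is only a sketch under simplifying assumptions (one render in flight at a time, time stepped in whole milliseconds); `simulate` and `sim_result_t` are hypothetical names and this is not how vkdt implements it.

```c
/* Hypothetical model, not vkdt's implementation: the UI redraws on its own
 * period while a slower render job runs in the background. */
typedef struct {
  int ui_frames;      /* number of UI redraws */
  int renders_done;   /* number of finished image renders */
} sim_result_t;

static sim_result_t simulate(int total_ms, int ui_period_ms, int render_ms)
{
  sim_result_t r = {0, 0};
  int render_end = -1;            /* time the running render finishes, -1 if idle */
  for (int t = 0; t < total_ms; t++)
  {
    /* a render that finishes at this time step frees the pipeline */
    if (render_end == t) { r.renders_done++; render_end = -1; }
    if (t % ui_period_ms == 0)
    {
      r.ui_frames++;              /* the UI always redraws on schedule */
      if (render_end < 0)         /* start a new render only when idle */
        render_end = t + render_ms;
    }
  }
  return r;
}
```

Calling `simulate(1000, 16, 100)` models one second with a 16 ms UI period and a 100 ms render. The UI keeps redrawing on every tick regardless of the render time, whereas a blocking design would redraw only about once per finished render. The image itself still updates only as often as renders complete, so the interface feels more fluid without the processing getting any faster.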

Eemilp commented 3 months ago

Tested by building the latest master (463d255). It still stutters a bit, but it's a definite improvement! (Here is a clip.) Adding a frame limiter indeed helps too: in my case 16 is a small improvement, and around 50 seems to make the mouse feel smooth. Overall a big improvement for usability. Playing around a bit, I also noticed that reducing the sensitivity of the sliders made interaction easier despite the stuttering.

hanatos commented 2 months ago

okay.. this sounds like there is still some interference between the various "clock domains" involved here. i suppose it's time to reimplement the synchronisation with timeline semaphores here.

hanatos commented 2 months ago

the new git version comes with a reimplementation of all the synchronisation code via timeline semaphores. they are simply the better concept, this was actually fun to do. in my limited tests this seems to work well for still images, videos, stopped videos, and interactive rendering with grabbed mouse/keyboard (quake).
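For readers unfamiliar with the concept: a timeline semaphore carries a monotonically increasing 64-bit counter instead of a binary signalled/unsignalled state. In Vulkan 1.2 the host-side entry points are `vkSignalSemaphore`, `vkWaitSemaphores` and `vkGetSemaphoreCounterValue`. The block below is a CPU-side toy model of those semantics, not Vulkan code and not vkdt's synchronisation code; `timeline_t`, `timeline_signal` and `timeline_reached` are hypothetical names.

```c
#include <stdint.h>

/* CPU-side toy model of a timeline semaphore; not Vulkan code. */
typedef struct { uint64_t counter; } timeline_t;

/* Signal: advance the counter to value. The value has to be strictly
 * greater than the current one. Returns 0 on success, -1 if rejected. */
static int timeline_signal(timeline_t *t, uint64_t value)
{
  if (value <= t->counter) return -1;
  t->counter = value;
  return 0;
}

/* Poll: has the timeline reached at least value? A real wait would block
 * until this becomes true; the model only checks. */
static int timeline_reached(const timeline_t *t, uint64_t value)
{
  return t->counter >= value;
}
```

One way to use such a counter is to let the compute side signal the number of each finished frame, while the display side checks whether the frame it wants to show has been reached. Because later values imply all earlier ones, a single object can order many submissions.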

if you'd like to test again i'd be interested how it works on the lower end of hardware. for smooth ui rendering i'd certainly dial down the frame_limiter to 16ms (60Hz) or even 6ms (160Hz) depending on the refresh rate supported by the screen.
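The relation between a frame time in milliseconds and a refresh rate is simple arithmetic, sketched below. The helper names are hypothetical, and rounding down to whole milliseconds is an assumption based on the values quoted in this thread, not a documented property of `frame_limiter`.

```c
/* Refresh rate in Hz that corresponds to a frame time in milliseconds. */
static double frame_ms_to_hz(double ms)
{
  return 1000.0 / ms;
}

/* Longest whole-millisecond frame time that still keeps up with a screen
 * refreshing at hz. Truncation rounds down, so the limiter never asks for
 * fewer frames than the screen can show. */
static int hz_to_frame_ms(double hz)
{
  return (int)(1000.0 / hz);
}
```

By this arithmetic 16 ms corresponds to 62.5 Hz, which is why it pairs with a 60 Hz screen, and 6 ms corresponds to roughly 167 Hz, which suits screens in the 144 to 165 Hz range.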

Eemilp commented 2 months ago

Unfortunately the performance seems about the same on my hardware (clip). Here the frame limiter is 16ms.

hanatos commented 1 month ago

hm okay of course async compute doesn't make anything go faster.. it will only make things feel slightly more fluid on almost acceptable hardware. for instance i tried this on a macbook air from 2018 with an onboard intel, which is well below acceptable (workable only with LOD=3). i tried on an MSI notebook with an nvidia GTX 1650, which falls into the "acceptable" category.

closing this particular issue as completed, let's make a new issue if there are other ideas to improve things for older hardware.

Eemilp commented 1 month ago

Sounds very reasonable. I've been using vkdt like this for now and it's certainly usable, despite not being fast.

Thanks :)