hanatos / vkdt

raw photography workflow that sucks less
https://vkdt.org
BSD 2-Clause "Simplified" License

Add support for Mac OS #87

Open LucaZulberti opened 9 months ago

LucaZulberti commented 9 months ago

Dear all, I appreciate your work here and would like to try porting vkDT to Mac OS.

I'm not an experienced Mac OS developer; I want to give it a try to use the software on my MacBook.

My setup:

Dependency (to improve the list):

ToDo:

So far, the commits are just a quick attempt to make it work. I know there is a lot of rubbish in there from a programmer's point of view 😄

Thank you again for this software!

hanatos commented 9 months ago

heya,

nice, thanks for jumping onto this. as a general comment, the code with these patches applied will probably not run on my machine any more (or on macintosh with intel cpu?). also i want to limit platform specific code to very few places to make our life easier in the future. let me leave a few comments interleaved in the patch.

hanatos commented 9 months ago

.. about rawspeed. does the vanilla rawspeed git build on M1? i would be surprised if not. anyways it's probably better to use upstream rawspeed for changes.

about the blank screen. did you run vkdt -d all and does it give any useful hints? maybe it fails to detect paths to dsos for dynamic loading?

LucaZulberti commented 9 months ago

Here I am!

If you are interested, I'm working on making it compile with CMake, which handles more of the platform-specific checks. Next, I wanted to play with ImGUI to build the GUI and integrate it in Ansel in the long (very long?) run.

I also updated ImGUI dependency (I cherry-picked your modifications) to check for the blank screen problem. As before, I'm going to investigate more and then make the changes more solid.

Thank you for your feedback!

LucaZulberti commented 9 months ago

With the new imgui this is the error:

$ mkdir build
$ cd build
$ cmake .. && make
$ cd ../bin
$ cp ../build/vkdt* ../build/libvkdt.* ./
$ ln -s ../build/modules ./
$ ./vkdt-gui -d err -d all -D perf ~/Pictures/Temp/ProvaDT/AND_092.dng 
[gui] glfwGetVersionString() : 3.3.8 Cocoa NSGL EGL OSMesa
[gui] monitor [0] DELL U2721DE at 0 0
[gui] vk extension required by GLFW:
[gui]   VK_KHR_surface
[gui]   VK_EXT_metal_surface
[qvk] dev 0: vendorid 0x106b
[qvk] dev 0: Apple M1
[qvk] max number of allocations 1073741824
[qvk] max image allocation size 16384 x 16384
[qvk] max uniform buffer range 4294967295
[qvk] num queue families: 4
[qvk] picked device 0 without ray tracing and without float atomics support
[qvk] available surface formats:
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] colour space: 0
[gui] no joysticks found
[gui] no display profile file display.DELL U2721DE, using sRGB!
[gui] no display profile file display.DELL U2721DE, using sRGB!
[qvk] available surface formats:
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] B8G8R8A8_UNORM
[qvk] B8G8R8A8_SRGB
[qvk] R16G16B16A16_SFLOAT
[qvk] A2B10G10R10_UNORM_PACK32
[qvk] A2R10G10B10_UNORM_PACK32
[qvk] colour space: 0
Assertion failed: (font_cfg->SizePixels > 0.0f), function AddFont, file imgui_draw.cpp, line 2131.
an error occurred while trying to execute gdb.please check if gdb is installed on your system.
backtrace written to /tmp/vkdt-bt-95883.txt
recovery data written to /tmp/vkdt-crash-recovery.*

Apparently, it cannot set the WindowSize properly... I saw you modified imgui backends for that. In my plans I would like to use ImGui:: instead of lower level functions from glfw. It will be a longer process, but I think it could help in having something portable in the long run.

LucaZulberti commented 9 months ago

Updates on error state.

I hard-coded the font size to 16.0 so that I could move on.
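A less invasive workaround than a hard-coded literal might be to guard the computed size. This is a minimal sketch in plain C; `safe_font_size` is a hypothetical helper and not part of vkdt or ImGui, and how vkdt derives the font size is not shown in this thread.

```c
#include <assert.h>

/* Hypothetical helper: fall back to a known-good size when the computed
 * font size is zero, negative or NaN, which would trip the
 * SizePixels > 0.0f assertion seen in the log above. */
static float safe_font_size(float requested_px, float fallback_px)
{
  assert(fallback_px > 0.0f);
  /* a NaN compares false here, so it also selects the fallback */
  return requested_px > 0.0f ? requested_px : fallback_px;
}
```

Calling `safe_font_size(computed, 16.0f)` keeps the computed value on platforms where it is valid and only substitutes 16.0 where it is not.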

Now blocked at:

$ ./vkdt-gui -d err -d all -D qvk -D perf ~/Pictures/Temp/ProvaDT/AND_092.dng
[gui] glfwGetVersionString() : 3.3.8 Cocoa NSGL EGL OSMesa
[gui] monitor [0] DELL U2721DE at 0 0
[gui] vk extension required by GLFW:
[gui]   VK_KHR_surface
[gui]   VK_EXT_metal_surface
[gui] no joysticks found
[gui] no display profile file display.DELL U2721DE, using sRGB!
[gui] no display profile file display.DELL U2721DE, using sRGB!
[db] allocating 1024.0 MB for thumbnails
[i-bc1] (null): can't open file!
[i-bc1] (null): wrong magic number or version!
[i-bc1] (null): can't open file!
[i-bc1] (null): wrong magic number or version!
[ERR] [thm] failed to run first half of graph!
[ERR] could not load required thumbnail symbols!
[ERR] image could not be loaded!

I think this is related to how the dynamic libraries of the modules are compiled. I will check compilation flags and other differences with Linux .so files.

hanatos commented 9 months ago

okay is that error any different to the older imgui at all?

not sure i understand how you would put imgui into a gtk3/cairo application. as additional dependency?

maybe a saner approach is to use libvkdt.so in an external application, if you want to backport the processing to legacy gui library stacks? or do you want to rewrite ansel in imgui?

about dynamic libraries. i saw you link -lvkdt in the makefile. why? normally the library is not even built.

there may also be an issue about setting file paths to discover additional data/config files/dsos.

i'm highly unlikely to merge cmake build system patches. cmake doesn't do anything for me (it generates makefiles, i have makefiles already), i'm unconvinced it does much for platform independence except mostly a big if around stuff. all variables related to local build environments should go in config.mk and config.mk.defaults instead.

re:imgui vs glfw functions: some code is c, some is c++ (as minimal as possible, only gui related and interfacing with rawspeed/exiv2). glfw is c, so there are some constraints as to when you can swap things around.

LucaZulberti commented 9 months ago

> okay is that error any different to the older imgui at all?

I'm going to investigate the cause in the following days.

> not sure i understand how you would put imgui into a gtk3/cairo application. as additional dependency? maybe a saner approach is to use libvkdt.so in an external application, if you want to backport the processing to legacy gui library stacks? or do you want to rewrite ansel in imgui?

I would like essential ImGui components to control the processing on a single image, then extend for library management like in Ansel. Yes, I would like to get rid of GTK. For sure I will link with libvkdt to keep QVK stuff as it is.

> about dynamic libraries. i saw you link -lvkdt in the makefile. why? normally the library is not even built.

That was a bad patch from when I was still ramping up on the project's compilation flow; I will force-push a fix in the next few days.

> there may also be an issue about setting file paths to discover additional data/config files/dsos.

Thank you for the hint, I will investigate this too.

> i'm highly unlikely to merge cmake build system patches. cmake doesn't do anything for me (it generates makefiles, i have makefiles already), i'm unconvinced it does much for platform independence except mostly a big if around stuff. all variables related to local build environments should go in config.mk and config.mk.defaults instead.

I will keep it separated. I found it more convenient than hand-written Makefiles for the long-term, but it could be optional for this work. For example, in shared libraries, it automatically handles all the different flags for other OSes. I will continue to modify the Makefile-based compilation flow too.

> re:imgui vs glfw functions: some code is c, some is c++ (as minimal as possible, only gui related and interfacing with rawspeed/exiv2). glfw is c, so there are some constraints as to when you can swap things around.

Here, I need more experience in ImGui development to be helpful. I noticed the patch about setDisplayProfile() that I assume is very important for the workflow. This should be considered in any GUI-based application using the libvkdt, right? I will postpone these long-term decisions until I can run this tool on macOS.

LucaZulberti commented 9 months ago

Updated with only Makefile modifications. Cleaned ext compilation with brew llvm. Fixed some flat.mk in src/pipe/modules; @hanatos can you check these? I need those dependencies with .o files.

Now the error is:

$ ./vkdt -d err -d all -D qvk -D perf ~/Pictures/Temp/ProvaDT/AND_092.dng
[gui] glfwGetVersionString() : 3.3.8 Cocoa NSGL EGL OSMesa dynamic
[gui] monitor [0] DELL U2721DE at 0 0
[gui] vk extension required by GLFW:
[gui]   VK_KHR_surface
[gui]   VK_EXT_metal_surface
[gui] no joysticks found
[gui] no display profile file display.DELL U2721DE, using sRGB!
[gui] no display profile file display.DELL U2721DE, using sRGB!
[db] allocating 1024.0 MB for thumbnails
[ERR] [thm] failed to run first half of graph!
[ERR] could not load required thumbnail symbols!
[ERR] image could not be loaded!

Same error with updated ImGui (HEAD) and old version (HEAD~2).

hanatos commented 9 months ago

looks like you're lacking -rdynamic or the equivalent?

LucaZulberti commented 9 months ago

The Makefiles are now cleaner.

Next error:

$ ./vkdt -d err -d all -D qvk -D perf ~/Pictures/Temp/ProvaDT/AND_092.dng
[gui] glfwGetVersionString() : 3.3.8 Cocoa NSGL EGL OSMesa dynamic
[gui] monitor [0] DELL U2721DE at 0 0
[gui] vk extension required by GLFW:
[gui]   VK_KHR_surface
[gui]   VK_EXT_metal_surface
[gui] no joysticks found
[gui] no display profile file display.DELL U2721DE, using sRGB!
[gui] no display profile file display.DELL U2721DE, using sRGB!
[db] allocating 1024.0 MB for thumbnails
[mem] images : peak rss 0.00012207 MB vmsize 0.00012207 MB
[mem] buffers: peak rss 0 MB vmsize 0 MB
[mem] staging: peak rss 0.000244141 MB vmsize 0.000244141 MB
[mem] images : peak rss 0.00012207 MB vmsize 0.00012207 MB
[mem] buffers: peak rss 0 MB vmsize 0 MB
[mem] staging: peak rss 0.000244141 MB vmsize 0.000244141 MB
[mvk-error] VK_ERROR_INITIALIZATION_FAILED: Shader library compile failed (Error code 3):
program_source:104:14: error: expected unqualified-id
float kernel(thread const float3& ci, thread const float3& p)
             ^
program_source:104:14: error: expected ')'
program_source:104:13: note: to match this '('
float kernel(thread const float3& ci, thread const float3& p)
            ^
program_source:285:42: error: expected expression
            co += (params.rbf_c[i].xyz * kernel(param_8, param_9));
                                         ^
.
[mvk-error] VK_ERROR_INVALID_SHADER_NV: Compute shader function could not be compiled into pipeline. See previous logged error.
[mvk-error] VK_ERROR_INITIALIZATION_FAILED: Shader library compile failed (Error code 3):
program_source:104:14: error: expected unqualified-id
float kernel(thread const float3& ci, thread const float3& p)
             ^
program_source:104:14: error: expected ')'
program_source:104:13: note: to match this '('
float kernel(thread const float3& ci, thread const float3& p)
            ^
program_source:285:42: error: expected expression
            co += (params.rbf_c[i].xyz * kernel(param_8, param_9));
                                         ^
.
[mvk-error] VK_ERROR_INVALID_SHADER_NV: Compute shader function could not be compiled into pipeline. See previous logged error.
[db] [thm] running the thumbnail graph failed on image '/Users/luca/Pictures/Temp/ProvaDT/AND_092.dng.cfg'!

Same error for old and new ImGui+GLFW. Maybe it is related to the MoltenVK implementation of Vulkan on macOS?

hanatos commented 9 months ago

the compilation issue: could it be that kernel is a reserved keyword in metal? maybe try to rename it in the vkdt glsl to something else (kern for instance)?
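The collision can be caught before the shader ever reaches MoltenVK. The sketch below, in plain C, checks an identifier against a short list of Metal Shading Language keywords that a GLSL shader might use as ordinary names. The list is deliberately partial and the helper name `collides_with_msl_keyword` is hypothetical; nothing like it is claimed to exist in vkdt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Partial, illustrative list of Metal Shading Language keywords.
 * It is not exhaustive. */
static const char *const msl_reserved[] = {
  "kernel", "vertex", "fragment",                /* function qualifiers      */
  "device", "constant", "thread", "threadgroup"  /* address space qualifiers */
};

/* Returns true if `name` matches one of the keywords listed above. */
static bool collides_with_msl_keyword(const char *name)
{
  assert(name != NULL);
  for (size_t i = 0; i < sizeof(msl_reserved) / sizeof(msl_reserved[0]); i++)
    if (strcmp(name, msl_reserved[i]) == 0)
      return true;
  return false;
}
```

With this list, `kernel` is flagged while a replacement such as `kern` passes.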

LucaZulberti commented 9 months ago

Renaming resolved it.

Now, errors come from the validation step while creating shaders from .spv files.

First one:

[mvk-error] VK_ERROR_INITIALIZATION_FAILED: Shader library compile failed (Error code 3):
program_source:85:390: error: 'sampler' attribute parameter is out of bounds: must be between 0 and 15
kernel void main0(constant push_t& push [[buffer(0)]], texture2d<float> img_coarse [[texture(0)]], array<texture2d<float>, 11> img_l0 [[texture(11)]], array<texture2d<float>, 11> img_l1 [[texture(22)]], texture2d<float, access::write> img_out [[texture(33)]], sampler img_coarseSmplr [[sampler(0)]], array<sampler, 11> img_l0Smplr [[sampler(11)]], array<sampler, 11> img_l1Smplr [[sampler(22)]], uint3 gl_GlobalInvocationID [[thread_position_in_grid]])

The second one (after launching the application within the MoltenVk configurator):

VUID-VkPipelineLayoutCreateInfo-descriptorType-03020(ERROR / SPEC): msgNum: -696627467 - Validation Error: [ VUID-VkPipelineLayoutCreateInfo-descriptorType-03020 ] | MessageID = 0xd67a4ef5 | vkCreatePipelineLayout():
max per-stage storage image bindings count (11) exceeds device maxPerStageDescriptorStorageImages limit (8).
The Vulkan spec states: The total number of descriptors in descriptor set layouts created without the 
VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT bit set with a descriptorType of
VK_DESCRIPTOR_TYPE_STORAGE_IMAGE, and VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER accessible to
any given shader stage across all elements of pSetLayouts must be less than or equal to 
VkPhysicalDeviceLimits::maxPerStageDescriptorStorageImages (https://vulkan.lunarg.com/doc/view/1.3.261.1/mac/1.3-extensions/vkspec.html#VUID-VkPipelineLayoutCreateInfo-descriptorType-03020)
    Objects: 0

I'm not familiar with Vulkan yet (I'm going to follow the vulkan-tutorial), but this seems related to missing checks against the device limits (VkPhysicalDeviceLimits::*); I see this set of limits is not checked in the code. Could the solution be to load more device properties inside the QVK module and expose this information about the underlying device in the qvk structure?

At this point, I will pause the debugging while I learn more about Vulkan programming, so that I can debug these problems properly.

Meanwhile, if you have any hints, I can implement them and see what happens.
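The kind of limit check suggested above could look roughly like this. It is a sketch in plain C with hypothetical struct and function names, not vkdt code. In a real Vulkan program the limit values would be read from VkPhysicalDeviceLimits, filled in by vkGetPhysicalDeviceProperties. The storage image numbers (11 requested, limit 8) come from the validation message above; the sampler limit of 16 is an assumption inferred from the "between 0 and 15" Metal error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the per-stage device limits. */
typedef struct device_limits_t
{
  uint32_t max_storage_images_per_stage; /* maxPerStageDescriptorStorageImages */
  uint32_t max_samplers_per_stage;       /* maxPerStageDescriptorSamplers      */
} device_limits_t;

/* Hypothetical per-pipeline requirements, counted from its bindings. */
typedef struct pipeline_needs_t
{
  uint32_t storage_images;
  uint32_t samplers;
} pipeline_needs_t;

/* Returns true if the pipeline stays within both per-stage limits. */
static bool pipeline_fits_device(const device_limits_t *lim,
                                 const pipeline_needs_t *need)
{
  assert(lim != NULL && need != NULL);
  return need->storage_images <= lim->max_storage_images_per_stage
      && need->samplers       <= lim->max_samplers_per_stage;
}
```

Running such a check when a module is loaded would let the application report an unsupported module by name instead of failing later inside vkCreatePipelineLayout.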

hanatos commented 9 months ago

uh, it seems the M1/M2 is quite limited when it comes to the number of storage images you can use at the same time (it's 8 as compared to 1048576 on nvidia). is this the llap module? with storage buffers (porting that over would be possible) it's 31.. which is maybe enough for llap, but also the sampled textures are not nearly enough for the more texture intense modules (e.g. quake). looking for some macintosh documentation i'm not exactly sure this limit transferred correctly from metal render targets to vulkan compute shader storage images, but i know nothing about apple metal.

LucaZulberti commented 9 months ago

Thank you for the hints! Let's say I will come back here on the issue when I understand more about the Vulkan API, the shaders, and everything. Compilation issues were 1% of the problems.... 😅

LDAP commented 9 months ago

It seems that enabling argument buffers increases the limits significantly.

Argument buffers can be enabled either in the MVKConfiguration struct or with the MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS environment variable.

See https://github.com/KhronosGroup/MoltenVK/issues/1610

and https://github.com/KhronosGroup/MoltenVK/blob/aed91cb5631e112faa521fe5739935424b9a4675/MoltenVK/MoltenVK/API/mvk_config.h#L894
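For the environment variable route, the application could also set the variable itself before it touches Vulkan. This is a minimal sketch in C using POSIX setenv; it assumes that MoltenVK reads the variable when the Vulkan instance is created, and that any value other than "0" counts as enabled. Both function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200112L /* for setenv */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MVK_ARGBUF_ENV "MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS"

/* Call this before creating the Vulkan instance. Returns 0 on success.
 * overwrite = 0 means a value already set in the shell takes precedence. */
static int request_metal_argument_buffers(void)
{
  int rc = setenv(MVK_ARGBUF_ENV, "1", 0);
  if (rc == 0)
    assert(getenv(MVK_ARGBUF_ENV) != NULL); /* sanity check after success */
  return rc;
}

/* Assumption: a non-empty value other than "0" means enabled. */
static bool metal_argument_buffers_requested(void)
{
  const char *v = getenv(MVK_ARGBUF_ENV);
  return v != NULL && v[0] != '\0' && strcmp(v, "0") != 0;
}
```

Setting the variable on the command line, as in `MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS=1 ./vkdt`, has the same effect without any code change.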

LucaZulberti commented 9 months ago

Thank you for the hint, @LDAP!

I rebased my PR on the latest changes. I moved stuff in config.defaults.mk to check if the target is OSX.

Now I'm stuck on:

$ MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS=1 ./vkdt -d err -d all -D perf ~/Downloads/Zulberti.jpg
[gui] glfwGetVersionString() : 3.3.8 Cocoa NSGL EGL OSMesa dynamic
[gui] monitor [0] DELL U2721DE at 0 0
[gui] vk extension required by GLFW:
[gui]   VK_KHR_surface
[gui]   VK_EXT_metal_surface
[qvk] dev 0: vendorid 0x106b
[qvk] dev 0: Apple M1
[qvk] max number of allocations 1073741824
[qvk] max image allocation size 16384 x 16384
[qvk] max uniform buffer range 4294967295
[qvk] num queue families: 4
[qvk] picked device 0 without ray tracing and without float atomics support
[gui] no joysticks found
[gui] no display profile file display.DELL U2721DE, using sRGB!
[gui] no display profile file display.DELL U2721DE, using sRGB!
[db] allocating 1024.0 MB for thumbnails
[mem] images : peak rss 0.00012207 MB vmsize 0.00012207 MB
[mem] buffers: peak rss 0 MB vmsize 0 MB
[mem] staging: peak rss 0.000244141 MB vmsize 0.000244141 MB
[mem] images : peak rss 0.00012207 MB vmsize 0.00012207 MB
[mem] buffers: peak rss 0 MB vmsize 0 MB
[mem] staging: peak rss 0.000244141 MB vmsize 0.000244141 MB
[qvk] error VK_ERROR_OUT_OF_POOL_MEMORY executing vkAllocateDescriptorSets(qvk.device, &dset_info_u, node->uniform_dset+1)!
[qvk] error VK_ERROR_OUT_OF_POOL_MEMORY executing alloc_outputs3(graph, graph->node+nodeid[i])!
[qvk] error VK_ERROR_OUT_OF_POOL_MEMORY executing vkAllocateDescriptorSets(qvk.device, &dset_info_u, node->uniform_dset+1)!
[qvk] error VK_ERROR_OUT_OF_POOL_MEMORY executing alloc_outputs3(graph, graph->node+nodeid[i])!
[db] [thm] running the thumbnail graph export failed on image '/Users/luca/Downloads/Zulberti.jpg.cfg'!

Could this be due to the .descriptorCount in graph.c:2493?

    VkDescriptorPoolSize pool_sizes[] = {{
      .type            = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER,
      .descriptorCount = 1+DT_GRAPH_MAX_FRAMES*graph->dset_cnt_image_read,
    }, {
      .type            = VK_DESCRIPTOR_TYPE_STORAGE_IMAGE,
      .descriptorCount = 1+DT_GRAPH_MAX_FRAMES*graph->dset_cnt_image_write,
    }, {
      .type            = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
      .descriptorCount = 1+DT_GRAPH_MAX_FRAMES*graph->dset_cnt_buffer,
    }, {
      .type            = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER,
      .descriptorCount = 1+DT_GRAPH_MAX_FRAMES*(graph->num_nodes+graph->dset_cnt_uniform),
    }, {
      .type            = qvk.raytracing_supported ?
        VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_KHR :
        VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, // dummy for validation layer
      .descriptorCount = DT_GRAPH_MAX_FRAMES,
    }};
LucaZulberti commented 9 months ago

Update

It seems that after the rebase the max sampler problem is gone (or is this new problem just hiding it?).

I have added some dt_log() to debug:

$ ./vkdt -d err -d all -D perf ~/Downloads/Zulberti.jpg
[gui] glfwGetVersionString() : 3.3.8 Cocoa NSGL EGL OSMesa dynamic
[gui] monitor [0] DELL U2721DE at 0 0
[gui] vk extension required by GLFW:
[gui]   VK_KHR_surface
[gui]   VK_EXT_metal_surface
[qvk] dev 0: vendorid 0x106b
[qvk] dev 0: Apple M1
[qvk] max number of allocations 1073741824
[qvk] max image allocation size 16384 x 16384
[qvk] max uniform buffer range 4294967295
[qvk] num queue families: 4
[qvk] picked device 0 without ray tracing and without float atomics support
[gui] no joysticks found
[gui] no display profile file display.DELL U2721DE, using sRGB!
[gui] no display profile file display.DELL U2721DE, using sRGB!
[db] allocating 1024.0 MB for thumbnails
[pipe] Graph 0x102d40260, run = 3
[pipe] Graph 0x102d40260, run = fffffffc
[pipe] Created DescriptorPool for maxSets = 12 (0x12a84a800)
[pipe]   Pool size [0]: type = 1, descriptorCount = 2
[pipe]   Pool size [1]: type = 3, descriptorCount = 2
[pipe]   Pool size [2]: type = 7, descriptorCount = 1
[pipe]   Pool size [3]: type = 6, descriptorCount = 8
[pipe]   Pool size [4]: type = 7, descriptorCount = 2
[pipe] alloc_outputs3 for graph 0x102d40260, node = 0x118840000, i = 0
[pipe] alloc_outputs3 for graph 0x102d40260, node = 0x1188418d0, i = 1
[pipe] Allocating 2 node->dset DescriptorSets from pool 0x12a84a800
[pipe] Allocating 1 node->uniform_dset[0] DescriptorSets from pool 0x12a84a800
[pipe] Allocating 1 node->uniform_dset[1] DescriptorSets from pool 0x12a84a800
[mem] images : peak rss 0.00012207 MB vmsize 0.00012207 MB
[mem] buffers: peak rss 0 MB vmsize 0 MB
[mem] staging: peak rss 0.000244141 MB vmsize 0.000244141 MB
[pipe] Graph 0x102d40260, run = 3
[pipe] Graph 0x102d40260, run = fffffffc
[pipe] alloc_outputs3 for graph 0x102d40260, node = 0x118840000, i = 0
[pipe] alloc_outputs3 for graph 0x102d40260, node = 0x1188418d0, i = 1
[pipe] Allocating 2 node->dset DescriptorSets from pool 0x12a84a800
[pipe] Allocating 1 node->uniform_dset[0] DescriptorSets from pool 0x12a84a800
[pipe] Allocating 1 node->uniform_dset[1] DescriptorSets from pool 0x12a84a800
[mem] images : peak rss 0.00012207 MB vmsize 0.00012207 MB
[mem] buffers: peak rss 0 MB vmsize 0 MB
[mem] staging: peak rss 0.000244141 MB vmsize 0.000244141 MB

[pipe] Graph 0x102d3eb90, run = ffffffff

[pipe] Created DescriptorPool for maxSets = 60 (0x12b010a00)
[pipe]   Pool size [0]: type = 1, descriptorCount = 24
[pipe]   Pool size [1]: type = 3, descriptorCount = 14
[pipe]   Pool size [2]: type = 7, descriptorCount = 1
[pipe]   Pool size [3]: type = 6, descriptorCount = 22
[pipe]   Pool size [4]: type = 7, descriptorCount = 2
[pipe] alloc_outputs3 for graph 0x102d3eb90, node = 0x128000000, i = 0
[pipe] alloc_outputs3 for graph 0x102d3eb90, node = 0x1280018d0, i = 1
[pipe] Allocating 2 node->dset DescriptorSets from pool 0x12b010a00
[pipe] Allocating 1 node->uniform_dset[0] DescriptorSets from pool 0x12b010a00
[pipe] Allocating 1 node->uniform_dset[1] DescriptorSets from pool 0x12b010a00
[pipe] alloc_outputs3 for graph 0x102d3eb90, node = 0x1280031a0, i = 2
[pipe] Allocating 2 node->dset DescriptorSets from pool 0x12b010a00
[pipe] Allocating 1 node->uniform_dset[0] DescriptorSets from pool 0x12b010a00
[pipe] Allocating 1 node->uniform_dset[1] DescriptorSets from pool 0x12b010a00
[pipe] alloc_outputs3 for graph 0x102d3eb90, node = 0x128004a70, i = 3
[pipe] Allocating 2 node->dset DescriptorSets from pool 0x12b010a00
[pipe] Allocating 1 node->uniform_dset[0] DescriptorSets from pool 0x12b010a00
[pipe] Allocating 1 node->uniform_dset[1] DescriptorSets from pool 0x12b010a00
[pipe] alloc_outputs3 for graph 0x102d3eb90, node = 0x128006340, i = 4
[pipe] Allocating 2 node->dset DescriptorSets from pool 0x12b010a00
[pipe] Allocating 1 node->uniform_dset[0] DescriptorSets from pool 0x12b010a00
[pipe] Allocating 1 node->uniform_dset[1] DescriptorSets from pool 0x12b010a00
[pipe] alloc_outputs3 for graph 0x102d3eb90, node = 0x128007c10, i = 5
[pipe] Allocating 2 node->dset DescriptorSets from pool 0x12b010a00
[pipe] Allocating 1 node->uniform_dset[0] DescriptorSets from pool 0x12b010a00
[pipe] Allocating 1 node->uniform_dset[1] DescriptorSets from pool 0x12b010a00
[pipe] alloc_outputs3 for graph 0x102d3eb90, node = 0x1280094e0, i = 6
[pipe] Allocating 2 node->dset DescriptorSets from pool 0x12b010a00
[pipe] Allocating 1 node->uniform_dset[0] DescriptorSets from pool 0x12b010a00
[pipe] Allocating 1 node->uniform_dset[1] DescriptorSets from pool 0x12b010a00
[qvk] error VK_ERROR_OUT_OF_POOL_MEMORY executing vkAllocateDescriptorSets(qvk.device, &dset_info_u, node->uniform_dset+1)!
[qvk] error VK_ERROR_OUT_OF_POOL_MEMORY executing alloc_outputs3(graph, graph->node+nodeid[i])!

[pipe] Graph 0x102d42448, export = 0x16d265918
[pipe] Destroying DescriptorPool 0x12b010a00
[pipe] Graph 0x102d42448, run = ffffffff

[pipe] Created DescriptorPool for maxSets = 50 (0x12a8bb000)
[pipe]   Pool size [0]: type = 1, descriptorCount = 20
[pipe]   Pool size [1]: type = 3, descriptorCount = 12
[pipe]   Pool size [2]: type = 7, descriptorCount = 1
[pipe]   Pool size [3]: type = 6, descriptorCount = 18
[pipe]   Pool size [4]: type = 7, descriptorCount = 2
[pipe] alloc_outputs3 for graph 0x102d42448, node = 0x10a000000, i = 0
[pipe] alloc_outputs3 for graph 0x102d42448, node = 0x10a0018d0, i = 1
[pipe] Allocating 2 node->dset DescriptorSets from pool 0x12a8bb000
[pipe] Allocating 1 node->uniform_dset[0] DescriptorSets from pool 0x12a8bb000
[pipe] Allocating 1 node->uniform_dset[1] DescriptorSets from pool 0x12a8bb000
[pipe] alloc_outputs3 for graph 0x102d42448, node = 0x10a0031a0, i = 2
[pipe] Allocating 2 node->dset DescriptorSets from pool 0x12a8bb000
[pipe] Allocating 1 node->uniform_dset[0] DescriptorSets from pool 0x12a8bb000
[pipe] Allocating 1 node->uniform_dset[1] DescriptorSets from pool 0x12a8bb000
[pipe] alloc_outputs3 for graph 0x102d42448, node = 0x10a004a70, i = 3
[pipe] Allocating 2 node->dset DescriptorSets from pool 0x12a8bb000
[pipe] Allocating 1 node->uniform_dset[0] DescriptorSets from pool 0x12a8bb000
[pipe] Allocating 1 node->uniform_dset[1] DescriptorSets from pool 0x12a8bb000
[pipe] alloc_outputs3 for graph 0x102d42448, node = 0x10a006340, i = 4
[pipe] Allocating 2 node->dset DescriptorSets from pool 0x12a8bb000
[pipe] Allocating 1 node->uniform_dset[0] DescriptorSets from pool 0x12a8bb000
[pipe] Allocating 1 node->uniform_dset[1] DescriptorSets from pool 0x12a8bb000
[pipe] alloc_outputs3 for graph 0x102d42448, node = 0x10a007c10, i = 5
[pipe] Allocating 2 node->dset DescriptorSets from pool 0x12a8bb000
[pipe] Allocating 1 node->uniform_dset[0] DescriptorSets from pool 0x12a8bb000
[pipe] Allocating 1 node->uniform_dset[1] DescriptorSets from pool 0x12a8bb000
[qvk] error VK_ERROR_OUT_OF_POOL_MEMORY executing vkAllocateDescriptorSets(qvk.device, &dset_info_u, node->uniform_dset+1)!
[qvk] error VK_ERROR_OUT_OF_POOL_MEMORY executing alloc_outputs3(graph, graph->node+nodeid[i])!
[db] [thm] running the thumbnail graph export failed on image '/Users/luca/Downloads/Zulberti.jpg.cfg'!
[pipe] Destroying DescriptorPool 0x12a84a800
[pipe] Destroying DescriptorPool 0x0
[pipe] Destroying DescriptorPool 0x12a8bb000
[pipe] Destroying DescriptorPool 0x0

The maxSets value seems to be fine with respect to the requested number of descriptor sets. The failure occurs in the dt_graph_run invocation inside the dt_graph_export function, which is called by dt_thumbnails_cache_one.

Searching online for VK_ERROR_OUT_OF_POOL_MEMORY error I found the following:

It seems to always fail on the second uniform buffer allocation (type 6). The dset_layout_uniform has two bindings with 1 uniform buffer each. Adding up the allocation requests, they exceed the descriptors available in the pool (for example, 18 available but 20 requested by the time the second allocation of the last node is made).
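The arithmetic can be replayed with a small simulation. This is a sketch in plain C, not vkdt code, and `first_failing_alloc` is a hypothetical name. It assumes each descriptor set takes one pool descriptor per binding and that the pool refuses an allocation once its count is exhausted. The node counts (5 and 6 nodes allocating uniform sets) are read off the logs above.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the 0-based index of the first descriptor set allocation that no
 * longer fits into a pool holding `pool_descriptors` uniform descriptors,
 * or -1 if all allocations fit. Every node allocates `sets_per_node` sets,
 * each consuming `bindings_per_set` uniform buffer descriptors. */
static int first_failing_alloc(uint32_t pool_descriptors, uint32_t num_nodes,
                               uint32_t sets_per_node, uint32_t bindings_per_set)
{
  assert(bindings_per_set > 0);
  uint32_t used = 0;
  int alloc = 0;
  for (uint32_t n = 0; n < num_nodes; n++)
    for (uint32_t s = 0; s < sets_per_node; s++, alloc++)
    {
      if (used + bindings_per_set > pool_descriptors)
        return alloc; /* this request would exceed the pool */
      used += bindings_per_set;
    }
  return -1;
}
```

With 18 descriptors and 5 nodes, `first_failing_alloc(18, 5, 2, 2)` returns 9, which is the second set of the fifth node. With 22 descriptors and 6 nodes the result is 11, the second set of the sixth node. Both match the uniform_dset+1 failures in the log, and a pool of nodes * 2 * 2 descriptors would be large enough under these assumptions.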

Indeed, if I put a 10 * ... here:

      {
        .type            = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER,
        .descriptorCount = MAX(1, 10 * DT_GRAPH_MAX_FRAMES * (graph->num_nodes + graph->dset_cnt_uniform)),
      },

I get no more allocation errors, but only a black screen... 🥲

Apart from the black screen, I think the pool_sizes must be fixed. I will try to find the correct number of descriptors, but maybe you can do it faster understanding the code better than me 😄

hanatos commented 9 months ago

black screen sounds like progress ;) does it also not even show any imgui controls at all or is just the center region black?

let me double check the uniform descriptor allocation too. maybe for thumbnails something is not cleaned up properly?

LucaZulberti commented 9 months ago

It does not show anything... But if I start to click randomly in the upper part of the frame, a fatal error about freeing an invalid pointer occurs and I need to force quit the application. I think it is better to solve one issue at a time 😄

LucaZulberti commented 8 months ago

Hi @hanatos,

I rebased the branch on top of the latest changes. Now the behaviour is:

$ MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS=1 ./vkdt -d err -d db ~/Pictures/Temp/ProvaDT/AND_092_noxmp.dng
[gui] glfwGetVersionString() : 3.3.8 Cocoa NSGL EGL OSMesa dynamic
[gui] monitor [0] T24D391 at 0 0
[gui] vk extension required by GLFW:
[gui]   VK_KHR_surface
[gui]   VK_EXT_metal_surface
[gui] no joysticks found
[gui] no display profile file display.T24D391, using sRGB!
[gui] no display profile file display.T24D391, using sRGB!
[db] allocating 1024.0 MB for thumbnails

The window shows up with an empty screen.

When I close the window:

[mvk-error] VK_ERROR_DEVICE_LOST: MTLCommandBuffer "vkQueueSubmit CommandBuffer on Queue 0-0" execution failed (code 3): Caused GPU Address Fault Error (0000000b:kIOGPUCommandBufferCallbackErrorPageFault)
[db] [thm] running the thumbnail graph export failed on image '/Users/luca/Pictures/Temp/ProvaDT/AND_092_noxmp.dng.cfg'!

Do you think this error is related to the missing controls on the GUI? It seems it is stuck while running the thumbnail graph export.

UPDATE: If I comment out the thumbnail loading block, the error on exit does not show up, but the view is still empty...

I will investigate the GUI code, hoping to grab more info.

hanatos commented 8 months ago

right this looks related. to debug the device operation, maybe run it through the cli to take away threading issues? maybe like

vkdt cli -d all -g ./path/to/some.raw.cfg

or run through strace to see if it's a simple searchpath issue? i'd look for failed calls to open

LucaZulberti commented 8 months ago

Here is the output (and the jpeg has been generated):

$ MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS=1 ./vkdt cli -d all -g ~/Pictures/Temp/ProvaDT/AND_092_noxmp.dng.cfg
[qvk] dev 0: vendorid 0x106b
[qvk] dev 0: Apple M1
[qvk] max number of allocations 1073741824
[qvk] max image allocation size 16384 x 16384
[qvk] max uniform buffer range 4294967295
[qvk] num queue families: 4
[qvk] picked device 0 without ray tracing and without float atomics support
[perf] upload source total:    3.772 ms
[perf] create raytrace accel:      0.000 ms
[perf] record command buffer:      0.289 ms
[mem] images : peak rss 786.705 MB vmsize 786.717 MB
[mem] buffers: peak rss 0 MB vmsize 0 MB
[mem] staging: peak rss 118.437 MB vmsize 118.437 MB
[perf] record cmd buffer:     81.971 ms
[o-jpg] writing 'main'
[perf] i-raw    main    :    471.465 ms
[perf] denoise  noop    :   11590.759 ms
[perf] hilite   half    :      0.510 ms
[perf] hilite   reduce  :      0.833 ms
[perf] hilite   reduce  :      0.432 ms
[perf] hilite   reduce  :      0.432 ms
[perf] hilite   reduce  :      0.528 ms
[perf] hilite   reduce  :      0.505 ms
[perf] hilite   reduce  :      0.425 ms
[perf] hilite   reduce  :      0.434 ms
[perf] hilite   reduce  :      0.432 ms
[perf] hilite   reduce  :      0.580 ms
[perf] hilite   reduce  :      0.462 ms
[perf] hilite   assemble:      0.448 ms
[perf] hilite   assemble:      0.590 ms
[perf] hilite   assemble:      0.530 ms
[perf] hilite   assemble:      0.465 ms
[perf] hilite   assemble:      0.408 ms
[perf] hilite   assemble:      0.559 ms
[perf] hilite   assemble:      0.425 ms
[perf] hilite   assemble:      0.503 ms
[perf] hilite   assemble:      0.314 ms
[perf] hilite   assemble:      0.863 ms
[perf] hilite   doub    :      0.335 ms
[perf] sum hilite:    11.016 ms
[perf] demosaic down    :      0.597 ms
[perf] demosaic gauss   :      0.415 ms
[perf] demosaic splat   :      0.590 ms
[perf] demosaic fix     :      0.413 ms
[perf] sum demosaic:       2.016 ms
[perf] crop     main    :      0.637 ms
[perf] colour   main    :      0.563 ms
[perf] filmcurv main    :      0.918 ms
[perf] llap     curve   :      0.314 ms
[perf] llap     reduce  :      0.472 ms
[perf] llap     reduce  :      0.651 ms
[perf] llap     reduce  :      0.632 ms
[perf] llap     reduce  :      0.422 ms
[perf] llap     reduce  :      0.606 ms
[perf] llap     reduce  :      0.441 ms
[perf] llap     reduce  :      0.536 ms
[perf] llap     reduce  :      0.467 ms
[perf] llap     reduce  :      0.415 ms
[perf] llap     reduce  :      2.889 ms
[perf] llap     reduce  :      1.092 ms
[perf] llap     assemble:      1.030 ms
[perf] llap     assemble:      2.443 ms
[perf] llap     assemble:      1.016 ms
[perf] llap     assemble:      0.851 ms
[perf] llap     assemble:      1.962 ms
[perf] llap     assemble:      0.854 ms
[perf] llap     assemble:      0.849 ms
[perf] llap     assemble:      2.957 ms
[perf] llap     assemble:      0.873 ms
[perf] llap     assemble:      1.016 ms
[perf] llap     assemble:      2.335 ms
[perf] llap     colour  :      0.884 ms
[perf] sum llap:      26.005 ms
[perf] f2srgb   main    :      1.955 ms
[perf] total time:  12159.874 ms

Now the error is not showing up... (still an empty window with the GUI command) I don't know what is going on here. It seems related to what is running on the Mac while vkdt is executed. Maybe some missing checks on, idk, available memory...

I will try to modify the GUI code to see if something shows up at a certain point.

hanatos commented 8 months ago

cool! that's really encouraging i think! maybe the gui does access some database and thumbnail related stuff that's going wrong?

what does

strace vkdt 2>&1 | grep open

show? (is there strace on macintosh?)

LucaZulberti commented 8 months ago

On macOS there is the dtruss utility.

For reference, to be able to use dtruss I had to relax the SIP restrictions on dtrace from the Mac recovery mode:

$ csrutil enable --without dtrace

After reboot, in regular session:

$ sudo MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS=1 dtruss ./vkdt &> dtruss_vkdt.log
$ cat dtruss_vkdt.log | grep open > dtruss_vkdt_open.log

The list of open calls is long, I attach the file here: dtruss_vkdt_open.log.

Here is the list of calls that, apparently, are failing:

open("@rpath/libvulkan.1.dylib\0", 0x0, 0x0)         = -1 Err#2
open("@rpath\0", 0x100000, 0x0)      = -1 Err#2
open("/Users/luca/Git/vkdt/bin/Info.plist\0", 0x0, 0x0)      = -1 Err#2
open("/Users/luca/Git/vkdt/bin/Info.plist\0", 0x0, 0x0)      = -1 Err#2
openat(0xFFFFFFFFFFFFFFFE, "/Library/Preferences/Logging/com.apple.diagnosticd.filter.plist\0", 0x1000104, 0x0)      = -1 Err#2
openat(0xFFFFFFFFFFFFFFFE, "/Library/Preferences/Logging/com.apple.diagnosticd.filter.plist\0", 0x1000104, 0x0)      = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/test10b/params\0", 0x0, 0x0)         = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/test10b/params.ui\0", 0x0, 0x0)      = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/svgf2/params.ui\0", 0x0, 0x0)        = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/denoise/params.ui\0", 0x0, 0x0)      = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/ocean/params.ui\0", 0x0, 0x0)        = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/ocean/ptooltips\0", 0x0, 0x0)        = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/ocean/ctooltips\0", 0x0, 0x0)        = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/y2rgb/params\0", 0x0, 0x0)       = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/y2rgb/params.ui\0", 0x0, 0x0)        = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/nlr/params.ui\0", 0x0, 0x0)      = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/spheres/params\0", 0x0, 0x0)         = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/spheres/params.ui\0", 0x0, 0x0)      = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/f2srgb/params.ui\0", 0x0, 0x0)       = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/loss/params.ui\0", 0x0, 0x0)         = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/thumb/params\0", 0x0, 0x0)       = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/thumb/params.ui\0", 0x0, 0x0)        = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/ab/params.ui\0", 0x0, 0x0)       = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/accum/params.ui\0", 0x0, 0x0)        = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/inpaint/params\0", 0x0, 0x0)         = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/inpaint/params.ui\0", 0x0, 0x0)      = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/i-mlv/params.ui\0", 0x0, 0x0)        = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/o-bc1/params.ui\0", 0x0, 0x0)        = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/i-vid/params.ui\0", 0x0, 0x0)        = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/svgf/params.ui\0", 0x0, 0x0)         = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/cnn/params\0", 0x0, 0x0)         = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/cnn/params.ui\0", 0x0, 0x0)      = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/i-bc1/params.ui\0", 0x0, 0x0)        = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/rawhist/params\0", 0x0, 0x0)         = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/rawhist/params.ui\0", 0x0, 0x0)      = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/hist/params.ui\0", 0x0, 0x0)         = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/o-pfm/params.ui\0", 0x0, 0x0)        = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/display/params\0", 0x0, 0x0)         = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/display/params.ui\0", 0x0, 0x0)      = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/i-v4l2/params.ui\0", 0x0, 0x0)       = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/srgb2f/params\0", 0x0, 0x0)      = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/srgb2f/params.ui\0", 0x0, 0x0)       = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/o-null/params\0", 0x0, 0x0)      = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/o-null/params.ui\0", 0x0, 0x0)       = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/resize/params\0", 0x0, 0x0)      = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/resize/params.ui\0", 0x0, 0x0)       = -1 Err#2
open_nocancel("/Users/luca/Git/vkdt/bin/modules/o-jpg/params.ui\0", 0x0, 0x0)        = -1 Err#2

hanatos commented 8 months ago

hm thanks this looks all okay to me. so probably no easy way around stepping through the imgui code from render.cc on until it fails.

maybe try to disable thumbnail rendering in the background thread too.
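
as a side note on reading such a log: the Err#2 in the list above is errno 2, ENOENT ("No such file or directory"), so these are files that simply do not exist rather than permission problems. a small sketch for pulling the errno out of a dtruss line, assuming the "= -1 Err#<n>" suffix seen above; the function name is made up:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Returns the errno value of a failed call in one line of dtruss output,
 * or 0 if the line does not contain the "= -1 Err#<n>" failure marker. */
static int
dtruss_line_errno(const char *line)
{
  assert(line != NULL);
  static const char marker[] = "= -1 Err#";
  const char *hit = strstr(line, marker);
  if(!hit) return 0;
  return atoi(hit + strlen(marker));
}
```

comparing the result against ENOENT, EACCES and so on (or passing it to strerror()) makes it easier to separate missing optional files from calls that fail for other reasons.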

LucaZulberti commented 8 months ago

I found the next blocking point!

In gui.c, function dt_gui_render():

VkResult res = vkAcquireNextImageKHR(qvk.device, qvk.swap_chain, 2ul<<30, image_acquired_semaphore, VK_NULL_HANDLE, &vkdt.frame_index);
  if(res != VK_SUCCESS)
  {
    // XXX kill all semaphores
    fprintf(stderr, "--- DBG --- A\n");
    return res;
  }

The debug print always shows up! The window is created but never rendered. I will investigate why it fails to acquire the swap chain image. Do you have any hints for this? Thank you!

LucaZulberti commented 8 months ago

Using QVKR() macro:

[qvk] error VK_SUBOPTIMAL_KHR executing vkAcquireNextImageKHR(qvk.device, qvk.swap_chain, 2ul<<30, image_acquired_semaphore, VK_NULL_HANDLE, &vkdt.frame_index)!

hanatos commented 8 months ago

hm suboptimal is weird but not an error. does the window manager resize the window when it shows?

LucaZulberti commented 8 months ago

I don't know if the window manager does that, but I certainly don't resize the window myself. I will try to find something inside dt_gui_recreate_swapchain(), which is called when the render fails.

UPDATE

I see that ImGUI example does this:

if (g_SwapChainRebuild)
        {
            int width, height;
            glfwGetFramebufferSize(window, &width, &height);
            if (width > 0 && height > 0)
            {
                ImGui_ImplVulkan_SetMinImageCount(g_MinImageCount);
                ImGui_ImplVulkanH_CreateOrResizeWindow(g_Instance, g_PhysicalDevice, g_Device, &g_MainWindowData, g_QueueFamily, g_Allocator, width, height, g_MinImageCount);
                g_MainWindowData.FrameIndex = 0;
                g_SwapChainRebuild = false;
            }
        }

In dt_gui_recreate_swapchain(), on the other hand, the Vulkan API is called directly. Is it possible that it is missing something Mac-related that the example handles through the ImGui_ImplVulkan_* API?

hanatos commented 8 months ago

okay, sorry my code might be a bit optimistic here. in gui.c:312 what happens if you continue as usual in case there was no grave error? i.e. following https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/vkAcquireNextImageKHR.html, test for any of VK_SUCCESS VK_TIMEOUT VK_NOT_READY VK_SUBOPTIMAL_KHR and treat it as "success"?
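
to make that a bit more concrete, a rough sketch of how the result could be sorted into cases. one caveat as far as i read the spec page: VK_TIMEOUT and VK_NOT_READY are success codes too, but no image is handed out for them, so they would mean "skip this frame" rather than "render". the enum values are stand-ins that mirror the VkResult numbers so the sketch compiles without the vulkan headers, and the action names are made up, not vkdt code:

```c
#include <assert.h>

/* Stand-ins for a few VkResult values, copied here only so that the sketch
 * compiles without the Vulkan headers. Real code would include
 * <vulkan/vulkan.h> and use VK_SUCCESS etc. directly. */
enum
{
  RES_SUCCESS               = 0,
  RES_NOT_READY             = 1,
  RES_TIMEOUT               = 2,
  RES_SUBOPTIMAL_KHR        = 1000001003,
  RES_ERROR_OUT_OF_DATE_KHR = -1000001004,
  RES_ERROR_DEVICE_LOST     = -4,
};

/* Hypothetical names for what the gui could do next. */
typedef enum
{
  ACQUIRE_RENDER,     /* an image was acquired, go on and render */
  ACQUIRE_SKIP_FRAME, /* no image this time, try again next frame */
  ACQUIRE_RECREATE,   /* swapchain unusable, recreate before acquiring again */
  ACQUIRE_FATAL,      /* real error, bail out */
}
acquire_action_t;

static acquire_action_t
classify_acquire_result(int res)
{
  switch(res)
  {
    case RES_SUCCESS:
    case RES_SUBOPTIMAL_KHR:        return ACQUIRE_RENDER;
    case RES_NOT_READY:
    case RES_TIMEOUT:               return ACQUIRE_SKIP_FRAME;
    case RES_ERROR_OUT_OF_DATE_KHR: return ACQUIRE_RECREATE;
    default:                        return ACQUIRE_FATAL;
  }
}
```

in dt_gui_render() only the ACQUIRE_FATAL case would then return early with the error.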

LDAP commented 8 months ago

@LucaZulberti It seems that the ImGui Vulkan backend can manage its own swapchain; I believe that is unrelated here. The VK_SUBOPTIMAL_KHR case can be handled with something like this:

    if (width != cur_width || height != cur_height) {
        recreate_swapchain(width, height);
    }

    for (int tries = 0; tries < 2; tries++) {
        vk::Result result = context->device.acquireNextImageKHR(
            swapchain, UINT64_MAX, current_read_semaphore(), {}, &current_image_idx);

        if (result == vk::Result::eSuccess) {
            // use swapchain
            return aquire_result;
        } else if (result == vk::Result::eErrorOutOfDateKHR ||
                   result == vk::Result::eSuboptimalKHR) {
            recreate_swapchain(width, height);
            continue;
        } else {
            // handle error case
            return std::nullopt;
        }
    }
    // handle error case (skip update?)
    return std::nullopt;

LucaZulberti commented 8 months ago

Hi @hanatos

I changed gui.c with no success:

if(res != VK_SUCCESS && res != VK_SUBOPTIMAL_KHR)
  {
    // XXX kill all semaphores
    return res;
  }

Hi @LDAP,

I tried with:

  VkResult res;
  for (int i = 0; i < 3; i++) {
    // timeout is in nanoseconds (these are ~2sec)
    res = vkAcquireNextImageKHR(qvk.device, qvk.swap_chain, 2ul<<30, image_acquired_semaphore, VK_NULL_HANDLE, &vkdt.frame_index);
    if (res == VK_SUCCESS)
      break;
    else if (res == VK_ERROR_OUT_OF_DATE_KHR ||
             res == VK_SUBOPTIMAL_KHR)
      dt_gui_recreate_swapchain();
    else
      QVKR(res);
  }
  if (res != VK_SUCCESS && res != VK_SUBOPTIMAL_KHR)
    return res;

But the application crashes on the second call to vkAcquireNextImageKHR.

...
an error occurred while trying to execute gdb.please check if gdb is installed on your system.
backtrace written to /tmp/vkdt-bt-3895.txt
recovery data written to /tmp/vkdt-crash-recovery.*

hanatos commented 8 months ago

just guessing here: maybe the semaphores get out of sync now? just to rule out weird driver issues, do other simple vulkan programs manage to display a window/content? like the imgui vk/glfw example, or vkcube or some such?

LucaZulberti commented 8 months ago

Hi @hanatos, yes the examples provided with Vulkan and ImGui (GLFW+Vulkan example with docking extension) work correctly. I will try to do a more detailed comparison with ImGui to see if something is missing somewhere.

LucaZulberti commented 8 months ago

Hi, I'm still having trouble finding the solution. I just rebased and added fixes to the build process.

hanatos commented 7 months ago

just a random thought: an early version of vkdt was running fine through moltenvk on a macintosh (well with intel hardware then). maybe it's this env var about metal limits that trips it over? if you run other vulkan applications with this env var, do they still work?

also does moltenvk support the validation layers? if you say it crashes could it be that it asserts on a validation error and maybe even has some useful output in the stack trace?

hanatos commented 5 months ago

.. just wanted to ask whether the situation here changed with all the windows/compatibility patches lately? in particular some strange things wrt. -rdynamic and search paths (-d all reports them on the console output now) seem like they would be really similar problems.

hanatos commented 5 months ago

..maybe to add to the -rdynamic discussion. it seems the bs.h style of api opening should work on linux too if we pass 0 as filename to dlopen() to open the main executable. maybe this would even be portable code then (assuming we'll replace the windows stuff through the dlfcn package and dlopen/dlsym as well, the package is there for module loading anyways).
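
a minimal sketch of that idea, only using dlopen/dlsym/dlclose from dlfcn.h. the wrapper name is made up and this is not how bs.h or vkdt currently do it:

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Look up `name` starting from the main program. Passing NULL as the
 * filename makes dlopen() return a handle for the running executable,
 * and dlsym() on that handle also searches the libraries loaded with it. */
static void *
lookup_in_main_program(const char *name)
{
  assert(name != NULL);
  void *handle = dlopen(NULL, RTLD_LAZY);
  if(!handle) return NULL;
  void *sym = dlsym(handle, name);
  dlclose(handle); /* only drops the reference, the program stays loaded */
  return sym;
}
```

one caveat, as far as i know: on linux the functions defined in the executable itself are only found this way if they end up in the dynamic symbol table, which is what linking with -rdynamic does, so the flag may still be needed there.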

LucaZulberti commented 5 months ago

Hi @hanatos, I still need to rebase the changes. I spent more time processing photos with Ansel, which now builds ok on M1. I will try to update the PR in a few days ;-)

LucaZulberti commented 5 months ago

Quick update. I needed to keep a few commits to compile successfully. Anyway, it still shows a blank screen. I need to investigate more.

$ MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS=1 bin/vkdt -d err -d all -D qvk -D perf ~/Pictures/Temp/ProvaDT/Zulberti.jpg
[pipe] base directory /Users/luca/Git/vkdt/bin
[pipe] home directory /Users/luca/.config/vkdt
[pipe] loaded 76 modules
[gui] glfwGetVersionString() : 3.3.9 Cocoa NSGL EGL OSMesa dynamic
[gui] monitor [0] Built-in Retina Display at 0 0
[gui] vk extension required by GLFW:
[gui]   VK_KHR_surface
[gui]   VK_EXT_metal_surface
[gui] no joysticks found
[gui] no display profile file display.Built-in Retina Display, using sRGB!
[gui] no display profile file display.Built-in Retina Display, using sRGB!
[db] allocating 1024.0 MB for thumbnails
[mem] images : peak rss 0.00012207 MB vmsize 0.00012207 MB
[mem] buffers: peak rss 0 MB vmsize 0 MB
[mem] staging: peak rss 0.000244141 MB vmsize 0.000244141 MB
[mem] images : peak rss 0.0498505 MB vmsize 0.0498505 MB
[mem] buffers: peak rss 0 MB vmsize 0 MB
[mem] staging: peak rss 0.0997009 MB vmsize 0.0997009 MB
[mem] images : peak rss 144.496 MB vmsize 144.496 MB
[mem] buffers: peak rss 0 MB vmsize 0 MB
[mem] staging: peak rss 91.027 MB vmsize 91.027 MB
[mem] images : peak rss 30.5359 MB vmsize 30.5359 MB
[mem] buffers: peak rss 0 MB vmsize 0 MB
[mem] staging: peak rss 29.3213 MB vmsize 29.3213 MB
[mem] images : peak rss 0.0498505 MB vmsize 0.0498505 MB
[mem] buffers: peak rss 0 MB vmsize 0 MB
[mem] staging: peak rss 0.0997009 MB vmsize 0.0997009 MB
[mem] images : peak rss 144.496 MB vmsize 144.496 MB
[mem] buffers: peak rss 0 MB vmsize 0 MB
[mem] staging: peak rss 91.027 MB vmsize 91.027 MB
[mem] images : peak rss 30.5359 MB vmsize 30.5359 MB
[mem] buffers: peak rss 0 MB vmsize 0 MB
[mem] staging: peak rss 29.3213 MB vmsize 29.3213 MB

hanatos commented 5 months ago

great, thanks for rebasing!

from experience with the windows port, i would try to investigate whether the modules actually load their callbacks (in global.c) and whether any of them is actually called (for instance the stuff in i-raw). another issue was calling back into symbols contained in the main executable (log access from i-raw or checking features via qvk.coopmat_supported or connecting nodes and the like). i suppose that could be tested completely without gui using the cli too.

On Tue, Jan 23, 2024 at 9:16 PM Luca Zulberti @.***> wrote:

Fast update. I needed to keep a few commits to compiling successfully. Anyway, it is still on a blank screen. I need to investigate more.

$ MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS=1 bin/vkdt -d err -d all -D qvk -D perf ~/Pictures/Temp/ProvaDT/Zulberti.jpg [pipe] base directory /Users/luca/Git/vkdt/bin [pipe] home directory /Users/luca/.config/vkdt [pipe] loaded 76 modules [gui] glfwGetVersionString() : 3.3.9 Cocoa NSGL EGL OSMesa dynamic [gui] monitor [0] Built-in Retina Display at 0 0 [gui] vk extension required by GLFW: [gui] VK_KHR_surface [gui] VK_EXT_metal_surface [gui] no joysticks found [gui] no display profile file display.Built-in Retina Display, using sRGB! [gui] no display profile file display.Built-in Retina Display, using sRGB! [db] allocating 1024.0 MB for thumbnails [mem] images : peak rss 0.00012207 MB vmsize 0.00012207 MB [mem] buffers: peak rss 0 MB vmsize 0 MB [mem] staging: peak rss 0.000244141 MB vmsize 0.000244141 MB [mem] images : peak rss 0.0498505 MB vmsize 0.0498505 MB [mem] buffers: peak rss 0 MB vmsize 0 MB [mem] staging: peak rss 0.0997009 MB vmsize 0.0997009 MB [mem] images : peak rss 144.496 MB vmsize 144.496 MB [mem] buffers: peak rss 0 MB vmsize 0 MB [mem] staging: peak rss 91.027 MB vmsize 91.027 MB [mem] images : peak rss 30.5359 MB vmsize 30.5359 MB [mem] buffers: peak rss 0 MB vmsize 0 MB [mem] staging: peak rss 29.3213 MB vmsize 29.3213 MB [mem] images : peak rss 0.0498505 MB vmsize 0.0498505 MB [mem] buffers: peak rss 0 MB vmsize 0 MB [mem] staging: peak rss 0.0997009 MB vmsize 0.0997009 MB [mem] images : peak rss 144.496 MB vmsize 144.496 MB [mem] buffers: peak rss 0 MB vmsize 0 MB [mem] staging: peak rss 91.027 MB vmsize 91.027 MB [mem] images : peak rss 30.5359 MB vmsize 30.5359 MB [mem] buffers: peak rss 0 MB vmsize 0 MB [mem] staging: peak rss 29.3213 MB vmsize 29.3213 MB


DorianRudolph commented 4 months ago

I also tried to compile this and I also get a blank window. However, the program crashes if I try to open a raw image.

MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS=1 bin/vkdt -d err -d all -D qvk -D perf RAW_CANON_5D_ARGB.CR2

The debug build then crashes at the call of qvkDebugMarkerSetObjectNameEXT(qvk.device, &name_info); in dt_graph_create_shader_module from alloc_outputs. The crash is then EXC_BAD_ACCESS in MVKVulkanAPIObject::setDebugName.

The release build crashes with the following error:

[mvk-error] VK_ERROR_OUT_OF_DEVICE_MEMORY: MTLCommandBuffer "vkQueueSubmit MTLCommandBuffer on Queue 0-0" execution failed (code 3): Caused GPU Address Fault Error (0000000b:kIOGPUCommandBufferCallbackErrorPageFault)

Addendum: When debugging, LLDB also pauses at an exception in vkResetFences called from dt_graph_run.

I had to apply the following patch to compile with rawspeed.

diff --git a/src/pipe/modules/i-raw/flat.mk b/src/pipe/modules/i-raw/flat.mk
index 0a90e242..fbce5052 100644
--- a/src/pipe/modules/i-raw/flat.mk
+++ b/src/pipe/modules/i-raw/flat.mk
@@ -5,12 +5,13 @@ RAWSPEED_L=pipe/modules/i-raw/rawspeed/build
 MOD_CFLAGS=-std=c++20 -Wall -I$(RAWSPEED_I)/src/librawspeed/ -I$(RAWSPEED_L)/src/ -I$(RAWSPEED_I)/src/external/ $(VKDT_PUGIXML_CFLAGS) $(VKDT_JPEG_CFLAGS)
 MOD_LDFLAGS=-L$(RAWSPEED_L) -lrawspeed -lz $(VKDT_PUGIXML_LDFLAGS) $(VKDT_JPEG_LDFLAGS)

-pipe/modules/i-raw/libi-raw.so: $(RAWSPEED_L)/librawspeed.a
+pipe/modules/i-raw/libi-raw.$(SEXT): $(RAWSPEED_L)/librawspeed.a

-ifeq ($(CXX),clang++)
+CXX_NAME=$(notdir $(CXX))
+ifeq ($(CXX_NAME),clang++)
 MOD_LDFLAGS+=-fopenmp=libomp
 endif
-ifeq ($(CXX),g++)
+ifeq ($(CXX_NAME),g++)
 MOD_LDFLAGS+=-lgomp
 endif

@@ -43,7 +44,7 @@ endif # end rawspeed
 ifeq ($(VKDT_USE_RAWINPUT),2)
 MOD_LDFLAGS=pipe/modules/i-raw/rawloader-c/target/release/librawloader.a
 MOD_CFLAGS=-Ipipe/modules/i-raw/rawloader-c
-pipe/modules/i-raw/libi-raw.so: pipe/modules/i-raw/rawloader-c/target/release/librawloader.a
+pipe/modules/i-raw/libi-raw.$(SEXT): pipe/modules/i-raw/rawloader-c/target/release/librawloader.a

 pipe/modules/i-raw/rawloader-c/target/release/librawloader.a: pipe/modules/i-raw/rawloader-c/lib.rs pipe/modules/i-raw/rawloader-c/Cargo.toml
    cd pipe/modules/i-raw/rawloader-c; cargo update; cargo build --release

I hope this helps.

DorianRudolph commented 4 months ago

Another finding: If I remove the DEBUG_MARKERS, then the program also runs, and I do get many validation errors. The first is:

[qvk] validation layer: Validation Error: [ VUID-VkPipelineLayoutCreateInfo-descriptorType-03020 ] | MessageID = 0xd67a4ef5 | vkCreatePipelineLayout():  max per-stage storage image bindings count (11) exceeds device maxPerStageDescriptorStorageImages limit (8). The Vulkan spec states: The total number of descriptors in descriptor set layouts created without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT bit set with a descriptorType of VK_DESCRIPTOR_TYPE_STORAGE_IMAGE, and VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER accessible to any given shader stage across all elements of pSetLayouts must be less than or equal to VkPhysicalDeviceLimits::maxPerStageDescriptorStorageImages (https://vulkan.lunarg.com/doc/view/1.3.275.0/mac/1.3-extensions/vkspec.html#VUID-VkPipelineLayoutCreateInfo-descriptorType-03020)
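The rule behind this validation error can be checked before pipeline layout creation. Below is a minimal sketch of that check, assuming all bindings are visible to the same shader stage. To stay compilable without vulkan.h it uses local stand-in types (`sketch_desc_type_t`, `sketch_binding_t`, both hypothetical); real code would use `VkDescriptorSetLayoutBinding` and read the limit from `VkPhysicalDeviceLimits::maxPerStageDescriptorStorageImages` as returned by `vkGetPhysicalDeviceProperties`.

```c
#include <stddef.h>
#include <stdint.h>

// Local stand-ins so the sketch compiles without vulkan.h (hypothetical names).
typedef enum
{
  SKETCH_SAMPLED_IMAGE,
  SKETCH_STORAGE_IMAGE,
  SKETCH_STORAGE_TEXEL_BUFFER,
  SKETCH_UNIFORM_BUFFER
} sketch_desc_type_t;

typedef struct
{
  sketch_desc_type_t type; // descriptor type of this binding
  uint32_t count;          // descriptor count (array size) of this binding
} sketch_binding_t;

// Sum the descriptors that count towards the per-stage storage image limit:
// storage images and storage texel buffers, as named in the validation message.
static uint32_t sketch_storage_image_count(const sketch_binding_t *b, size_t n)
{
  uint32_t total = 0;
  for(size_t i = 0; i < n; i++)
    if(b[i].type == SKETCH_STORAGE_IMAGE || b[i].type == SKETCH_STORAGE_TEXEL_BUFFER)
      total += b[i].count;
  return total;
}

// Returns 1 if the bindings stay within the device limit, 0 otherwise.
static int sketch_fits_limit(const sketch_binding_t *b, size_t n, uint32_t limit)
{
  return sketch_storage_image_count(b, n) <= limit;
}
```

With the numbers from the message above, a layout needing 11 storage image descriptors does not fit a device limit of 8, so such a pipeline would presumably have to be split or use fewer storage images on this device.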