Closed smunaut closed 4 months ago
Ok, so digging even further :
Wow. great work! Today I'll jump in and follow your path. In general (afaik) Mesa/EGL is currently in heavy development. Maybe reaching out to Mesa might be a good idea.
I'll use GL_RGBA32F for starters with my code just to see if it breaks the same way. But we need a common example to have a chance. Do you have one?
Proposed fix for the locking issue btw : https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24487/diffs?commit_id=fe844bae481b4f2369ab78537d2a800817973389
As for example, I use the following code. It doesn't really do anything except just trying to create the texture / cl mem object and nothing else ATM.
Thx!
I think i found the problem. __DRI_IMAGE_FORMAT_NONE collides with an actual format. commenting those 2 lines fixes that: https://github.com/Mesa3D/mesa/blob/957009978ef6d7121fc0d710d03bc20097d4d46b/src/gallium/frontends/dri/dri_helpers.c#L679-L680
Now the cl_mem is succesfully mapped.
Anyway that doesn't make a complete patch. I'll create an issue at mesa to maybe get an opinion from someone with the bigger picture.
Not sure what you mean with __DRI_IMAGE_FORMAT_NONE
colliding. I don't see any other image format with the same code.
Here's the flow I traced :
dri2_can_export_dma_buf_image
reports the image can't be exported because
it doesn't have an attribute __DRI_IMAGE_ATTRIB_FOURCC
That attribute comes either from image->dri_fourcc
directly or from the result
of dri2_get_mapping_by_format(image->dri_format)
The image is created by dri2_create_image_khr_texture
This in turn will cann dri2_create_from_texture
to initialize the image properties
AFAICT this will not set image->dri_fourcc
but it will set image->dri_format
And it's set from mapping the gl texture format as such :
img->dri_format = driGLFormatToImageFormat(obj->Image[face][level]->TexFormat);
That method driGLFormatToImageFormat
uses a format_mapping
table to match.
And that table just doesn't have any float32 types in it defined at all. And looking
at the list of available __DRI_IMAGE_FORMAT_xxxx
there are none that would match.
So I guess we'd need to introduce new DRI image formats as the first step.
Apparently going through DRI images and the export dma buf api is the wrong way to do things for interop according to DRI devs I talked to ...
MesaGLInteropEGLExportObject
seems to be the way to implement it AFAICT.
It also seems to return a dmabuf but directly from a GL object handle without any intermediate EGLImage step.
Hmmm. interesting! Gonna be back at it in a couple of days
Implemented a quick PoC replacing eglExportDMABUFImageMESA
with MesaGLInteropEGLExportObject
and it's doing something :sweat_smile:
And it almost looks right : https://i.imgur.com/SmtjQm2.jpg vs https://i.imgur.com/KPFMzAs.jpg
very excellent! sorry i can't help right now. sick in bed
No worries, take care !
FWIW I pushed my current PoC https://github.com/smunaut/compute-runtime/tree/clgl
Note that it depends on some mesa patches https://gitlab.freedesktop.org/246tnt/mesa/-/commits/clgl-export
Those symbols are normally used internally. They can be accessed just using dlsym()
looking for them at the global scope but not in all cases. If you have GLVND (vendor neutral dispatcher) and EGL, then that combination doesn't work. So after talking in the #dri-devel channel, the best option seemed to be to expose those function through the classic GetProcAddress
mechanism and that's what those mesa patches do ...
Found the bug above btw and it's unrelated to the change.
The extension is calling glGetTexLevelParameteriv
to collect data about the texture. but that assumes that the texture is currently the one being bound in the GL state which is not necessarily the case ...
No worries, take care !
FWIW I pushed my current PoC https://github.com/smunaut/compute-runtime/tree/clgl
That looks really promising!
Note that it depends on some mesa patches https://gitlab.freedesktop.org/246tnt/mesa/-/commits/clgl-export
Those symbols are normally used internally. They can be accessed just using
dlsym()
looking for them at the global scope but not in all cases. If you have GLVND (vendor neutral dispatcher) and EGL, then that combination doesn't work. So after talking in the #dri-devel channel, the best option seemed to be to expose those function through the classicGetProcAddress
mechanism and that's what those mesa patches do ...
Found the bug above btw and it's unrelated to the change.
The extension is calling
glGetTexLevelParameteriv
to collect data about the texture. but that assumes that the texture is currently the one being bound in the GL state which is not necessarily the case ...
I have trouble following that last part. Could you point out where? I'd like to fix that if you haven't already.
Note that it depends on some mesa patches https://gitlab.freedesktop.org/246tnt/mesa/-/commits/clgl-export
Any idea when this is planned to be merged? I am just asking because the branch has a over 600 commits :)
The issue is there : https://github.com/intel/compute-runtime/blob/master/opencl/source/sharings/gl/linux/gl_texture_linux.cpp#L57
This can't really call any GL functions because it should be independent of the current GL bind state and modifying it (like binding the texture, querying, restoring) is also very dodgy (I think GL could be doing other stuff in other threads).
The internal format is not an issue, it's returned by the interop API so we can get it from there, but I haven't found any way to get the texture size ...
I've been comparing this code with the windows side of the code, and in that one, the allocation is not grabbed the same way and doesn't seem to need the size. I've been trying to make the linux side more like the windows one but without success, and it might come from the different implementation of the underlying memory manager. I'd love an intel engineer to chime in and explain why the linux side was written like that in the first place. the 'reusesharedhandle' param is also different vs the win side and I'm not sure why either. Same with the "closeSharedHandle" which don't see to be on the win side either.
I've been looking at ROCm a bit too and in there the amd mesa driver actually returns some extra info through the gl interop call AFAICT (there is provision in that interop method to return "vendor" specific data), so we could also have the iris driver return more info from the gl side directly.
The mesa branch only has 3 commits (the 3 latest). It might appear like it has more because it's based off the 23.1.5
tag since I wanted to install system-wide locally and my system has 23.1.5
installed.
I'm not sure when this can be merged or even if this will be merged in that state, the functions might end up with other names. I just opened a draft MR asking for feedback : https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24573
maybe @JacekDanecki could give us a little nodge in the right direction? :)
Ok, I pushed a new version of my branch that doesn't have those issues anymore. I also did a lot of cleanup removing dead/buggy code that wouldn't work on linux anyway pending the reimplementation for linux ( things for GL_BUFFER sharing and synchronization primitives).
This also makes things a whole lot less dependent on EGL which is nice to make it compatible with GLX.
You need my mesa branch to run it obviously.
You need my mesa branch to run it obviously.
It has been merged?
Most of it yes, but the v2 of the interop interface hasn't been merged yet. It's not a patch I wrote, it's part of another MR that's being reviewed now, there was actually quite a bit of review/ack going on yesterday so I have good hope that this will me merged very soon.
In the mean time, you still need to use my 23.2-resolve branch ( https://gitlab.freedesktop.org/246tnt/mesa/-/tree/23.2-resolve?ref_type=heads ).
Thx for the info and your awesome work :)
The support implemented in #166 has some limitations on the supported texture format. I have been discussing in that (now closed) issue the possibility to support more formats but I thought it'd be better to open a new issue rather than keep talking in a closed one.
I've so far removed the quick check that limit the "create memory" to the currently supported texture format and what I noticed is that the call then hangs if you use unsupported textures.
I dug a bit into that and noticed two things :
_eglDestroyImageCommon
is only called ifeglExportDMABUFImageMESA
failed. So it should definitely not hang and it should return the error code instead of hanging but it's also not the root of my problem, ideally that call should work.eglExportDMABUFImageMESA
fails becausedri2_can_export_dma_buf_image
returns FALSE. And that's the root of the issue.dri2_format_table
... but I have no idea if I can just add format to that table or what it implies or whatever ...