Closed steven-michaud closed 4 years ago
For what it's worth, I did a successful tryserver build with my patch in the Mozilla tree:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=01ec6ad5751f958e1766fb20addb6731d9142f13
That's a very interesting patch! And an impressive investigation :)
I'm concerned that flushing may negatively affect execution on GPUs that are not affected by the bug. This makes me wonder why the original patch was landed without any kind of blocklist. It sounds like at least limiting it to Intel macOS machines would help?
For WebRender, we typically first add a record to our bug database here - https://github.com/servo/webrender/wiki/Driver-issues. I'll be happy to add this one on behalf of you.
Then we land add some sort of a workaround in WebRender. It can detect it and apply the workaround, or it can expose an option in RendererOptions
that Gecko fills in based on blocklists. In any case, the workaround should he handled by WebRender, not gleam.
On the Mac, flushing already takes place (though at the wrong time) on every call to fDeleteTextures()
, regardless of graphics hardware. So there shouldn't be any performance problem. This shown by the following several lines from the output of my dtrace script at https://bugzilla.mozilla.org/show_bug.cgi?id=1666293#c3. GLEngine-gleUnbindTextureObject
and libGPUSupportMercury.dylib-gpusSubmitDataBuffers
aren't graphics hardware specific. They're used on all major releases of macOS going back at least to 10.12.
libsystem_kernel.dylib`mach_msg_trap+0xa
IOKit`io_connect_method+0x17f
IOKit`IOConnectCallMethod+0xf4
IOKit`IOConnectCallStructMethod+0x23
IOAccelerator`IOAccelContextSubmitDataBuffersExt2+0xfd
libGPUSupportMercury.dylib`gpusSubmitDataBuffers+0x88
AppleIntelHD5000GraphicsGLDriver`IntelCommandBuffer::getNew(GLDContextRec*)+0x30
AppleIntelHD5000GraphicsGLDriver`intelSubmitCommands+0xb2
GLEngine`gleUnbindTextureObject+0x3a
GLEngine`gleUnbindDeleteHashNamesAndObjects+0x9f
GLEngine`glDeleteTextures_Exec+0x2d4
XUL`mozilla::WebGLTexture::~WebGLTexture()+0x106
XUL`mozilla::WebGLTexture::~WebGLTexture()+0xe
Working around Apple's bug in WebRender will probably be less simple than doing it here, but I'll look into it.
Also, I'm not completely sure that my patch won't also help on other kinds of graphics hardware. As per https://bugzilla.mozilla.org/show_bug.cgi?id=1666293#c4, I'll be waiting over the next few weeks to see what effect it has on crash signatures containing the string "gpusGenerateCrashLog".
These can be generated by many different bugs, which only have in common that they're Apple graphics driver bugs.
Well, that's great news for sure! So perhaps, adding this bug to known list of bugs, and doing the gl.flush()
call on WebRender side behind cfg(target_os = "macos")
is all we need.
But that would need to be done in many different places, as per https://bugzilla.mozilla.org/show_bug.cgi?id=1666293#c13.
Maybe I just need to add code to WebRender through which all calls to fDeleteTextures
must be routed, and add the workaround there.
Once I've figured out what I'm doing, I'll add a section to https://github.com/servo/webrender/wiki/Driver-issues. Should it be at the bottom?
Maybe I just need to add code to WebRender through which all calls to fDeleteTextures must be routed, and add the workaround there.
That sounds good. Hopefully, it's not going to be too invasive.
Should it be at the bottom?
Yes, appending to the bottom is fine.
I can do something in WebRender. But I really think it'd be much better to do the workaround here. Please tell me why you think we shouldn't.
Reason is consistency. So far, gleam
was not a place to put API workarounds in. It's just a wrapper around GL.
We've been putting all the workarounds in WebRender, so it's hard to make a case for an exclusion here.
Thanks for letting me know. I still don't like it, but I understand your reasoning. I'll see what I can manage in WebRender.
After landing my patch for https://bugzilla.mozilla.org/show_bug.cgi?id=1666293, I discovered that it would be bypassed by all calls to
gl.delete_textures
from Rust code in the Mozilla tree. So those calls are still effected by Apple's bug, for example here. As best I can tell, this patch will clear up the problem.I'm submitting my patch here because the gleam crate, though included in the Mozilla code tree, is third party code.