This would be awesome
This functionality is indeed available in the latest nVidia driver, but I don't have it fully working yet. I can access the GPU device through EGL without an X server, create a Pbuffer, and (seemingly) render something to it, but I can't make glReadPixels() function properly, and I'm a little fuzzy on how double buffering and stereo can be implemented, as it seems like EGL doesn't support double buffered or stereo Pbuffers. Emulating double buffering and stereo using multiple single-buffered Pbuffers is certainly possible, but it would greatly increase the complexity of VirtualGL. Waiting for feedback from nVidia.
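For context, a minimal sketch of the kind of headless access being described here, assuming the EGL_EXT_device_enumeration and EGL_EXT_platform_device extensions are available (this is illustrative only, not VirtualGL code, and error handling is mostly omitted):

```cpp
// Build (assumption): g++ headless.cpp -lEGL -lGL
// Renders one frame to a Pbuffer on the first EGL device, with no X server.
#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <GL/gl.h>
#include <cstdio>

int main()
{
  // The device-platform entry points are extensions, so fetch them at run time.
  PFNEGLQUERYDEVICESEXTPROC queryDevices =
    (PFNEGLQUERYDEVICESEXTPROC)eglGetProcAddress("eglQueryDevicesEXT");
  PFNEGLGETPLATFORMDISPLAYEXTPROC getPlatformDisplay =
    (PFNEGLGETPLATFORMDISPLAYEXTPROC)eglGetProcAddress("eglGetPlatformDisplayEXT");
  if (!queryDevices || !getPlatformDisplay) {
    fprintf(stderr, "EGL device extensions unavailable\n");  return 1;
  }

  EGLDeviceEXT devices[8];  EGLint numDevices = 0;
  queryDevices(8, devices, &numDevices);
  if (numDevices < 1) { fprintf(stderr, "No EGL devices found\n");  return 1; }

  EGLDisplay dpy = getPlatformDisplay(EGL_PLATFORM_DEVICE_EXT, devices[0], NULL);
  eglInitialize(dpy, NULL, NULL);
  eglBindAPI(EGL_OPENGL_API);

  const EGLint configAttribs[] = {
    EGL_SURFACE_TYPE, EGL_PBUFFER_BIT, EGL_RENDERABLE_TYPE, EGL_OPENGL_BIT,
    EGL_RED_SIZE, 8, EGL_GREEN_SIZE, 8, EGL_BLUE_SIZE, 8, EGL_NONE
  };
  EGLConfig config;  EGLint numConfigs = 0;
  eglChooseConfig(dpy, configAttribs, &config, 1, &numConfigs);

  const EGLint pbAttribs[] = { EGL_WIDTH, 256, EGL_HEIGHT, 256, EGL_NONE };
  EGLSurface pb = eglCreatePbufferSurface(dpy, config, pbAttribs);
  EGLContext ctx = eglCreateContext(dpy, config, EGL_NO_CONTEXT, NULL);
  eglMakeCurrent(dpy, pb, pb, ctx);

  // Render something trivial and read it back.
  glClearColor(1.0f, 0.0f, 0.0f, 1.0f);
  glClear(GL_COLOR_BUFFER_BIT);
  glFinish();
  unsigned char pixel[4] = { 0 };
  glReadPixels(0, 0, 1, 1, GL_RGBA, GL_UNSIGNED_BYTE, pixel);
  printf("Read back pixel: %d %d %d %d\n", pixel[0], pixel[1], pixel[2], pixel[3]);

  eglMakeCurrent(dpy, EGL_NO_SURFACE, EGL_NO_SURFACE, EGL_NO_CONTEXT);
  eglTerminate(dpy);
  return 0;
}
```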
After discussing at length with nVidia, it appears that there are a couple of issues blocking this:
ISSUE:
SOLUTION:
ISSUE:
POSSIBLE SOLUTIONS:
Simple program to demonstrate OpenGL rendering without an X server:
git clone https://gist.github.com/dcommander/ee1247362201552b2532
Popping the stack on this old thread, because I've started re-investigating how best to accomplish this, and I've been tinkering with some code over the past few days to explore what's now possible, since it's been two years since I last visited it. AFAICT (awaiting nVidia's confirmation), the situation is still the same with respect to EGL, which is that multi-view Pbuffers don't exist. That leaves us with the quandary of how to emulate double-buffered and quad-buffered (stereo) rendering, presumably by using FBOs behind the scenes. That would require:

- Interposing glReadBuffer(), glDrawBuffer(), glDrawBuffers(), glNamedFramebufferReadBuffer(), and glNamedFramebufferDrawBuffer() (VGL already interposes glDrawBuffer()) and redirecting GL_FRONT, GL_BACK, GL_FRONT_AND_BACK, etc. to the appropriate GL_COLOR_ATTACHMENTx target (in the case of GL_FRONT_AND_BACK, this would require calling down to glDrawBuffers().) Fortunately, it appears as if it is an error to call glDrawBuffer() or glReadBuffer() with a target of GL_BACK/GL_FRONT/etc. whenever an FBO other than 0 is bound, so VirtualGL can similarly trigger an OpenGL error if those targets are used without the Drawable FBO being bound. (See the sketch below.)
- Interposing glBindFramebuffer() in order to redirect Buffer 0 to the Drawable FBO.
- Interposing glGet*() in order to return values for GL_DOUBLEBUFFER, GL_DRAW_BUFFER, GL_DRAW_BUFFERi, GL_DRAW_FRAMEBUFFER_BINDING, GL_READ_FRAMEBUFFER_BINDING, GL_READ_BUFFER, and GL_RENDERBUFFER_BINDING that make sense from the application's point of view.
- Interposing glXChooseVisual(), glXChooseFBConfig(), and similar functions, and returning VirtualGL's own internal structure pointers to the application when the application requests a GLXFBConfig. This is feasible, but it's difficult and fraught with potential compatibility issues.
- Emulating Pbuffer attributes such as GLX_PRESERVED_CONTENTS (hopefully we don't need to? Otherwise, I have no clue), GLX_MAX_PBUFFER_WIDTH and GLX_MAX_PBUFFER_HEIGHT (could map to GL_MAX_FRAMEBUFFER_WIDTH and GL_MAX_FRAMEBUFFER_HEIGHT), and GLX_LARGEST_PBUFFER.

Features that will likely have to be relegated to the legacy GLX back end only:

- glXSelectEvent(). If we have to use FBOs to emulate Pbuffers, then I'm not sure how to emulate this at all. (Bueller? Bueller?)
- GLX_EXT_import_context and indirect contexts in general. EGL has no concept of indirect contexts.
- GLX_NV_swap_group. If we have to use FBOs to emulate Pbuffers, then this extension may not be possible to emulate at all.

As you can see, this is already a potential compatibility minefield. It at least becomes a manageable minefield if we are able to retain the existing GLX Pbuffer back end and simply add an EGL Pbuffer back end to it (i.e. if a multi-view EGL Pbuffer extension is available.) That would leave open the possibility of reverting to the GLX Pbuffer back end if certain applications don't work with the EGL Pbuffer back end. However, since I can think of no sane way to use FBOs for the EGL back end without also using them for the GLX back end, if we're forced to use FBOs, essentially everything we currently know about VirtualGL's compatibility with commercial applications would have to be thrown out the window. Emulating Pbuffers with FBOs is so potentially disruptive to application compatibility that I would even entertain the notion of introducing a new interposer library just for the EGL back end, and retaining the existing interposers until the new back end can be shown to be as compatible (these new interposers could be selected in vglrun based on the value of VGL_DISPLAY.)
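A rough sketch of the glDrawBuffer() redirection idea from the first item above, assuming a simple double-buffered, non-stereo drawable; emulatedDrawFBO and attachmentForBuffer() are hypothetical names, and this is not VirtualGL's actual interposer code:

```cpp
// Hypothetical interposer fragment illustrating the redirection described above.
// Compile into a preloaded shared library; not VirtualGL's actual code.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <GL/gl.h>
#include <GL/glext.h>
#include <dlfcn.h>

// Hypothetical faker state for the current drawable: the FBO that emulates the
// default framebuffer (attachment 0 = front, attachment 1 = back; mono only).
static GLuint emulatedDrawFBO = 0;        // 0 = not emulating (pass through)

static GLenum attachmentForBuffer(GLenum buf)
{
  switch (buf) {
    case GL_FRONT: case GL_FRONT_LEFT: return GL_COLOR_ATTACHMENT0;
    case GL_BACK:  case GL_BACK_LEFT:  return GL_COLOR_ATTACHMENT1;
    default:                           return GL_NONE;
  }
}

extern "C" void glDrawBuffer(GLenum buf)
{
  // Look up the real functions the first time through.
  static void (*realDrawBuffer)(GLenum) =
    (void (*)(GLenum))dlsym(RTLD_NEXT, "glDrawBuffer");
  static void (*realDrawBuffers)(GLsizei, const GLenum *) =
    (void (*)(GLsizei, const GLenum *))dlsym(RTLD_NEXT, "glDrawBuffers");

  GLint boundFBO = 0;
  glGetIntegerv(GL_DRAW_FRAMEBUFFER_BINDING, &boundFBO);

  // Only redirect when the app thinks it is drawing to the default framebuffer,
  // i.e. when the drawable-emulation FBO is bound.
  if (emulatedDrawFBO && (GLuint)boundFBO == emulatedDrawFBO) {
    if (buf == GL_FRONT_AND_BACK) {
      const GLenum bufs[] = { GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1 };
      realDrawBuffers(2, bufs);            // multi-buffer targets need glDrawBuffers()
    } else {
      realDrawBuffer(attachmentForBuffer(buf));
    }
    return;
  }
  realDrawBuffer(buf);                     // app-created FBO: pass through unmodified
}
```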
Maybe I'm being too paranoid, but in the 13 years I've been maintaining this project, I've literally seen every inadvisable thing that an application can possibly do with OpenGL or GLX. A lot of commercial OpenGL ISVs seem to have the philosophy that, as long as their application works on the specific platforms they support, it doesn't matter if the code is brittle, non-future-proof, or if it only works by accident because the display is local and the GPU is fast. Hence my general desire to not introduce potential compatibility problems into VirtualGL. The more we try to interpose the OpenGL API, the more problems we will potentially encounter, since that API changes a lot more frequently than GLX. There is unfortunately no inexpensive way to test a GLX/OpenGL implementation for conformance problems (accessing the Khronos conformance suites requires a $30,000 fee), and whereas some of the companies reselling VirtualGL in their own products have access to a variety of commercial applications for testing, I have no such access personally.
Relabeling as "funding needed", since there is no way to pay for this project with the General Fund unless a multi-view Pbuffer extension for EGL materializes.
I'm thinking about funding this specific project. How do I do that? I'm happy to discuss offline, including the specifics around amount needed, etc. No corporate agenda other than interest in this feature and willingness to fund it (the OpenGL offload without X server). Thanks!

Leo Reiter
CTO, Nimbix, Inc.
@nimbixler please contact me offline: https://virtualgl.org/About/Contact. At the moment, it doesn't appear that nVidia is going to be able to come up with a multibuffer EGL extension, so this project is definitely doable but is likely to be costly. However, I really do think it's going to be necessary in order to move VGL forward, and this year would be a perfect time to do it.
Pushed to a later release of VirtualGL, since 2.6 beta will land this month and there is no funding currently secured for this project.
Re-tagging as "funding needed." I've completed the groundwork (Phase 1), which is now in the dev branch (with relevant changes that affect the stable branch placed in master.) However, due to budgetary constraints with the primary company that is sponsoring this, it appears that I'm going to need to split the cost of the project across multiple companies in order to make it land in 2019.
The groundwork included:

- A new unit test (servertest, which invokes frameut and fakerut with various permutations of VirtualGL settings)
- Reworking the mechanism that fakerut uses to communicate with the faker. The faker was previously sending back autotest information to fakerut using the environment, but that is not thread-safe, and it was causing sporadic crashes in fakerut's multithreaded test. The faker now exposes special functions that fakerut can load via dlsym() to obtain the autotest data, and the faker now stores that data internally using thread-local variables. (A sketch of this mechanism appears after this comment.)
- Making fakerut generally more robust across different OpenGL stacks (I am personally able to test against the nVidia proprietary driver, the old fglrx/Catalyst AMD proprietary driver, and the VMWare open source driver.)
- Fixing issues that caused fakerut to fail with the fglrx driver (basically legitimate oversights in the faker and fakerut code that the nVidia driver allowed but the fglrx driver didn't)
- Modifying fakerut to work around an issue whereby the fglrx driver creates all Pixmaps as single-buffered despite claiming that double-buffered Pixmap-friendly FB configs are available
- Fixing issues that prevented fakerut and other unit tests from completing successfully when run on a 2D X server screen other than 0
- Other miscellaneous improvements to fakerut

Remaining work includes:

- Implementing the EGL back end, which will be enabled either by passing a DRI device path to VGL_DISPLAY rather than an X display or by setting VGL_DISPLAY to egl
- Modifying vglserver_config so that it can be used to configure only the EGL mode of operation, for those who would rather not use a 3D X server (the options that vglserver_config already takes in order to modify the framebuffer device permissions will apply to EGL)

@nimbixler did you get my e-mail? We could use any funding help you can muster on this.
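A toy sketch of the dlsym()/thread-local autotest channel mentioned in the Phase 1 list above; the function names are hypothetical, and the real VirtualGL interface differs:

```cpp
// Toy illustration of the faker<->test communication pattern described above.
// Build (assumption): g++ autotest.cpp -rdynamic -ldl -pthread
#include <dlfcn.h>
#include <cstdio>
#include <thread>

// --- "Faker" side: would normally live in the interposer library ------------
extern "C" {
  // Autotest data is stored per thread, so concurrent rendering threads cannot
  // stomp on each other the way a shared environment variable can.
  static thread_local int lastFrameCount = 0;

  void fakeSwapBuffers() { ++lastFrameCount; }            // stand-in for real faker work
  int  getAutotestFrameCount() { return lastFrameCount; } // looked up via dlsym()
}

// --- "fakerut" side: look the accessor up at run time -----------------------
int main()
{
  typedef int (*GetFrames)();
  GetFrames getFrames = (GetFrames)dlsym(RTLD_DEFAULT, "getAutotestFrameCount");
  if (!getFrames) { fprintf(stderr, "dlsym: %s\n", dlerror());  return 1; }

  std::thread t1([&] { fakeSwapBuffers();  fakeSwapBuffers();
                       printf("thread 1 frames: %d\n", getFrames()); });
  std::thread t2([&] { fakeSwapBuffers();
                       printf("thread 2 frames: %d\n", getFrames()); });
  t1.join();  t2.join();
  return 0;
}
```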
This is an amazing first step. OpenGL direct rendering without an X server is an essential feature for the HPC world. Let me explain: I'm working with VirtualGL/TurboVNC/noVNC to deploy a remote visualization service on an HPC cluster using a single node for remote viz, because the other nodes are used in compute mode with CUDA and other tools. What does that mean?
If we need to run an Xorg instance for remote viz on a GPU, that GPU cannot easily be shared between compute mode and the X Window System. The user should be aware of certain limitations when handling both activities simultaneously on a single GPU: if no consideration is given to managing both sets of tasks, the system may experience disturbances and hangs in the X Window System, leading to interruptions in X-related tasks such as display updates and rendering.
As a result, the HPC world needs a separate visualization cluster, running a 3D X server on every node, just for this service. That isn't a good approach; the hardware requirements are very large. Sharing the same GPU between the X Window System and GPGPU compute mode would let both clusters be fused into a single layer. In-situ visualization also needs this approach for good performance, and sharing resources across the cluster will minimize costs.
EGL remote hardware rendering and a future WebAssembly service with H.264 coding would be a good combination.
I'm sorry for my poor English. :)
I have been looking at WebAssembly in the context of designing an in-browser TurboVNC viewer. So far, it seems to be not fully baked. I've gotten as far as building libjpeg-turbo (which requires disabling its SIMD extensions, since WASM doesn't support SIMD instructions yet) and LibVNCClient into WebAssembly code and running one of the LibVNCClient examples in a browser, but the WebAssembly sockets-to-WebSockets emulation layer doesn't work properly, and the program locks up the browser.
There is a GitHub project that tries to resolve this issue: a SIMD proposal based on SIMD.js, I think. 🤔
Regarding the EGL back end, I have currently expended hundreds of hours of labor attempting to make it work with FBOs because nVidia refused to implement a multi-view Pbuffer extension for EGL. I am almost to the point of having to declare failure, which will mean that I cannot seek compensation for a good chunk of that labor. Unfortunately, it just appears that renderbuffer objects and render textures cannot be shared among OpenGL contexts, and that makes it impossible to fully use those structures to emulate the features of an OpenGL window or other drawable. If anyone has any ideas, please post them. I'm desperate.
nVidia suggested a couple of ideas:
I'm still awaiting nVidia's response to my questions. I'm starting to lose hope, however. Most of the funding I secured for this feature was contingent upon successfully implementing it. I am currently at $13,000 worth of un-reimbursed labor on the feature, and if I can't figure out how to implement it, then I may be sunk. I don't have the ability to absorb that kind of loss right now. I normally don't engage in speculative blue-sky projects for exactly this reason, but this is also the first time I've ever encountered a hard technical roadblock like this in my 10 years of independent open source software development. I took a calculated risk that it would be possible to solve all of the problems associated with this feature, but the limitations of EGL may just make that impossible unless nVidia is willing to implement a multi-view EGL extension for Pbuffers (which, thus far, they have expressed great reluctance to do.) The other idea I initially presented in https://github.com/VirtualGL/virtualgl/issues/10#issuecomment-163030995 (using multiple Pbuffers to emulate multi-buffering) is a non-starter, since GLX allows applications to render to multiple buffers simultaneously, and that would be impossible to implement if the buffers were really drawables behind the scenes.
As I have had to implement the feature thus far, the EGL back end is already less compatible than the GLX back end, because there is no obvious way to implement:

- glXCopyContext() (rarely used, but it is part of the GLX 1.0 specification)
- GLX_EXT_import_context (also rarely used, but I know of at least one commercial 3D application that uses it)
- GLX_EXT_texture_from_pixmap (this is a big limitation, since this extension is used by compositing window managers)

Some of those may be possible to implement, but I just can't spend much more time on this. I have to at least get to proof-of-concept stage before I can even get paid for most of the work I've done thus far.
If this feature proves impossible, then that doesn't necessarily mean that VirtualGL is at a technical dead end. There are still proposed enhancements to it that would be meaningful, even with a GLX back end. However, the problem is funding. I only have one source of research funding right now, and this feature has largely exhausted it. Given the seeming impossibility of implementing Vulkan support in VirtualGL (which also, BTW, caused me to lose a potential funding source), the writing is pretty much on the wall. VirtualGL will remain useful for a certain class of application, but I also think we're probably approaching the point at which it will be necessary to implement GPU-accelerated remote display in some other way-- possibly by building TurboVNC upon Xwayland, for instance, and thus implementing hardware-accelerated OpenGL directly within the X proxy. There are probably 100 technical reasons why this wouldn't work, however, and even if it would, it is likely to require hundreds of hours of labor. There's a good chance that it would go the way of this feature, i.e. that I wouldn't discover the impassable technical roadblocks until I was hundreds of hours into the project, thus requiring me to eat five figures of labor cost again. Furthermore, such a feature would have the obvious disadvantage of requiring a particular X proxy in order to achieve GPU acceleration. On the surface, that would seemingly benefit me, since it would drive more users toward TurboVNC, but if other X proxies follow suit, then ultimately it would be a net loss for The VirtualGL Project as a whole, since I would only be receiving funded development on TurboVNC and not on both TurboVNC and VirtualGL.
If nVidia's ideas don't pan out, then I don't know much else that can be done here, short of someone putting pressure on them (and/or AMD) to implement a multi-view Pbuffer extension for EGL.
WIP checked into dev.eglbackend branch: https://github.com/VirtualGL/virtualgl/tree/dev.eglbackend
Just found https://www.khronos.org/registry/OpenGL/extensions/EXT/EXT_EGL_image_storage.txt. Will give it a try.
Unfortunately, GL_EXT_EGL_image_storage says that it requires OpenGL 4.2. That may be a show-stopper, since I can't impose that requirement upon OpenGL applications running with VirtualGL. Ugh. The other issue is that I don't think it will be possible to support multisampling with EGLImages, for reasons I described in this thread: https://devtalk.nvidia.com/default/topic/1056385/opengl/sharing-render-buffers-or-render-textures-among-multiple-opengl-contexts/post/5359805/#5359805
At the moment, I consider development on this to be stalled pending further ideas. I'm open to the possibility of using Vulkan if there is a straightforward way to do so, but I have no experience whatsoever with that API, and after extensive googling, I haven't been able to find the information I need regarding how to use Vulkan buffers as backing stores for textures or RBOs.
At the moment, it's starting to appear as if using multiple single-buffered Pbuffers may be the least painful option. Although I can foresee a variety of issues that may prevent that approach from working, I can at least figure out whether it's viable with probably a day or less of work.
Hey @dcommander you got this!
@dcommander Did you figure out if using multiple single-buffered Pbuffers works? What is the current status? Would it be possible to create a first working version which allows to run selected applications e.g. glxspheres64? Will solving this issue also help solve #98? If it would be possible to do visualisations of AI/ML/HPC applications with docker / kubernetes without requiring X11, that would be interesting for a lot of people and may help secure further funding.
I am still trying to secure enough funding to cover my labor to look into the single-buffered Pbuffer approach. (Thank you for the donation, BTW. That certainly does help, and 100% of that money will go toward the aforementioned labor.) I hope to be able to do that work within the next few weeks. I have no idea regarding #98. That is a separate issue, and I haven't had time to look into it. Since that feature isn't specifically funded, my labor to work on it will have to be compensated from the VirtualGL General Fund, which only covers 200 hours/year (shared with TurboVNC.) Since the General Fund is usually exhausted six months into the fiscal year, I have to prioritize its use, and #98 isn't a very high priority right now. My main priority with VirtualGL is to figure out the EGL back end, because if I can reach proof of concept, I can unlock additional funding (which will compensate a lot of the speculative labor I have done already) and testing resources.
The single-buffered Pbuffer approach did not pan out. For a variety of reasons, it would have proven to be a nastier solution than using FBOs, mainly because there was no clean way to implement rendering to multiple buffers simultaneously. GL_FRONT_AND_BACK may not be particularly commonplace, but depending on the buffer configuration, GL_BACK, GL_FRONT, GL_LEFT, and GL_RIGHT can also render to multiple buffers. Supporting that functionality would have required a complex, error-prone, and hard-to-maintain automatic buffer synchronization mechanism.
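To illustrate the fan-out that made this approach unattractive, here is a hypothetical helper (the attachment layout is an assumption, not VirtualGL's actual mapping) that expands a GLX-style buffer name into the set of color attachments it would touch under FBO emulation:

```cpp
// Which color attachments of an emulated drawable does a GLX-style buffer name
// refer to?  Assumed layout: 0=front-left, 1=back-left, 2=front-right,
// 3=back-right, with back/right buffers present only if the drawable is
// double-buffered/stereo.
#include <GL/gl.h>
#include <GL/glext.h>
#include <vector>
#include <cstdio>

std::vector<GLenum> attachmentsForBuffer(GLenum buf, bool doubleBuffer, bool stereo)
{
  const GLenum FL = GL_COLOR_ATTACHMENT0, BL = GL_COLOR_ATTACHMENT1,
               FR = GL_COLOR_ATTACHMENT2, BR = GL_COLOR_ATTACHMENT3;
  std::vector<GLenum> a;
  switch (buf) {
    case GL_FRONT_LEFT:  a = { FL };  break;
    case GL_BACK_LEFT:   if (doubleBuffer) a = { BL };  break;
    case GL_FRONT_RIGHT: if (stereo) a = { FR };  break;
    case GL_BACK_RIGHT:  if (doubleBuffer && stereo) a = { BR };  break;
    case GL_FRONT:       a = { FL };  if (stereo) a.push_back(FR);  break;
    case GL_BACK:        if (doubleBuffer) { a = { BL };  if (stereo) a.push_back(BR); }  break;
    case GL_LEFT:        a = { FL };  if (doubleBuffer) a.push_back(BL);  break;
    case GL_RIGHT:       if (stereo) { a = { FR };  if (doubleBuffer) a.push_back(BR); }  break;
    case GL_FRONT_AND_BACK:
      a = { FL };  if (doubleBuffer) a.push_back(BL);
      if (stereo) { a.push_back(FR);  if (doubleBuffer) a.push_back(BR); }
      break;
  }
  return a;  // more than one entry => the call must be emulated with glDrawBuffers()
}

int main()
{
  // Example: in a double-buffered stereo drawable, GL_BACK targets two buffers.
  printf("GL_BACK targets %zu attachments\n",
         attachmentsForBuffer(GL_BACK, true, true).size());
  return 0;
}
```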
Fortunately, I finally got the information I needed in order to figure out how to use Vulkan to create RBOs backed by non-context-specific GPU memory. I am proceeding down that path and applying for additional R&D funding.
Status update:
Still pursuing the idea of emulating Pbuffers using RBOs backed by Vulkan memory. Will push to the dev.eglbackend branch when I have it working well enough to run GLXspheres. I haven't had a chance to put in much work on it this month due to pressing issues with my other OSS projects.
Funding update:
Total hours spent thus far: 277.6
Estimated hours remaining to productization (slightly hopeful estimate): 60-70
Total: 337.6-347.6

Hours for which funding has already been secured: 167.8
Hours for which funding can be secured upon proof of concept: 71.4
Hours for which funding has been awarded but not yet secured (legal snafu, working on it): 100
Total: 339.2
Update: the aforementioned 100 hours of funding has finally been secured.
Update: while the funding was finally "secured", it hasn't yet been received, so that is currently holding up further development.
The funding was received. This is next in the queue, after some high-priority TurboVNC work that has been promoted to the head of the queue due to the sudden spike in demand for remote work solutions in the U.S.
The Vulkan-based Pbuffer emulator is now building successfully but isn't yet running due to an issue described here: https://forums.developer.nvidia.com/t/sharing-render-buffers-or-render-textures-among-multiple-opengl-contexts/77168/27
So would this enable hardware-accelerated TurboVNC servers without the presence of an underlying X server?
@MadcowD Referring to the diagrams here, this feature would quite simply eliminate the 3D X server and replace the GLX back end (green arrow) with an EGL back end. When used with the EGL back end, VirtualGL would become a GLX emulator rather than a GLX splitter/forwarder. It's not technically accurate to describe this as a TurboVNC feature, since TurboVNC doesn't technically require VirtualGL and vice versa.
I might have figured out how to make this work using clever manipulation of EGL context sharing. Basically, the idea is (and I've verified that this works at the low level):
I'll keep you posted regarding my progress. Fortunately, the infrastructure to test the solution above was largely already developed in the context of prior failed experiments, so hopefully I can get it prototyped within the next week or two. It's potentially messier, in terms of code, than a Vulkan-based solution would have been, but a Vulkan-based solution appears to be a non-starter because of the fact that nVidia's Vulkan implementation seems to require an X display.
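A bare-bones illustration of that context-sharing idea, assuming the same device-platform setup as the earlier sketch (error handling is omitted, and the FBO entry points are assumed to be directly linkable for brevity; a real program would load them via eglGetProcAddress()). The key point is that a renderbuffer created in the "RBO context" is visible by name in the application context, but the FBO itself is not shared:

```cpp
// Build (assumption): g++ sharedrbo.cpp -lEGL -lGL
#define GL_GLEXT_PROTOTYPES
#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <GL/gl.h>
#include <GL/glext.h>
#include <cstdio>

int main()
{
  // Headless display via the EGL device platform (see the earlier sketch).
  PFNEGLQUERYDEVICESEXTPROC queryDevices =
    (PFNEGLQUERYDEVICESEXTPROC)eglGetProcAddress("eglQueryDevicesEXT");
  PFNEGLGETPLATFORMDISPLAYEXTPROC getPlatformDisplay =
    (PFNEGLGETPLATFORMDISPLAYEXTPROC)eglGetProcAddress("eglGetPlatformDisplayEXT");
  EGLDeviceEXT dev;  EGLint n = 0;
  queryDevices(1, &dev, &n);
  if (n < 1) return 1;
  EGLDisplay dpy = getPlatformDisplay(EGL_PLATFORM_DEVICE_EXT, dev, NULL);
  eglInitialize(dpy, NULL, NULL);
  eglBindAPI(EGL_OPENGL_API);

  const EGLint cfgAttribs[] = { EGL_SURFACE_TYPE, EGL_PBUFFER_BIT,
                                EGL_RENDERABLE_TYPE, EGL_OPENGL_BIT, EGL_NONE };
  EGLConfig cfg;  EGLint nc = 0;
  eglChooseConfig(dpy, cfgAttribs, &cfg, 1, &nc);
  const EGLint pbAttribs[] = { EGL_WIDTH, 64, EGL_HEIGHT, 64, EGL_NONE };
  EGLSurface pb = eglCreatePbufferSurface(dpy, cfg, pbAttribs);

  // "RBO context" that owns the renderbuffers emulating the Pbuffer's buffers.
  EGLContext rboCtx = eglCreateContext(dpy, cfg, EGL_NO_CONTEXT, NULL);
  // Application-requested context, created with rboCtx as its share context.
  EGLContext appCtx = eglCreateContext(dpy, cfg, rboCtx, NULL);

  // Create the RBO in the RBO context.
  eglMakeCurrent(dpy, pb, pb, rboCtx);
  GLuint rbo = 0;
  glGenRenderbuffers(1, &rbo);
  glBindRenderbuffer(GL_RENDERBUFFER, rbo);
  glRenderbufferStorage(GL_RENDERBUFFER, GL_RGBA8, 64, 64);
  eglMakeCurrent(dpy, EGL_NO_SURFACE, EGL_NO_SURFACE, EGL_NO_CONTEXT);

  // In the application context, the RBO name is valid, but the FBO must be
  // created here, because framebuffer objects are not shared between contexts.
  eglMakeCurrent(dpy, pb, pb, appCtx);
  GLuint fbo = 0;
  glGenFramebuffers(1, &fbo);
  glBindFramebuffer(GL_FRAMEBUFFER, fbo);
  glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                            GL_RENDERBUFFER, rbo);
  printf("FBO status in app context: 0x%x (0x%x = complete)\n",
         glCheckFramebufferStatus(GL_FRAMEBUFFER), GL_FRAMEBUFFER_COMPLETE);

  eglMakeCurrent(dpy, EGL_NO_SURFACE, EGL_NO_SURFACE, EGL_NO_CONTEXT);
  eglTerminate(dpy);
  return 0;
}
```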
I can't seem to catch a break on this. I was making progress last week, but due to an unforeseen circumstance related to COVID-19, I have to move my office/lab over the next few days (a few weeks ahead of schedule), then I have to do my taxes for next week's deadline and fix some high-priority bugs that were just reported. I promise I'll get back to this research ASAP. I'm doing my best to keep about five balls in the air right now.
The EGL context sharing idea is implemented and builds successfully, and GLXspheres works at the GLX level with no errors. I'm currently trying to sort out the emulation of glDrawBuffer() and glReadBuffer() so that GLXspheres will work at the OpenGL level as well (i.e. so it will actually produce an image.) I feel like I'm a few hours away from that, so hopefully I'll be able to declare a proof of concept early this coming week. The next step after getting GLXspheres to work will be getting fakerut to work, then I'll push the code and let people test the pre-release build with their applications of choice.
GLXspheres is working! Lots of work left to do, but the concept seems to be solid.
fakerut is passing all the way through the stereo readback heuristics tests, which means that the concept of multi-buffered Pbuffer emulation using RBOs is resoundingly proven.
Another roadblock, unfortunately. Due to the GLX function call semantics, I was taking the approach of creating a single "RBO context" for every GLXFBConfig and sharing that RBO context with any OpenGL contexts that the 3D application requested to create with that GLXFBConfig. That allowed me to create and swap the RBOs independently of the application-requested contexts, which is necessary to properly emulate glXCreatePbuffer() and glXSwapBuffers(). Unfortunately, however, I discovered (experimentally-- I couldn't find any documentation to support this) that the RBO context has the same concurrency limitations as the application-requested contexts. That is, it can only be current in one thread at a time. Thus, I encountered a bunch of OpenGL data races when multiple threads tried to render to independent Pbuffers created with the same GLXFBConfig-- because, even though those threads had their own contexts, all of those contexts were sharing the same RBO context.
Ugh. I'm going to have to ponder how best to work around this problem. Ideas I had:

- Creating a separate RBO context for each Pbuffer and sharing it with the application-requested contexts that render to that Pbuffer. The problem is that application-requested contexts are created in glXCreate*Context*(), and we don't know at that point which drawable the application-requested context will be bound to.
- Deferring the actual creation of the application-requested context rather than creating it in glXCreate*Context*(). The application-requested context would actually be created on first use and shared with the Pbuffer-specific RBO context in the body of glXMake*Current(). However, that's also problematic, because nothing in GLX prevents an application-requested context from being bound to a completely different Pbuffer, and such would require me to somehow unshare the context with one Pbuffer's RBO context and re-share it with another Pbuffer's RBO context.

This strikes at the heart of the problem of how to emulate a non-context-specific construct using context-specific constructs. I'm going to have to either limit the EGL back end to single-threaded applications or return to the drawing board. Unfortunately, I'm now 40 hours over funding-- even including the funding that was preconditioned on a proof of concept (meaning that I haven't secured it yet.)
Ignore most of the previous comment. I am sleep-deprived and forgot that shared contexts do not share the actual rendering state. Since my implementation ensures that any access or modification of the shared RBO handles is mutexed, as is any operation involving the RBO context, it seems as if my implementation is not to blame for most of the concurrency issues. I rewrote the multithreaded rendering tests in fakerut using raw EGL, with no shared contexts, and I see the same EGL data races there. I even tried using a completely different EGLDisplay for each thread, and I still see EGL data races. They appear to be unavoidable issues in nVidia's EGL implementation. Thus, I'll try to work around them as much as possible and move forward.
Can you elaborate a bit more on the impact? Will this be a showstopper or do you think you can go ahead with releasing a preview version? Also, is the implementation only working on nVidia GPUs or should it work for other GPUs as well?
Currently only nVidia supports EGL device access. I have contacted AMD and encouraged them to support it as well.
I will still release a preview version. I'm just still experimenting to figure out how best to work around the concurrency issues.
The worst case is that the preview version will not support multithreaded OpenGL rendering at all. I'm hoping I can find a better solution than that, though.
Is multithreaded OpenGL rendering a common use case in your experience? Or is it more of an exception?

Leo
To be clear, when I say "multithreaded OpenGL rendering", I don't mean parallel rendering. I'm testing the implementation's ability to render to multiple "virtual windows" (Pbuffers) simultaneously with one OpenGL context per window and also to handle X window resize events that are initiated from a different thread than the rendering thread. I don't have a good sense of whether many applications actually do that, but those tests are mainly a measure of the stability of the implementation.
I went down this rabbit hole because the multithreading tests in fakerut were failing in sporadic ways, including:
1. eglMakeCurrent() sometimes returns EGL_FALSE (but annoyingly, eglGetError() returns EGL_SUCCESS when that happens, making it difficult to diagnose the failure.)
2. glClear() usually fails to clear one of the buffers to the correct color, which causes the rendering correctness check in TestThread::run() to fail for one or more threads.

When I refactored the multithreaded rendering tests using raw EGL and ran the tests through helgrind, I saw multiple data races in libnvidia-glsi and libEGL_nvidia, but neither of the aforementioned symptoms occurred. I was able to isolate (1) above and reproduce it consistently even with a single-threaded case, so I need to solve that problem before I can make any judgment regarding whether the fakerut issues are due to the EGL data races or something else.
Long story short: this is a quickly-evolving situation, so I'll keep you posted once I find out more.
Is there a way to test the driver in a single-threaded application right now?
Yes, the driver is just nVidia's standard driver (I'm using the latest-- 450.xx.) It installs the EGL libraries automatically.
If you mean the EGL back end I'm working on, no. I want it to pass fakerut before I push it for testing.
Well, I think I at least solved (1) (with a 1-line fix.) Turns out that EGL really does not like it if you try to bind a surface to a context in one thread without unbinding it in another thread first. That was apparently the source of the cryptic eglMakeCurrent() error. Still trying to figure out (2).
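For anyone who hits the same wall, here is a stripped-down illustration of the binding rule behind that 1-line fix (using eglGetDisplay(EGL_DEFAULT_DISPLAY) for brevity; a headless program would use the device platform as in the earlier sketches):

```cpp
// Build (assumption): g++ rebind.cpp -lEGL -pthread
#include <EGL/egl.h>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <cstdio>

int main()
{
  EGLDisplay dpy = eglGetDisplay(EGL_DEFAULT_DISPLAY);
  eglInitialize(dpy, nullptr, nullptr);
  eglBindAPI(EGL_OPENGL_API);
  const EGLint cfgAttribs[] = { EGL_SURFACE_TYPE, EGL_PBUFFER_BIT,
                                EGL_RENDERABLE_TYPE, EGL_OPENGL_BIT, EGL_NONE };
  EGLConfig cfg;  EGLint n = 0;
  eglChooseConfig(dpy, cfgAttribs, &cfg, 1, &n);
  const EGLint pbAttribs[] = { EGL_WIDTH, 64, EGL_HEIGHT, 64, EGL_NONE };
  EGLSurface pb = eglCreatePbufferSurface(dpy, cfg, pbAttribs);
  EGLContext ctxA = eglCreateContext(dpy, cfg, EGL_NO_CONTEXT, nullptr);
  EGLContext ctxB = eglCreateContext(dpy, cfg, EGL_NO_CONTEXT, nullptr);

  std::mutex m;  std::condition_variable cv;  bool released = false;

  std::thread a([&] {
    eglMakeCurrent(dpy, pb, pb, ctxA);
    // ... render ...
    // Release the surface before another thread binds it.  Skipping this
    // release is the kind of mistake that produced the cryptic failure
    // described above.
    eglMakeCurrent(dpy, EGL_NO_SURFACE, EGL_NO_SURFACE, EGL_NO_CONTEXT);
    { std::lock_guard<std::mutex> l(m);  released = true; }
    cv.notify_one();
  });

  std::thread b([&] {
    std::unique_lock<std::mutex> l(m);
    cv.wait(l, [&] { return released; });
    if (!eglMakeCurrent(dpy, pb, pb, ctxB))
      fprintf(stderr, "eglMakeCurrent() failed in thread B\n");
    eglMakeCurrent(dpy, EGL_NO_SURFACE, EGL_NO_SURFACE, EGL_NO_CONTEXT);
  });

  a.join();  b.join();
  eglTerminate(dpy);
  return 0;
}
```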
The news is better. (2) was a two-pronged bug, and I've managed to fix one prong (a bug in the mapping of external read and draw buffer IDs to RBOs in the EGL back end's emulated version of glXMakeContextCurrent().) Still investigating the other prong.
All concurrency issues fixed! Apparently the races in nVidia's EGL implementation were innocuous. The second prong of (2) was a bug in the EGL back end's emulated version of glXSwapBuffers(). Proceeding with code cleanup and review.
The EGL back end has been pushed to the dev branch and is now available in the dev/3.0 evolving pre-release build.
Care and feeding notes:
- Re-run vglserver_config, even on existing VirtualGL servers. Select the appropriate option depending on whether you want to use both the GLX and EGL back ends or just the EGL back end.
- The EGL back end can be disabled at build time by passing -DVGL_EGLBACKEND=0 to CMake.
- To use the EGL back end, use the VGL_DISPLAY environment variable or the -d argument to vglrun to specify a DRI device path, e.g. /dev/dri/card0.
- In verbose mode (VGL_VERBOSE=1 or vglrun +v), VGL will print "Opening EGL device {device path}" to the console. This is a convenient way to verify that the EGL back end is in use. You can also temporarily stop the 3D X server if you want to be really sure.

Testing I've performed:

- Running the unit tests under valgrind (--leak-check=full) and helgrind (thread safety checking), and running fakerut -nocopycontext with both the GLX and EGL back ends on my nVidia machine

Things that don't work yet:

- glXCopyContext(). I think this can straightforwardly be made to work by borrowing some code from Mesa.
- GLX_EXT_texture_from_pixmap-- I need to look into this one, but at first glance, it should be possible. It will probably just require some method of transferring pixels between an EGL Pbuffer surface and a Pixmap on the 2D X server.

Things that won't work:

- OpenCL/OpenGL interoperability (CL_EGL_DISPLAY_KHR)

Refer to the commit log for other notes.
At this point, I have spent approximately 100 hours more than there is available funding for. Many thanks to all who have donated and sponsored this feature thus far. If you have use for this feature and have not donated, please consider doing so. I am obligated to finish the feature on behalf of those who have sponsored it thus far, but I wasn't anticipating having to eat that much labor cost. That overage is due to numerous false starts, including being sent down the garden path vis-a-vis Vulkan (which couldn't work due to the fact that nVidia's implementation requires an X server) and numerous issues I encountered in the process of implementing the feature (including all of the aforementioned concurrency issues-- did I mention that emulating double-buffered and quad-buffered Pbuffers using FBOs is frickin' hard?!)
The good news is that this code is beyond proof-of-concept quality at this point. It's basically beta-quality, minus the two missing features and minus documentation.
That's really great news!
I tested glxgears on my laptop and it worked.
On a server, I got an error though:
$ DISPLAY=:3 vglrun +v -d /dev/dri/card4 glxgears -info
[VGL] Shared memory segment ID for vglconfig: 28901396
[VGL] VirtualGL v2.6.80 64-bit (Build 20200826)
[VGL] Opening EGL device /dev/dri/card4
[VGL] WARNING: Could not set WM_DELETE_WINDOW on window 0x00200002
GL_RENDERER = GeForce GTX 1080 Ti/PCIe/SSE2
GL_VERSION = OpenGL ES 1.1 NVIDIA 418.74
GL_VENDOR = NVIDIA Corporation
GL_EXTENSIONS = GL_EXT_debug_label GL_EXT_map_buffer_range GL_EXT_robustness GL_EXT_texture_compression_dxt1 GL_EXT_texture_compression_s3tc GL_EXT_texture_format_BGRA8888 GL_KHR_debug GL_EXT_memory_object GL_EXT_memory_object_fd GL_EXT_semaphore GL_EXT_semaphore_fd GL_NV_memory_attachment GL_NV_texture_compression_s3tc GL_OES_compressed_ETC1_RGB8_texture GL_EXT_compressed_ETC1_RGB8_sub_texture GL_OES_compressed_paletted_texture GL_OES_draw_texture GL_OES_EGL_image GL_OES_EGL_image_external GL_OES_EGL_sync GL_OES_element_index_uint GL_OES_extended_matrix_palette GL_OES_fbo_render_mipmap GL_OES_framebuffer_object GL_OES_matrix_get GL_OES_matrix_palette GL_OES_packed_depth_stencil GL_OES_point_size_array GL_OES_point_sprite GL_OES_rgb8_rgba8 GL_OES_read_format GL_OES_stencil8 GL_OES_texture_cube_map GL_OES_texture_npot GL_OES_vertex_half_float
VisualID 33, 0x21
[VGL] ERROR: in readPixels--
[VGL] 346: GL_ARB_pixel_buffer_object extension not available
$ ll /dev/dri/
total 0
drwxr-xr-x 2 root root 240 May 26 10:31 ./
drwxr-xr-x 20 root root 3940 Aug 25 12:00 ../
crw-rw---- 1 root users 226, 0 May 26 10:31 card0
crw-rw---- 1 root users 226, 1 May 26 10:31 card1
crw-rw---- 1 root users 226, 2 May 26 10:31 card2
crw-rw---- 1 root users 226, 3 May 26 10:31 card3
crw-rw---- 1 root users 226, 4 May 26 10:31 card4
crw-rw---- 1 root users 226, 64 May 26 10:31 controlD64
crw-rw---- 1 root users 226, 128 May 26 10:31 renderD128
crw-rw---- 1 root users 226, 129 May 26 10:31 renderD129
crw-rw---- 1 root users 226, 130 May 26 10:31 renderD130
crw-rw---- 1 root users 226, 131 May 26 10:31 renderD131
I did not run vglserver_config on the server, though, after updating VGL. But as I looked into the commit that added EGL, I got the impression that the only thing that was added to the config script was adding write permissions to the DRI devices, which we already have set up. Is there something else that needs to be set?
There are spurious rumors that this either already is possible or will be possible soon with the nVidia drivers, by using EGL, but it is unclear exactly how (the Khronos EGL headers still seem to indicate that Xlib is required when using EGL on Un*x.) As soon as it is possible to do this, it would be a great enhancement for VirtualGL, since it would eliminate the need for a running X server on the server machine. I already know basically how to make such a system work in VirtualGL, because Sun used to have a proprietary API (GLP) that allowed us to accomplish the same thing on SPARC. Even as early as 2007, we identified EGL as a possible replacement for GLP, but Linux driver support for it has only recently become available, and even where it is available, EGL still seems to be tied to X11 on Un*x systems. It is assumed that, eventually, that will have to change in order to support Wayland.