termux / termux-x11

Termux X11 add-on application.
https://termux.dev
GNU General Public License v3.0

OpenXR - Basic integration for Meta Quest #577

Closed lvonasek closed 1 month ago

lvonasek commented 5 months ago

Introduction

This PR adds OpenXR support for Meta Quest (2, 3, Pro). Using the smartphone Android version in the headset is very hard due to missing controller support and the relative mouse pointer. The intent of this PR is to add full controller support and render the screen using OpenXR (no stereo/6DoF).

How does it work

The code detects whether it is running on a Meta/Oculus device, and OpenXR is initialized only if the detection indicates an XR headset. That means the same APK will be usable on both mobile and XR. This is possible thanks to hybrid app support (hybrid app == part of the app can be 2D and another part XR).
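The detection could be sketched roughly like this (a hypothetical helper, not the PR's actual code; a real implementation would feed in `android.os.Build.MANUFACTURER`):

```java
public class XrDeviceCheck {
    // Hypothetical helper: decide whether to initialize OpenXR based on the
    // device manufacturer string. On Android this would come from
    // Build.MANUFACTURER; here it is a plain parameter so the logic is testable.
    public static boolean isXrHeadset(String manufacturer) {
        if (manufacturer == null) return false;
        String m = manufacturer.toLowerCase();
        return m.contains("oculus") || m.contains("meta");
    }

    public static void main(String[] args) {
        System.out.println(isXrHeadset("Oculus"));  // prints true
        System.out.println(isXrHeadset("samsung")); // prints false
    }
}
```

Because the check is a runtime one, the same APK can take the plain 2D path on phones and the OpenXR path on a headset.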

Instead of drawing on a screen, the output is rendered into an OpenGL framebuffer which is then presented as a flat screen in XR space. The mouse cursor is still relative, but it is mapped onto controller translation, which works well even in games. Controller buttons are mapped to the most common game keys.
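The controller-to-relative-pointer mapping might look something like this pure-math sketch (the scale factor and class/method names are invented for illustration, not taken from the PR):

```java
public class PointerMapper {
    // Hypothetical sketch: turn changes in controller orientation (yaw/pitch
    // in degrees) into relative mouse deltas, the way the PR maps the relative
    // pointer onto controller motion. The scale value is made up.
    private static final float PIXELS_PER_DEGREE = 25f;
    private float lastYaw, lastPitch;
    private boolean hasLast;

    // Returns {dx, dy} in pixels for the latest controller pose sample.
    public int[] update(float yaw, float pitch) {
        if (!hasLast) {
            lastYaw = yaw; lastPitch = pitch; hasLast = true;
            return new int[]{0, 0}; // first sample only initializes state
        }
        int dx = Math.round((yaw - lastYaw) * PIXELS_PER_DEGREE);
        int dy = Math.round((lastPitch - pitch) * PIXELS_PER_DEGREE); // screen Y grows downward
        lastYaw = yaw; lastPitch = pitch;
        return new int[]{dx, dy};
    }

    public static void main(String[] args) {
        PointerMapper m = new PointerMapper();
        m.update(0f, 0f);           // initialize
        int[] d = m.update(2f, 0f); // controller turned 2 degrees
        System.out.println(d[0] + "," + d[1]); // prints 50,0
    }
}
```

Because only deltas are emitted, the pointer stays relative, which is why this approach keeps working inside games that grab the mouse.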

Notes

beef-ox commented 5 months ago

Can I help you with this project?

I am a programmer with very little Java experience, but I am a Linux power user and work as a Linux sysadmin; I also have a Quest 2

I also wanted to ask/suggest possibly leveraging wxrd (https://gitlab.freedesktop.org/xrdesktop/wxrd), a lightweight Linux VR compositor based on wlroots, or alternatively xrdesktop-gnome-shell or xrdesktop-kdeplasma. Unsure if this helps or hinders your development, but these compositors provide a 3D VR environment with free-floating app windows and controller support.

lvonasek commented 5 months ago

Can I help you with this project?

Let me get the basic integration working first. Currently it is just a black screen and does nothing. I will ping you once I have something working that could be improved.

I also wanted to ask/suggest possibly leveraging wxrd (https://gitlab.freedesktop.org/xrdesktop/wxrd), a lightweight Linux VR compositor based on wlroots, or alternatively xrdesktop-gnome-shell, or xrdesktop-kdeplasma. Unsure if this helps or hinders your development, but these compositors provide a 3D VR environment with free-floating app windows.

I didn't know about xrdesktop; it looks pretty wild. It would be quite a ride to get it working on a standalone headset. I imagine it would be quite challenging to make that work on Quest, but I might be wrong.

twaik commented 5 months ago

I also wanted to ask/suggest possibly leveraging wxrd (https://gitlab.freedesktop.org/xrdesktop/wxrd), a lightweight Linux VR compositor based on wlroots

Termux:X11 is not related to Wayland.

beef-ox commented 5 months ago

@twaik

I also wanted to ask/suggest possibly leveraging wxrd (https://gitlab.freedesktop.org/xrdesktop/wxrd), a lightweight Linux VR compositor based on wlroots

Termux:X11 is not related to Wayland.

It was my understanding that Termux:X11 is an Xwayland session. Weston reportedly works, and that is Wayland-based.

@lvonasek Further projects that may prove useful testing these efforts; mesa-zink with turnip and/or virglrenderer are termux-viable projects which enable hardware 3D acceleration

The xrdesktop project also has GNOME- and KDE-specific builds that are X11-based (https://gitlab.freedesktop.org/xrdesktop). The wxrd window manager was created to have an extremely small footprint. In all the aforementioned cases, xrdesktop is the underlying platform, which already has movement tracking and controller support. (I'm hoping the "Direct Input" option for touchscreen passthrough could pass the controllers and head tracking to Monado without much trouble.)

lvonasek commented 5 months ago

@beef-ox It is nice to see there are so many opportunities. But until I have the basic integration working, I won't distract myself with other possibilities. The key to success is to take small steps and do them properly.

twaik commented 5 months ago

It was my understanding that Termux:X11 is an Xwayland session

That was true for the first few years. Termux:X11 implemented a small subset of the Wayland protocol only to make it possible to run Xwayland. But at least a year ago the project dropped it because of architectural restrictions.

Weston reportedly works, and that is Wayland-based.

Weston works on top of an X11 session. It does not need a Wayland session to work; it starts its own Wayland session.

The wxrd window manager was created to have an extremely small footprint

wxrd requires wlroots, which requires GLES with some extensions that cannot be relied on on Android. Android vendors do not implement support for these extensions, and even if they did, they are not part of the SDK/NDK and not guaranteed to work. It is a no-go.

Hoping "Direct Input" option for touchscreen passthrough could work to pass the controllers and head tracking to Monado without much trouble

You have illusions about how that works. It is implemented only for the touchscreen and passes only touchscreen events.

twaik commented 4 months ago

Termux:X11 does not use C++ to keep the APK size as small as possible. Currently I do not intend to merge C++ code, only C.

lvonasek commented 4 months ago

Termux:X11 does not use C++ to keep the APK size as small as possible. Currently I do not intend to merge C++ code, only C.

Ok, good to know. I will move the XR code to C.

twaik commented 4 months ago

There are a few more things:

  1. You are using GLES3. Currently the renderer uses GLES2, and I want to avoid mixing GLES versions in one project.
  2. It seems like you are considering using a swapchain, which will mean blitting the image from one buffer to another. This solution will have lower performance than the main code. Do you receive a Surface or SurfaceFrame from OpenXR which will be used to draw on? I think the best solution would be taking this Surface or SurfaceFrame and passing it directly to the X server. But I am not sure how exactly this works.
  3. I can add support for a physical gamepad/controller/joystick, but I do not have a device (gamepad) to test with. I do not play games that require one. But I can buy one in case anyone sends funds for it (yeah, I bought one from AliExpress, but it was a piece of ■■■■ and my devices did not even recognize its events correctly).

lvonasek commented 4 months ago

  1. You are using GLES3. Currently the renderer uses GLES2, and I want to avoid mixing GLES versions in one project.

I believe I can move to GLES2 completely. GLES3 would only be needed if I used stereoscopic rendering via the multiview extension.

  2. It seems like you are considering using a swapchain, which will mean blitting the image from one buffer to another. This solution will have lower performance than the main code. Do you receive a Surface or SurfaceFrame from OpenXR which will be used to draw on? I think the best solution would be taking this Surface or SurfaceFrame and passing it directly to the X server. But I am not sure how exactly this works.

The swapchain is required by OpenXR. The only way to render in OpenXR is to render into a texture and then let the headset reproject it. This architecture is very helpful in VR, as you get a fluent experience even when rendering at a lower framerate than the headset's refresh rate. In 2D rendering it doesn't bring much benefit, but it still has to be used.

  3. I can add support for a physical gamepad/controller/joystick, but I do not have a device (gamepad) to test with. I do not play games that require one. But I can buy one in case anyone sends funds for it (yeah, I bought one from AliExpress, but it was a piece of ■■■■ and my devices did not even recognize its events correctly).

I would like to avoid mapping the Meta Quest Touch controller thumbsticks to a joystick. The thumbsticks develop extreme noise after some time. In other XR projects I check if the stick is at 70% to the right, and if so, I send a right arrow key event. But of course we could make it optional at some point.
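The 70% threshold scheme could be sketched like this (a hypothetical mapper; the axis conventions and the horizontal-wins tie-breaking rule are assumptions):

```java
public class ThumbstickMapper {
    public enum Key { NONE, LEFT, RIGHT, UP, DOWN }

    // Hypothetical sketch of the scheme described above: instead of exposing
    // the noisy analog axis, emit a discrete arrow-key event once the stick
    // passes 70% of its range in one direction.
    private static final float THRESHOLD = 0.7f;

    public static Key map(float x, float y) {
        // Horizontal axis wins when both exceed the threshold (arbitrary choice).
        if (x >= THRESHOLD) return Key.RIGHT;
        if (x <= -THRESHOLD) return Key.LEFT;
        if (y >= THRESHOLD) return Key.UP;
        if (y <= -THRESHOLD) return Key.DOWN;
        return Key.NONE; // inside the dead zone: drift and noise are ignored
    }

    public static void main(String[] args) {
        System.out.println(map(0.9f, 0.1f)); // prints RIGHT
        System.out.println(map(0.2f, 0.1f)); // prints NONE
    }
}
```

The large dead zone is exactly what makes worn, noisy thumbsticks usable: small spurious deflections never cross the threshold.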

twaik commented 4 months ago

@lvonasek I am not really sure how exactly it works. How exactly are you intending to extract frames in the activity process? Currently LorieView (which works in the context of MainActivity) does not output anything to a Surface. It simply passes this Surface to the X server process via Binder, and the X server (which works in com.termux's application sandbox) does all the magic. Of course you can use SurfaceTexture for this, but this solution will use more resources because the X root window will be rendered one more time.

lvonasek commented 4 months ago

First, I need to figure out how the rendering in this project works.

Ideally, I would call glBindFramebuffer (binding my XrFramebuffer) and render the frame into it using OpenGL. That way the frame ends up in OpenXR. In OpenXR, I'll specify that I want to render it on a plane in 3D space.

It is a work in progress and I am new to this repo; please be patient if I commit or say something stupid.

twaik commented 4 months ago

First, I need to figure out how the rendering in this project works.

I explained where exactly the Surface is being used, so I can explain the rendering process too. The renderer is pretty simple. Right after initializing itself, the X server starts renderer initialisation. The renderer gets jmethodIDs of some Surface-related functions, prepares an EGLDisplay and a primitive EGLContext, does some checks (to determine whether the device supports sampling AHardwareBuffer in RGBA and BGRA formats) and creates an AHardwareBuffer for the root window (if it is supported). After initialisation the X server waits for an Activity connection. When an activity connects, it sends the Surface and related data. The renderer initialises a new EGLContext based on this Surface (in ANativeWindow shape), creates a shader and textures for the root window and cursor, and lets the X server draw there. When the server wants to draw the screen, or the cursor is changed/moved, the renderer uses the shader to draw both the root window and cursor textures on the current EGLSurface and invokes eglSwapBuffers. If the device supports sampling AHardwareBuffer of the required type, the root window texture is created with eglGetNativeClientBufferANDROID + eglCreateImageKHR + glEGLImageTargetTexture2DOES; otherwise it is created with a simple glTexImage2D and updated with glTexSubImage2D. The cursor texture is updated with glTexImage2D because I have not met animated hi-res cursors. But that can be fixed.

Actually this process is pretty simple. You could reimplement the whole thing in pure Vulkan and integrate it into your OpenXR-related code.

But I am not sure why the OpenXR context is initialized with a JavaVM and a global reference to the Activity. So I am not sure if it can run completely in the X server process. I think I will understand it better if you elaborate on how exactly that works.

lvonasek commented 4 months ago

I explained where exactly the Surface is being used, so I can explain the rendering process too. ...

Thank you, this is very helpful.

But I am not sure why the OpenXR context is initialized with a JavaVM and a global reference to the Activity. So I am not sure if it can run completely in the X server process. I think I will understand it better if you elaborate on how exactly that works.

For the JavaVM I found no info anywhere on why it is required. The activity itself is needed for the app lifecycle (listening to onWindowFocusChanged, onPause and onResume events). I will try to elaborate, but I am really not good at explaining:

AR/VR headsets have two app modes: 2D (Android apps floating in 3D space) and immersive OpenXR mode. In immersive mode the app cannot render anything using the Android API; the only way to show something on screen is OpenGL/Vulkan. Meta recently added support for hybrid apps, where you can switch between a 2D and an XR activity.

I added hybrid app support in this PR and trigger the OpenXR runtime only if the app is running on a headset. The final APK will run on regular Android and on XR headset(s). Currently it is under construction, but in the future I would like to start XR only if the X server is running (currently there is no way in the headset to go into the preferences or open the help page).

beef-ox commented 4 months ago

@lvonasek

With all due respect, I would rather not lose the ability to render as a 2D app in the Quest's home launcher when the X server is displaying 2D content. There should be no need for that.

X11 is a very important and well-understood protocol. If you want to implement Quest support, I don't think you should be creating a custom, made-by-you 3D environment to reproject onto, which all further users of Termux:X11 will then be forced into using, over the Quest's multitasking launcher, which lets you have three 2D apps side by side; perfect for my programming workflow, for example (and many others').

The goal of Tx11 should be to implement as much of the X11 client protocol as possible, and as close to spec in all respects as possible. The decision whether to attempt 2D mode vs immersive mode should not depend on which device it is on, but on whether the X server is attempting to display OpenXR content AND the hardware supports it.

I 100% agree: if the Linux environment is trying to output stereoscopic content over X11, this should indeed be displayed in immersive mode, but if not, it should be displayed as a 2D app window. Ideally, this could work like fullscreen, where the rendering pipeline is direct vs going through a compositor: 2D content displays in a traditional desktop "display" as a 2D app within a WM/DE, but attempting to display XR content would switch to immersive mode to display that content.

lvonasek commented 4 months ago

With all due respect, I would rather not lose the ability to render as a 2D app in the Quest's home launcher when the X server is displaying 2D content. There should be no need for that.

I will definitely try to make that optional.

lvonasek commented 4 months ago

I still didn't manage to render the frames into the XrFramebuffer. I hoped to reuse the legacy OpenGL rendering, but it didn't work for me.

@twaik, is there any chance you could look into it?

If you change XrActivity.java#L66 to return true, you can run it on a phone (rendering the frame into a small blue square). The render code should be called from here: XrActivity.java#L151.

twaik commented 4 months ago

And that is the problem I mentioned before. Rendering is done in a different process, which runs in a different application sandbox. MainActivity itself is not involved in the rendering process; it simply passes the Surface to another process (which is actually the X server).

twaik commented 4 months ago

And that is the reason why I asked if XR can work without an Activity or Context: to make it work in the X server process.

lvonasek commented 4 months ago

Now I finally see the problem.

XR cannot work without an Activity and Context. They need to be there when creating the XrSession and XrInstance.

To make it work in the X server process, it would need to be split between two processes, which I am afraid of.

twaik commented 4 months ago

Probably in this case you should use SurfaceTexture.

twaik commented 4 months ago

Getting bitmap data is not very wise, since you would copy already-rendered data from GPU memory to CPU memory and then copy it back. It is a waste of resources.

Look at this code. https://github.com/termux/termux-x11/blob/bcb27e3048bed03983b3fa46cc88e43a2e984542/app/src/main/java/com/termux/x11/MainActivity.java#L208-L222

You can omit mInputHandler lines, but the rest is exactly what you need.

But instead of passing the real Surface, you should pass one backed by a SurfaceTexture. SurfaceTexture is designed to serve as a Surface for a remote process (the X server in this case), but in your process (where XrActivity runs) you will be able to bind the SurfaceTexture to a GLES texture and draw it where and how you need. This lets you avoid unnecessary copying from GPU to CPU and back, so it will be more efficient.

lvonasek commented 4 months ago

Awesome, thank you. I will implement it during the week.

lvonasek commented 4 months ago

I am afraid it won't work with SurfaceTexture. It has nothing to do with the Surface object. I tried to use a hardware Bitmap to avoid copying to the CPU, but PixelCopy doesn't support copying pixels into a hardware Bitmap.

@BiatuAutMiahn and @beef-ox feel free to give it a try.

twaik commented 4 months ago

I am afraid it won't work with SurfaceTexture. It has nothing to do with the Surface object.

Why? You work with the SurfaceTexture in one process (with the Activity), attaching it to a GL texture with updateTexImage(). And you create a Surface with Surface(android.graphics.SurfaceTexture) and pass it to the X server process, instead of the regular Surface created from a SurfaceView. It is a pretty easy and well-known pattern.
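On the Activity side, the pattern might look roughly like this untested Android sketch (class and method names are placeholders; the actual Binder handoff in Termux:X11 differs):

```java
// Android-side sketch (not runnable outside an app): receive frames from the
// X server process through a SurfaceTexture and sample them as a GLES texture.
import android.graphics.SurfaceTexture;
import android.view.Surface;
import static android.opengl.GLES11Ext.GL_TEXTURE_EXTERNAL_OES;
import static android.opengl.GLES20.glBindTexture;
import static android.opengl.GLES20.glGenTextures;

class XrSurfaceBridge {
    private SurfaceTexture surfaceTexture;
    private Surface surface;
    private int texId;

    // Must be called on the thread that owns the GL context.
    void init() {
        int[] tex = new int[1];
        glGenTextures(1, tex, 0);
        texId = tex[0];
        glBindTexture(GL_TEXTURE_EXTERNAL_OES, texId);
        // The SurfaceTexture consumes frames produced by the remote process...
        surfaceTexture = new SurfaceTexture(texId);
        // ...and this Surface is what gets passed to the X server over Binder
        // in place of the SurfaceView's Surface.
        surface = new Surface(surfaceTexture);
    }

    // Called once per XR frame, on the GL thread.
    void drawLatestFrame() {
        surfaceTexture.updateTexImage(); // latch the newest root window frame
        // ...bind texId with a samplerExternalOES shader and draw the quad
        // into the OpenXR swapchain image here...
    }

    Surface getSurfaceForXServer() { return surface; }
}
```

The key property of the pattern is that frames never leave the GPU: the X server renders into the Surface, and updateTexImage() makes the latest frame available as an external texture in the XR process's GL context.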

twaik commented 4 months ago

It works already in the headset with acceptable framerate

acceptable framerate != efficient.

Audio doesn't work in XR (it works only when Termux is focused)

https://github.com/termux/termux-app/issues/3903

Missing mouse smoothing makes double-click almost impossible

I had this problem in termux-x11 before. You should just detect when the user presses the click button twice and send two consecutive click events with no mouse move in between. Also you should have a double-click threshold so you don't send three clicks. That is somewhere in the input handling code.
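A minimal sketch of that idea (the threshold value and coordinate handling are assumptions, not the project's actual input code):

```java
public class DoubleClickFilter {
    // Hypothetical sketch of the suggestion above: when two clicks arrive
    // within the threshold, suppress the pointer motion between them so the
    // X client sees a clean double click at the first click's position.
    private static final long DOUBLE_CLICK_MS = 300;
    private long lastClickTime = -10_000;
    private int lastX, lastY;
    private int clickCount;

    // Returns the coordinates the click should be delivered at.
    public int[] onClick(long timeMs, int x, int y) {
        if (timeMs - lastClickTime <= DOUBLE_CLICK_MS && clickCount == 1) {
            clickCount = 2;           // second click: pin it to the first one
            lastClickTime = timeMs;
            return new int[]{lastX, lastY};
        }
        clickCount = 1;               // first click (or a third one: start
        lastClickTime = timeMs;       // over, so we never synthesize a
        lastX = x; lastY = y;         // triple click)
        return new int[]{x, y};
    }

    public static void main(String[] args) {
        DoubleClickFilter f = new DoubleClickFilter();
        f.onClick(0, 100, 100);
        int[] p = f.onClick(200, 104, 98); // jittery second click, 200 ms later
        System.out.println(p[0] + "," + p[1]); // prints 100,100
    }
}
```

Snapping the second click to the first click's coordinates is what defeats the controller jitter: X sees two clicks at the same position within the double-click interval.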

Virtual keyboard support isn't implemented

Virtual keyboard for VR?

lvonasek commented 4 months ago

Why? You work with the SurfaceTexture in one process (with the Activity), attaching it to a GL texture with updateTexImage(). And you create a Surface with Surface(android.graphics.SurfaceTexture) and pass it to the X server process, instead of the regular Surface created from a SurfaceView. It is a pretty easy and well-known pattern.

Thank you, I will give it a try. I am not so familiar with working with Surfaces.

Virtual keyboard for VR?

Yes, it works fine without XR, but with XR inputMethodManager.toggleSoftInputFromWindow doesn't work (mostly).

https://github.com/termux/termux-app/issues/3903

Meta constantly adds & removes background audio support from the OS. For my headset it is currently gone.

twaik commented 4 months ago

I am not sure if it works, but it is the first link Google gave me: [image]

I am not sure what the hell is going on with Meta, but I think it is wrong. What about all those music players? They must work in the background; that is part of their purpose.

twaik commented 4 months ago

Oh, right, it is Unity. Wth...

twaik commented 4 months ago

I am pretty sure you can inline some IME code into the app, e.g. from Hacker's Keyboard or something. But I am not sure I would accept it into master. It may exist as a separate version, though. Or maybe I can make it a build flavor, so it will be available as a separate application sharing code with the main Termux:X11 application.

lvonasek commented 4 months ago

The keyboard could be added; it just needs an EditText component and passing the text along using a TextWatcher.

Background audio on Quest is a sad story. There was a hack to enable it. Afterwards Meta added it as an experimental feature, and now it's gone.

BiatuAutMiahn commented 4 months ago

I am pretty sure you can inline some IME code into the app, e.g. from Hacker's Keyboard or something. But I am not sure I would accept it into master. It may exist as a separate version, though. Or maybe I can make it a build flavor, so it will be available as a separate application sharing code with the main Termux:X11 application.

I haven't seen anything better than Hacker's Keyboard, why not go with that?

lvonasek commented 4 months ago

@BiatuAutMiahn Third-party keyboards are not supported on QuestOS. Adding one into the OpenXR environment would be a nightmare. I will go with the system software keyboard.

@twaik The SurfaceTexture works great. Thank you for pointing out how to add it.

twaik commented 4 months ago

Do you have any screen recordings? I am not sure what that is called in the XR world.

twaik commented 4 months ago

Third-party keyboards are not supported on QuestOS

I did not talk about integrating it into QuestOS. I suggested you create a separate XR surface at some position and draw the keyboard there.

lvonasek commented 4 months ago

Do you have any screen recordings? I am not sure how it is called in XR world.

This is from Winlator, but it is almost the same (I only use a flat screen instead of a curved screen): https://www.youtube.com/watch?v=eM1jLcA53ZY (that's the AR mode)

Just like in Winlator, I added the same feature set: you can enable a kind of VR mode where the mouse is mapped to head movement. Someone managed to install ReShade and render stereoscopic frames side by side. I added mapping for the left and right eye, so people can enjoy games in real 3D: https://www.youtube.com/watch?v=HK1DYmSui6o

I did not talk about integrating it into QuestOS. I suggested you create a separate XR surface at some position and draw the keyboard there.

That can be done, but as it is more complex (calculating button raycasting, passing the events back into the views, CPU rendering into a texture, etc.), I will go with the system keyboard in this PR.

BiatuAutMiahn commented 4 months ago

Any idea what is required to have a flat app like the native Quest apps but with stereoscopy? (e.g. Store, TV) I noticed that the TV app shows stereo thumbnails in the main interface, and I've yet to see anything that actually does that without invoking full immersive XR/AR. I ask because use cases such as this would be ideal, imo.

lvonasek commented 4 months ago

Any idea what is required to have a flat app like the native Quest apps but with stereoscopy? (e.g. Store, TV) I noticed that the TV app shows stereo thumbnails in the main interface, and I've yet to see anything that actually does that without invoking full immersive XR/AR. I ask because use cases such as this would be ideal, imo.

That API isn't exposed so it cannot be used in non-system apps.

twaik commented 4 months ago

That API isn't exposed so it cannot be used in non-system apps.

Well, termux-x11 already uses some non-public APIs.

lvonasek commented 4 months ago

That API isn't exposed so it cannot be used in non-system apps.

Well, termux-x11 already uses some non-public APIs.

I am afraid the Meta Quest System UI APIs are way too restricted to be used in any non-Meta app.

lvonasek commented 4 months ago

I added the virtual keyboard. With Oculess the background audio works.

Now it has feature parity with Winlator. Switching between modes currently uses controller keys. If it gets merged into master, I will do a follow-up PR to add a nice panel for customization.

BiatuAutMiahn commented 4 months ago

Works well so far; however, I notice that when something crashes and I need to restart Termux:X11, I have to kill both the X server and Termux:X11, otherwise inputs don't register. Intermittently the cursor will function, but no touch events are passed.

lvonasek commented 4 months ago

Works well so far; however, I notice that when something crashes and I need to restart Termux:X11, I have to kill both the X server and Termux:X11, otherwise inputs don't register. Intermittently the cursor will function, but no touch events are passed.

When it happens, could you provide a stacktrace/error from logcat? I use it mostly with xfce4, and that runs pretty stably.

BiatuAutMiahn commented 4 months ago

I'll give it a go. Is there a place where we can discuss dev topics like this? Btw, I loaded up com.oculus.tv in JADX to poke around, and I found an import of com.facebook.react.panelapp.stereoview.StereoViewManager.

lvonasek commented 4 months ago

I want to avoid off-topic discussions. If the problem is related to this PR, then this thread is a good place to go.

lvonasek commented 3 months ago

@BiatuAutMiahn I looked deeper into the freezing issue, and it seems to be a Mobox integration issue on specific hardware. The support is very different per headset. E.g. Max Payne 2 works perfectly on Quest 2 but doesn't start at all on Quest 3.

@twaik Could you review the code? I do not want to start anything additional until this part is done.

Here is a video from testing Mobox: https://www.youtube.com/watch?v=HgcnSXwpB1M

twaik commented 3 months ago

Ok, I left two more review comments.

lvonasek commented 3 months ago

Hm, I do not see the comments anywhere.

twaik commented 3 months ago

[image] ...

twaik commented 3 months ago

Also, I still do not understand why XrEngineInit needs the jvm and activity pointers. What happens if you call it with activity = NULL, and with activity = NULL && vm = NULL?