immersive-web / webxr-samples

Samples to demonstrate use of the WebXR Device API
https://immersive-web.github.io/webxr-samples/
MIT License

Recommended way of extracting background in AR mode #36

Open ArnaudHambenne opened 5 years ago

ArnaudHambenne commented 5 years ago

One of the great advantages of AR is being able to extract information from the background in order to perform computer vision operations such as feature detection, object tracking, marker detection, etc.

So far, I've been able to identify 3 possible ways to accomplish this, yet none presents itself as the obvious way to go:

  1. readPixels()

    onXRFrame(time, frame) {
      let session = frame.session;
      ...
      session.renderState.baseLayer.context.readPixels(...);
      ...
    }

    This yields the correct data, i.e. the full background imagery without any superimposed 3D models, although it cripples performance due to its blocking nature.

  2. Draw onto 2d canvas

    c = document.createElement('canvas');
    ctx = c.getContext('2d');
    ...
    onXRFrame(time, frame) {
      let session = frame.session;
      ...
      ctx.drawImage(session.renderState.baseLayer.context.canvas, 0, 0);
      ...
    }

    Performance is noticeably better than in option 1, although calling ctx.getImageData(...) immediately afterwards yields an array containing only zeroes. This usually happens when the preserveDrawingBuffer option is left at its default of false; however, I have verified that this was not the case in any of my attempts. Also, the fact that readPixels() works while this does not is quite baffling to me, as they should be reading from the same source, no?

  3. OffscreenCanvas.transferToImageBitmap()

    ...
    let offscreenCanvas = new OffscreenCanvas(...);
    let gl = offscreenCanvas.getContext('webgl', { xrCompatible: true });
    ...
    session.updateRenderState({ baseLayer: new XRWebGLLayer(session, gl) });
    ...
    onXRFrame(time, frame) {
      let session = frame.session;
      ...
      session.renderState.baseLayer.context.canvas.transferToImageBitmap();
      ...
    }

    I haven't thoroughly tested this setup for performance, yet my first impression is that it clearly beats option 1. Unfortunately, it shares the same issue as option 2: the image is completely blank; no data seems to have been transferred with it.


Given that my goal should not be completely alien to AR applications, I was wondering whether and how I am supposed to retrieve the camera imagery captured in AR mode (in my case legacy-inline-ar) in a performant manner. Am I looking in the wrong place? Am I missing something? Given the multitude of samples in this repo, I was hoping to get some insight from folks who are more experienced with this API.

pavan4 commented 5 years ago

Hi @ArnaudHambenne ,

.readPixels() doesn't seem to work for me either. What browser and device did you use?

My code is something like this:

  function onXRFrame(t, frame) {
    let session = frame.session;
    let gl = session.renderState.baseLayer.context;
    let pixels = new Uint8Array(gl.drawingBufferWidth * gl.drawingBufferHeight * 4);
    gl.readPixels(0, 0, gl.drawingBufferWidth, gl.drawingBufferHeight, gl.RGBA, gl.UNSIGNED_BYTE, pixels);
    if (pixels.every((val, i, arr) => val === arr[0])) {
      console.log("none");
    } else {
      console.log(pixels);
    }
  }

What did you use to read the pixels?

I am not too worried about the performance hit as I would only like to read the frames at specific intervals. Could you please let me know if you used any other flag to access the pixel values?
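Reading back frames only at specific intervals can be sketched with a small throttle helper, plus a check that distinguishes a genuinely all-zero buffer from one that merely has uniform values (an `every(val === arr[0])` test would also report "none" for an all-white frame). This is a hypothetical sketch, not code from this repo; the 500 ms interval and helper names are made up for illustration:

```javascript
// Returns true when at least intervalMs has elapsed since the last capture.
function shouldCapture(lastCapture, now, intervalMs) {
  return now - lastCapture >= intervalMs;
}

// True only if every byte in the buffer is zero (a blank/blocked readback).
function isAllZero(pixels) {
  return pixels.every((val) => val === 0);
}

// Hypothetical frame loop: read back pixels at most every 500 ms.
// Only runnable inside a real WebXR session.
let lastCapture = 0;
function onThrottledXRFrame(t, frame) {
  const session = frame.session;
  const gl = session.renderState.baseLayer.context;
  if (shouldCapture(lastCapture, t, 500)) {
    lastCapture = t;
    const pixels = new Uint8Array(gl.drawingBufferWidth * gl.drawingBufferHeight * 4);
    gl.readPixels(0, 0, gl.drawingBufferWidth, gl.drawingBufferHeight,
                  gl.RGBA, gl.UNSIGNED_BYTE, pixels);
    console.log(isAllZero(pixels) ? "blank readback" : pixels);
  }
  session.requestAnimationFrame(onThrottledXRFrame);
}
```

The throttle keeps the blocking readPixels() cost off most frames, which matches the "specific intervals" approach above.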

ArnaudHambenne commented 5 years ago


Hi. It's been a while since I've worked with this API, but last time I did, a major update had just been released that stripped all extraction capabilities from the WebXR buffers out of privacy concerns. I'm not sure if they've come around on that, but if they haven't, it means that neither readPixels() nor drawImage() nor transferToImageBitmap() can be used to extract pixels from the buffers. All of these functions will return an array of zeroes, so viewing any content of the buffers is off the table. I did, however, read that there is a group of people advocating for some kind of computer vision capability, but I'm not sure how far they've come with that. Your code is probably fine.

klausw commented 5 years ago

@ArnaudHambenne is right, the current Chrome immersive-ar implementation does not allow JS-side access to camera pixels. For the future, my understanding is that there should be a separate API or feature that provides pose-aligned camera images, where applications would need to request such access at session start so that the user agent can show appropriate consent prompts.

See https://github.com/immersive-web/computer-vision which mentions this as a use case. Previous discussion was also in https://github.com/immersive-web/webxr/issues/694#issuecomment-501818481 and https://github.com/immersive-web/proposals/issues/36 .

(I also had some thoughts on this in a technical paper at https://www.tdcommons.org/dpubs_series/1902/ . That isn't intended to represent any particular future directions or plans, though I'm still partial to the dinosaur sketches.)
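Requesting such a capability at session start would presumably look like today's optional-feature negotiation. A minimal sketch only: the 'camera-access' feature string comes from the later raw-camera-access incubation and is an assumption here, not a finalized part of the WebXR spec:

```javascript
// Hypothetical: build the session descriptor. The 'camera-access' feature
// string is an assumption borrowed from the experimental raw camera access
// incubation; it is not a shipped standard.
function buildSessionInit() {
  return {
    requiredFeatures: ["local"],
    optionalFeatures: ["camera-access"],
  };
}

// Only callable in a WebXR-capable browser; declaring the feature up front
// lets the user agent show a camera-consent prompt before the session starts.
async function startArSession() {
  return navigator.xr.requestSession("immersive-ar", buildSessionInit());
}
```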

pikilipita commented 3 years ago

Is there any news on this issue? It's a shame not to be able to let the user record a video or take a screenshot of their WebXR session.

bialpio commented 3 years ago

There have been some efforts to make this happen. A very early prototype is available in Chrome behind the WebXRIncubations flag. The API is described here, with sample usage demonstrated here, but please note that the API is potentially unstable and in no way final.
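For reference, the incubated API hangs camera access off XRWebGLBinding and XRView. The shape below follows the raw-camera-access explainer as of the early prototype, so treat every call here as provisional and subject to change:

```javascript
// Provisional: per the raw-camera-access incubation, an XRView gains a
// `camera` attribute when the session was granted camera access.
function supportsRawCameraAccess(view) {
  return "camera" in view && view.camera != null;
}

// Fetch the camera image as a WebGL texture for the given view, or null.
// `binding` is an XRWebGLBinding, created as: new XRWebGLBinding(session, gl).
function getCameraTexture(binding, view) {
  if (!supportsRawCameraAccess(view)) return null;
  return binding.getCameraImage(view.camera);
}
```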

DaveVaval commented 8 months ago

Any updates as of 2024?

fsereno commented 6 months ago

Any updates? I need to detect/decode a QR code from within an immersive-ar session.

FrostKiwi commented 3 months ago

Same desire here!

I'm overlaying data over a device using immersive-ar with ThreeJS and WebXR. It works really well and looks awesome (currently on a Meta Quest 3), but manually calibrating the overlay between the model and the real-life device is tedious. I want to use QR codes to calibrate the origin of the ThreeJS coordinate system.
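Once a QR marker's position is known (however the camera pixels are eventually obtained), re-origining the scene is plain vector math. A minimal sketch with plain arrays; `originOffset` and `recalibrate` are made-up helper names, and the ThreeJS-style `.position.set()` shape is the only API assumed:

```javascript
// Compute the translation that moves the scene origin onto the detected
// QR marker: offset = markerPosition - currentOriginPosition.
function originOffset(markerPos, originPos) {
  return [
    markerPos[0] - originPos[0],
    markerPos[1] - originPos[1],
    markerPos[2] - originPos[2],
  ];
}

// Apply the offset to a ThreeJS-style object exposing position.set(x, y, z),
// e.g. the Group that roots the overlay model.
function recalibrate(sceneRoot, markerPos, originPos) {
  const [x, y, z] = originOffset(markerPos, originPos);
  sceneRoot.position.set(x, y, z);
}
```

Orientation would need the marker's rotation as well, but even translation-only recalibration removes most of the manual alignment work.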