immersive-web / proposals

Initial proposals for future Immersive Web work (see README)

XRCapture Module #68

Open alcooper91 opened 2 years ago

alcooper91 commented 2 years ago

I'd like to propose a module to allow for recording the currently rendered contents of WebXR sessions. I realize this has previously been discussed in Issue #36, but it seems like secondary-views would not capture all of the scene, and further that it would omit capture of things such as DOM Overlay or XRLayers. (Please correct me if I'm wrong?)

I've drafted an explainer, which can be found here.

cc: @cabanier @toji

/agenda in the hopes of discussing this as well.

cabanier commented 2 years ago

/tpac discuss XRCapture Module

AdaRoseCannon commented 2 years ago

Great TPAC topic!

blairmacintyre commented 2 years ago

I think such a thing would be very useful; akin to how Hololens allows an image capture to take place that synthesizes a view from the forward camera and graphics.

Big privacy implications, so it would need to be gated similarly to camera access. (see @alcooper91's comment below)

alcooper91 commented 2 years ago

It's worth noting, from a privacy perspective, that the API I am proposing would capture an image/recording and save it to disk. The site would not get access to that image/recording unless the user then later chose to upload it to the site. This is similar to if the user just invoked the native runtime mechanisms to take that image, but these aren't always easily accessible when in a Session.
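To make the privacy model in this comment concrete, here is a minimal TypeScript sketch of an opaque capture handle. Every name in it (`SavedCaptureKind`, `OpaqueCaptureHandle`, `UserAgentCaptureStore`, `releaseToPage`) is hypothetical and is not taken from the explainer or from any shipped API; the in-memory map only stands in for the file the UA would save to disk.

```typescript
// Hypothetical sketch only: none of these names come from the explainer.
type SavedCaptureKind = "photo" | "video";

// What the page is allowed to see: an opaque handle with no pixel or file access.
interface OpaqueCaptureHandle {
  readonly id: string;
  readonly kind: SavedCaptureKind;
}

// Stand-in for the user agent side, which owns the actual bytes.
class UserAgentCaptureStore {
  private readonly files = new Map<string, Uint8Array>();
  private nextId = 0;

  // Saves the capture "to disk" (here: a private map) and returns only a handle.
  save(kind: SavedCaptureKind, bytes: Uint8Array): OpaqueCaptureHandle {
    const id = `capture-${this.nextId++}`;
    this.files.set(id, bytes);
    return Object.freeze({ id, kind });
  }

  // Bytes reach the page only if the user explicitly chooses to upload the file.
  releaseToPage(handle: OpaqueCaptureHandle, userChoseToUpload: boolean): Uint8Array | null {
    if (!userChoseToUpload) return null;
    return this.files.get(handle.id) ?? null;
  }
}
```

The page can tell that a capture exists and what kind it is, but it cannot read the content until the user takes a separate, deliberate action.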

alcooper91 commented 2 years ago

Discussion on this should continue in the next regular call, but may be worth having some discussions here in the meantime.

Summary of key opinions/statements from the TPAC call, apologies in advance if I forget or misrepresent something:

  • @nbutko had a preference for being able to get access to MediaStreams/MediaTracks for mixing the recording (e.g. adding watermarks, custom audio tracks, and encoding to non-webm format).

  • @AdaRoseCannon had slight opposition to the WebShare integration, instead preferring using an opaque id. (I'll note that personally, I feel that the WebShare API integration provides an ergonomics benefit to users/developers, but given that there are existing ways of doing this, and other proposals that would smooth this route, I don't feel too strongly about the benefit.)

  • @cabanier seemed to vote in opposition to working on this, and had previously asked about whether this could provide hinting for summoning the system UI.

cabanier commented 2 years ago

On both Hololens and Quest, users are already familiar with how they can record sessions. This introduces another way to do the same which might conflict.

I would be more in favor of an API that brings up an OS or UA dialog that gives users the ability to record their session.

nbutko commented 2 years ago

> @nbutko had a preference for being able to get access to MediaStreams/MediaTracks for mixing the recording (e.g. adding watermarks, custom audio tracks, and encoding to non-webm format)

It's also worth mentioning that VR headsets would not require additional permissions due to the camera feed.

> On both Hololens and Quest, users are already familiar with how they can record sessions.

There are also system level mechanisms for recording screens on iOS and Android. However, we have found that these lack ergonomics around discovery, convenience, and sharing and don't substitute for in-app recording flows. Additionally, watermarks, audio mixing and transcoding are compelling use cases for our current WebAR customers.

cabanier commented 2 years ago

> > On both Hololens and Quest, users are already familiar with how they can record sessions.

> There are also system level mechanisms for recording screens on iOS and Android. However, we have found that these lack ergonomics around discovery, convenience, and sharing and don't substitute for in-app recording flows. Additionally, watermarks, audio mixing and transcoding are compelling use cases for our current WebAR customers.

I'm doubtful that we can run a WebXR session at acceptable performance if we do "watermarks, audio mixing and transcoding" at the same time. UAs can certainly bring up their own dialog on platforms where recording is unsupported or poorly supported, but otherwise the system mechanism should be preferred.

alcooper91 commented 2 years ago

One of the stated goals is also to be able to capture DOMOverlays and XRLayers; where DOM overlays could theoretically embed iFrames and thus there would still be some privacy restrictions. Given this, I think there still needs to be some opaque way of capturing a recording.

I believe that requiring encoding in a more portable format (mp4) should mitigate some of the need for transcoding; and it would seem that it should be possible to draw a watermark while the image is being recorded (if you know that such a recording is happening?)

Apart from ease of use/discoverability, I think that this mechanism can also provide the ability to initiate the captures without forcing the user almost out of the session (as would be required on handheld or fullscreen devices), as well as providing the page hints that the image capture/recording is done so that it can prompt if the user would like to upload/share the recording.

Absent an API like this, I don't know that UAs have a mechanism to bring up a dialog to do such a recording.

cabanier commented 2 years ago

> One of the stated goals is also to be able to capture DOMOverlays and XRLayers; where DOM overlays could theoretically embed iFrames and thus there would still be some privacy restrictions. Given this, I think there still needs to be some opaque way of capturing a recording.

Capturing the session with DOM overlay and layers by the OS should be safe, since no third parties can have access to them.

> I believe that requiring encoding in a more portable format (mp4) should mitigate some of the need for transcoding; and it would seem that it should be possible to draw a watermark while the image is being recorded (if you know that such a recording is happening?)

Recording type and quality should be decided by the UA. Why would you want to watermark the output? Is it so the site can restrict distribution?

> Apart from ease of use/discoverability, I think that this mechanism can also provide the ability to initiate the captures without forcing the user almost out of the session (as would be required on handheld or fullscreen devices), as well as providing the page hints that the image capture/recording is done so that it can prompt if the user would like to upload/share the recording.

> Absent an API like this, I don't know that UAs have a mechanism to bring up a dialog to do such a recording.

If on Quest, you hit the oculus button during an immersive session, you get the option to record it without being thrown back to 2D. I assume Hololens has a similar mechanism; do you know @fordacious @RafaelCintron?

alcooper91 commented 2 years ago

> > One of the stated goals is also to be able to capture DOMOverlays and XRLayers; where DOM overlays could theoretically embed iFrames and thus there would still be some privacy restrictions. Given this, I think there still needs to be some opaque way of capturing a recording.

> Capturing the session with DOM overlay and layers by the OS should be safe, since no third parties can have access to them.

To clarify, this was a point against MediaTracks/Streams

> > Apart from ease of use/discoverability, I think that this mechanism can also provide the ability to initiate the captures without forcing the user almost out of the session (as would be required on handheld or fullscreen devices), as well as providing the page hints that the image capture/recording is done so that it can prompt if the user would like to upload/share the recording. Absent an API like this, I don't know that UAs have a mechanism to bring up a dialog to do such a recording.

> If on Quest, you hit the oculus button during an immersive session, you get the option to record it without being thrown back to 2D. I assume Hololens has a similar mechanism; do you know @fordacious @RafaelCintron?

Right, I'm thinking about mobile AR scenarios that don't have easy things like this, plus having a lower-friction surface to do so available (e.g. one or two taps, rather than summoning a menu). The native functionality that I'm comparing this with in SceneViewer allows the app to simply have a button that takes a picture/video with no prompt.

nbutko commented 2 years ago

> Given this, I think there still needs to be some opaque way of capturing a recording.

Canvas taint provides a good existing model for this.
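For readers unfamiliar with the model being referenced: an HTML canvas carries an origin-clean flag, and once cross-origin content is drawn into it without CORS approval, readback calls such as `toDataURL()` or `getImageData()` throw a `SecurityError`, while the canvas can still be displayed. The TypeScript below is a small simulation of that rule, not browser code; `TaintableSurface` and `DrawSource` are made-up names for illustration.

```typescript
// Simulation of the canvas "origin-clean" rule; class and field names are illustrative.
interface DrawSource {
  pixels: number[];
  sameOrigin: boolean;
}

class TaintableSurface {
  private originClean = true;
  private readonly pixels: number[] = [];

  // Drawing always succeeds, but cross-origin content permanently taints the surface.
  draw(source: DrawSource): void {
    this.pixels.push(...source.pixels);
    if (!source.sameOrigin) {
      this.originClean = false;
    }
  }

  // Readback is refused once the surface is tainted, mirroring the SecurityError
  // a real canvas raises from toDataURL()/getImageData().
  readPixels(): number[] {
    if (!this.originClean) {
      const err = new Error("surface is tainted by cross-origin content");
      err.name = "SecurityError";
      throw err;
    }
    return [...this.pixels];
  }
}
```

Applied to capture, the analogous idea would be that a recording containing content the page does not control can still be produced and saved, but the page is refused direct access to its pixels.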

cabanier commented 2 years ago

> > If on Quest, you hit the oculus button during an immersive session, you get the option to record it without being thrown back to 2D. I assume Hololens has a similar mechanism; do you know @fordacious @RafaelCintron?

> Right, I'm thinking about mobile AR scenarios that don't have easy things like this, plus having a lower-friction surface to do so available (e.g. one or two taps, rather than summoning a menu). The native functionality that I'm comparing this with in SceneViewer allows the app to simply have a button that takes a picture/video with no prompt.

We could have an API that on Quest/Hololens opens the system menu but on AR devices, it brings up a confirmation prompt or dialog rendered by the UA (or no dialog at all in case of a picture). It would be confusing to our users to have 2 separate ways to do screen recordings which would negate the lower-friction part.
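The single-API behaviour described in this comment can be sketched as a small decision function. This is only an illustration of the comment's wording; `PlatformInfo`, `hasSystemCaptureUi`, `RequestedKind`, and `chooseCaptureSurface` are hypothetical names, and a real UA would be free to decide differently.

```typescript
// Hypothetical sketch of "one API, platform-dependent UI"; all names are made up.
type RequestedKind = "photo" | "video";
type CaptureSurface = "system-menu" | "ua-dialog" | "none";

interface PlatformInfo {
  // True on devices whose system already offers its own capture menu.
  hasSystemCaptureUi: boolean;
}

function chooseCaptureSurface(platform: PlatformInfo, kind: RequestedKind): CaptureSurface {
  // Prefer the system menu users already know, when one exists.
  if (platform.hasSystemCaptureUi) {
    return "system-menu";
  }
  // Otherwise the UA renders its own prompt, possibly skipping it for a single picture.
  return kind === "photo" ? "none" : "ua-dialog";
}
```

For example, `chooseCaptureSurface({ hasSystemCaptureUi: false }, "video")` yields `"ua-dialog"`, while any request on a platform with a system menu yields `"system-menu"`.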

alcooper91 commented 2 years ago

I don't think it would be confusing to users to have 2 separate ways to do screen recordings. I think there's plenty of examples of apps that do their own camera integration (e.g. snapchat and even facebook messenger provide ways to change up your camera feed with a "take picture" button), and the capture API for HoloLens allow developers to write this custom kind of capture experience as well: https://docs.microsoft.com/en-us/windows/mixed-reality/develop/platform-capabilities-and-apis/mixed-reality-capture-for-developers#integrating-mrc-functionality-from-within-your-app

I think it would be more confusing to developers to have two different APIs to initiate capture, one which summons a system UI (if such a thing is even available/present), and one that would go through a UA prompt with the implementation being based on what the runtime supports. The UA can still choose to invoke their system UI when capture is requested, but I think there are some potential issues from a developer expectation POV if the developer only requested a screenshot and the user then changes to a video, and the developer doesn't have a way to stop the video.

cabanier commented 2 years ago

We can't really control apps that do their own thing. They are free to record the screen and use it however they want.

> I think it would be more confusing to developers to have two different APIs to initiate capture

I'm not proposing that there are 2 APIs. I want 1 API that invokes the system capabilities if they are available or that invokes a UA dialog (if needed) if there are none.

> I think there are some potential issues from a developer expectation POV if the developer only requested a screenshot and the user then changes to a video, and the developer doesn't have a way to stop the video.

Why would it not be OK for the user to record a video if they choose to do so? Are you envisioning that the experience changes if it detects that it's being recorded?

> the developer doesn't have a way to stop the video.

We could provide an API to stop recording if there's a reason for the experience to have control over that.

alcooper91 commented 2 years ago

> We can't really control apps that do their own thing. They are free to record the screen and use it however they want.

I'm not sure I understand what you're saying here? I was pointing to those apps as examples of things that expose separate ways to invoke screenshots as cases where an API like this wouldn't be out of place in allowing pages to build their own recording experience.

I'm not opposed to the UA showing the system API if that's the UA's choice; but I don't want to change the expectation of the app on if it's taking a screenshot or a video out from under them.

Recording is inherently more expensive than a simple screenshot so if the app knows it's being recorded, it may want to scale back the quality of some of the models or intentionally throttle its own frame rate; perhaps there are animations that it would want to sync up to the start of a recording. Further, the app may want to, once it knows the capture has finished, switch/enable a "share" button (whether that invokes the file picker->Web Share API or a method on the capture object is a separate point of debate). Most significantly, I'd already proposed the app having control over the "stopRecording" function so that they could have their own UI around that, styled to be consistent with their experience; if an app is expecting (because it only ever requests) a screenshot, but then the user has started a recording, it may not have a stop recording button, and forcing such a requirement feels like it would make the API less attractive to developers, and add a further burden on UAs to add some form of "stop recording" button, which could potentially muddy up the UX by adding extra interaction points. (UAs are still free to programmatically terminate the recording if they feel a site is abusing the recording length).

alcooper91 commented 2 years ago

/agenda I believe part of the TPAC followup was to discuss this in a call.

cabanier commented 2 years ago

We discussed this in today's WG meeting but I didn't feel like we got closer to a consensus. To clarify, these are the features that I believe the API should have:

The API should NOT:

I'm also a bit uneasy about your suggestion that the user can immediately share the recorded session. In case of Android and Hololens, that would show the user's environment and it seems that there should be some type of warning before that goes out.

nbutko commented 2 years ago

We are advocating for consistency of experience across Desktop 3D (canvas, no camera feed), Mobile WebAR (canvas, camera feed), and headset (WebXR). Particularly, we would like to take existing flows and allow for them in headset sessions.

In the current flow,

We do not require capturing the full DOM as a requirement. In fact, including the record button in the video is undesirable.

In a headset session, it's possible that this means only recording from one layer.

alcooper91 commented 2 years ago

I think we're mostly in agreement on the items you mentioned with two points that I'd like to clarify.

I don't think that

> The API should NOT:

>   • present a UA drawn configuration dialog box.

should be an explicit requirement. (Unless you meant mandate instead of present), as I think it should be a UA choice whether they invoke system UI or their own Dialog box. (e.g. in-line with not enforcing if the UA/system does the recording, a given platform may not have a system recording mechanism and so the implementation may be done by the UA).

I'm fine with the requirement that we:

> Allow for the UA to decide on quality, encoding, image type, etc.

with the caveat I mentioned today, where I think the broader type of recording (e.g. photo vs video) shouldn't be changed out from under the page, as otherwise the page may not be showing proper UI to stop the recording. The UA/System likely does still need to provide a mechanism to stop the recording though, to prevent abuse by sites that could simply not call a "stop" button.

It does push a little bit of additional burden onto the developers and I'm not sure how I feel about it, but if needing to change the type was a critical path, we could modify the return type to indicate which type of recording was started so that the page could respond appropriately, and still provide a good user experience. (I think essentially this amounts to collapsing my proposed XRCapture/XRVideoCapture interfaces and adding an enum).
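One way to picture the collapsed interface with an enum is the TypeScript sketch below. It is an assumption-laden illustration, not the explainer's actual shape: `CaptureType`, `StartedCapture`, and `controlsFor` are hypothetical names, and the WebIDL-style enum is modelled as a string union.

```typescript
// Hypothetical shape: a single capture object that reports which type actually started.
type CaptureType = "photo" | "video";

interface StartedCapture {
  readonly type: CaptureType;
}

// The page inspects the reported type and shows the controls that fit it,
// even if the type differs from what it originally requested.
function controlsFor(capture: StartedCapture): string[] {
  if (capture.type === "video") {
    // A video needs a way to end it before anything can be shared.
    return ["stop-recording", "share"];
  }
  return ["share"];
}
```

With this shape, a page that asked for a photo but was handed a `"video"` capture would still know to surface a stop control, which is the extra developer burden the comment mentions.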

As far as immediately sharing the recorded session, I think it's more accurate to say that it allows a kick-off of the WebShare API, which doesn't allow the site to influence the share targets, and is essentially the same as invoking the native "Share" functionality on an object. However, with that being said, that proposed integration is more for developer convenience than anything else, and is something I'd like to explore further once we have agreement that speccing an XRCapture API is something that we should move forward with.

For Nick, if you don't require capturing DOM, then your use case should be met once UAs are able to enable Raw Camera Access (albeit as Rik mentioned yesterday, this may not be the most performant); one of the key requirements that I've been targeting with this API is that there are other developers for whom capturing their DOM Overlay elements is a requirement.

toji commented 2 years ago

> In a headset session, it's possible that this means only recording from one layer.

Headsets are the only devices that implement layers at this point, but there's no reason to expect that they'll always be unavailable to mobile AR. True, they don't provide as many benefits in that environment, but normalizing the API across form factors where possible is a good goal.

nbutko commented 2 years ago

> Headsets are the only devices that implement layers at this point, but there's no reason to expect that they'll always be unavailable to mobile AR. True, they don't provide as many benefits in that environment, but normalizing the API across form factors where possible is a good goal.

Perhaps I should have written: In sessions with multiple layers, perhaps this means only recording from one layer.

> as Rik mentioned yesterday, this may not be the most performant

Performance is a key requirement here. If there is a performant way to capture a single layer, it would go a long way.

nbutko commented 2 years ago

As an example of what's currently supported and expected by developers on the web, this is the 8th Wall Media Recorder API, which is widely used by 8th Wall's developers:

https://www.8thwall.com/docs/web/#xr8mediarecorder

nbutko commented 2 years ago

Here's an example of that API in action: https://www.8thwall.com/alivenow/freefire

You can take a photo or record a video. Recorded videos include overlaid 2D UI elements and a custom end card.

Example recording:

https://user-images.githubusercontent.com/25936010/142264116-00fb9896-0c89-4b77-821c-ab780d886fbb.mp4

nbutko commented 2 years ago

> There are other developers for whom capturing their DOM Overlay elements is a requirement.

I would try to assess what the true product requirement is here -- is it truly to represent all DOM elements including passwords, credit card numbers in stripe iframes and other sensitive fields? Or is it a mechanism for mindfully injecting specific 2D content on top of the video? I would expect the latter, since this is the common use case we see, and handle.

cabanier commented 2 years ago

> > In a headset session, it's possible that this means only recording from one layer.

> Headsets are the only devices that implement layers at this point, but there's no reason to expect that they'll always be unavailable to mobile AR. True, they don't provide as many benefits in that environment, but normalizing the API across form factors where possible is a good goal.

Users would expect that a capture will show the entire scene. If the author used a media layer for video and an equirect or cube layer, it would be strange if those weren't recorded.

cabanier commented 2 years ago

> > The API should NOT:

> >   • present a UA drawn configuration dialog box.

> should be an explicit requirement. (Unless you meant mandate instead of present), as I think it should be a UA choice whether they invoke system UI or their own Dialog box.

Yes, the UA can choose to show a dialog when record or capture is called. What I meant was that the API shouldn't mandate a method that invokes a dialog and returns a list of options to the author that are used later for capturing.

Capturing must definitely show something to the user to make sure that their experience isn't recorded secretly.

> I'm fine with the requirement that we:

> > Allow for the UA to decide on quality, encoding, image type, etc.

> with the caveat I mentioned today, where I think the broader type of recording (e.g. photo vs video) shouldn't be changed out from under the page, as otherwise the page may not be showing proper UI to stop the recording. The UA/System likely does still need to provide a mechanism to stop the recording though, to prevent abuse by sites that could simply not call a "stop" button.

We also need to consider what should happen if the system was already making a recording. The site should not be able to turn that off or detect it.

> It does push a little bit of additional burden onto the developers and I'm not sure how I feel about it, but if needing to change the type was a critical path, we could modify the return type to indicate which type of recording was started so that the page could respond appropriately, and still provide a good user experience. (I think essentially this amounts to collapsing my proposed XRCapture/XRVideoCapture interfaces and adding an enum).

That would cover the case if the UA asks the system to do a screen grab but the user backs out of that option and elects to record instead. I'm leaning towards marking such a thing as a failure instead of asking the page to react to it.

> As far as immediately sharing the recorded session, I think it's more accurate to say that it allows a kick-off of the WebShare API, which doesn't allow the site to influence the share targets, and is essentially the same as invoking the native "Share" functionality on an object. However, with that being said, that proposed integration is more for developer convenience than anything else, and is something I'd like to explore further once we have agreement that speccing an XRCapture API is something that we should move forward with.

OK, that's reasonable.

alcooper91 commented 2 years ago

> We also need to consider what should happen if the system was already making a recording. The site should not be able to turn that off or detect it.

Similar to the issue you later mention:

> That would cover the case if the UA asks the system to do a screen grab but the user backs out of that option and elects to record instead. I'm leaning towards marking such a thing as a failure instead of asking the page to react to it.

The user could also back out of granting permission to take the capture or recording, so I think we should ensure that all of those cases (capture type changed, user declined permission when prompted, capture is ongoing) are reported as the same type of failure to the page. The page will then know that a capture did not start, but not necessarily why it failed to start.
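A minimal TypeScript sketch of that idea: the UA knows the internal reason, but every reason maps to one indistinguishable page-visible error. The names `InternalFailure` and `toPageVisibleError`, the error name, and the message are all assumptions made for illustration.

```typescript
// Reasons the UA may know internally; these labels are hypothetical.
type InternalFailure =
  | "capture-type-changed"
  | "permission-denied"
  | "capture-already-ongoing";

// Every internal reason collapses to the same page-visible error, so the page
// learns that capture did not start but cannot tell why (for example, it cannot
// detect that a system recording was already running).
function toPageVisibleError(_reason: InternalFailure): Error {
  const err = new Error("Capture did not start");
  err.name = "CaptureFailedError"; // hypothetical error name
  return err;
}
```

The underscore-prefixed parameter is deliberately unused: ignoring the reason is the whole point of the mapping.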

> We are advocating for consistency of experience across Desktop 3D (canvas, no camera feed), Mobile WebAR (canvas, camera feed), and headset (WebXR).

Any such WebXR capture API would be available across the supported session types (albeit there's likely some runtime implementation delta), so I don't think that needs to be a concern here; unless you aren't intending to use inline sessions for the Desktop 3D case, which opens a whole different can of worms, as any such API to unify those recordings would be out of scope of the Immersive Web Group (likely falling under WebRTC), and similar APIs that give access to the streams that you can manipulate have, as I understand it, met with pushback from various browser vendors in that group.

> I would try to assess what the true product requirement is here -- is it truly to represent all DOM elements including passwords, credit card numbers in stripe iframes and other sensitive fields? Or is it a mechanism for mindfully injecting specific 2D content on top of the video? I would expect the latter, since this is the common use case we see, and handle.

@elalish is the primary user I've spoken to that cares about capturing DOM elements; specifically (IIUC), he is interested in capturing the DOM elements that have been incorporated into the scene (which I believe often includes hotspots/annotations for models); but does not care if the site that is hosting the content actually ever gets access to them. I think an API to do this (which has parity with native features/capturers), is fundamentally different than the features that you want, since you want more access to manipulate the recording, but don't care about an increased user friction to do so.

As I've stated before, if your use case does not involve capturing the DOM, and requires accessing streams that allow you to manipulate the recording directly, all of that is (or will be once Raw Camera Access ships) available to you today, since the camera feed is the only content that you don't control. Accessing a stream that contains the DOM content requires a much higher privacy bar; quite honestly, I haven't yet been able to devise suitable mitigations for Android to allow shipping getDisplayMedia (relevant chrome bug), and we'd need similar mitigations for this API.

@cabanier, is exposing a capture mode that would expose the raw streams (barring privacy mitigations), something that it would be technically feasible for you all to expose either? It sounds like your thoughts for implementing this API would be to hook directly into your system-level capturers, which would have similar privacy concerns since system UI and similar would be exposed, and you likely couldn't implement a mode that would strip out any DOM content?

If we set aside the potential privacy issues for a moment and say that we are able to come up with a satisfactory solution that would grant you access to streams, I think such an enhancement would be possible to be plugged into the API shape I propose, but even still, given that a mechanism to do what you want exists today, we'd likely still want and prioritize a more privacy-preserving API that has less user and developer friction before implementing such an enhancement.

elalish commented 2 years ago

Indeed, I've been pushing for this feature because it's the last major gap between what WebXR can do and what SceneViewer (the native app) can for AR (try it here). One of their most-used features is the record button, which people seem to use a lot for making silly pictures/videos of people next to AR renders. Still, it's nice for commerce too (send your partner a picture of the sofa you're considering placed in your living room). However, it's a rather jarring experience if what you see is not what you get. Consider that we use DOM elements for things like showing dimensions; it's a surprising screen recording if it's not actually recording the whole screen. The smoothness of the flow is also key; WebXR permissions are a huge block already, so the last thing we need is to make the experience even bumpier.

nbutko commented 2 years ago

Currently recording with 8th Wall's recorder allows overlay of 2D content by drawing to a foreground canvas (usually with a 2D context) which gets composited over the 3d canvas. This provides a lot of flexibility and sounds like it could be used effectively to annotate hotspots, etc. without the inherent security risks of fully general dom capture.
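The compositing step described here (a foreground canvas drawn over the 3D canvas) corresponds to standard "source-over" alpha blending. The TypeScript below is a generic per-pixel sketch of that operation, not 8th Wall's implementation; `Rgba`, `sourceOver`, and `compositeFrames` are illustrative names, and pixels are assumed to be straight (non-premultiplied) RGBA values in the range 0..1.

```typescript
// RGBA pixel with straight (non-premultiplied) components in the range 0..1.
type Rgba = [number, number, number, number];

// Standard "source-over" compositing of one foreground pixel onto one background pixel.
function sourceOver(fg: Rgba, bg: Rgba): Rgba {
  const [fr, fgreen, fb, fa] = fg;
  const [br, bgreen, bb, ba] = bg;
  const outA = fa + ba * (1 - fa);
  if (outA === 0) {
    return [0, 0, 0, 0]; // both pixels fully transparent
  }
  const blend = (f: number, b: number): number => (f * fa + b * ba * (1 - fa)) / outA;
  return [blend(fr, br), blend(fgreen, bgreen), blend(fb, bb), outA];
}

// Composite a whole 2D overlay frame over a rendered scene frame, pixel by pixel.
function compositeFrames(overlay: Rgba[], scene: Rgba[]): Rgba[] {
  if (overlay.length !== scene.length) {
    throw new Error("frame sizes differ");
  }
  const out: Rgba[] = [];
  overlay.forEach((fgPixel, i) => {
    const bgPixel = scene[i];
    if (bgPixel !== undefined) {
      out.push(sourceOver(fgPixel, bgPixel));
    }
  });
  return out;
}
```

Because the page itself draws every overlay pixel, only content it deliberately placed on the foreground canvas ends up in the recording, which is the flexibility-versus-risk trade-off the comment points to.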

elalish commented 2 years ago

The whole point of this API is to keep the page from having access to the recording at all to bypass these security issues. The user can see exactly what is being recorded, so they have all the info to know whether to share it or not. Note exactly the same thing is available on Android phones (and it'll capture the DOM too), there's just not a way to trigger it from the web yet. The whole point of WebXR is to close the gap with native.

cabanier commented 2 years ago

> @cabanier, is exposing a capture mode that would expose the raw streams (barring privacy mitigations), something that it would be technically feasible for you all to expose either?

No, that's not really feasible. For one, non-projection layers are completely handled by the system compositor that lives in a separate process. This means that we wouldn't be able to capture what the user sees. For regular projection layers it may seem that they are easy to capture. However, the system compositor runs reprojection on the content that the browser produced which is also not something that the browser can do.

> It sounds like your thoughts for implementing this API would be to hook directly into your system-level capturers, which would have similar privacy concerns since system UI and similar would be exposed, and you likely couldn't implement a mode that would strip out any DOM content?

We don't support DOM Overlay but are planning to implement DOM Layers. We wouldn't want to strip out that content because it's part of the experience. An argument could be made that we shouldn't capture system UI though.

> If we set aside the potential privacy issues for a moment and say that we are able to come up with a satisfactory solution that would grant you access to streams, I think such an enhancement would be possible to be plugged into the API shape I propose, but even still, given that a mechanism to do what you want exists today, we'd likely still want and prioritize a more privacy-preserving API that has less user and developer friction before implementing such an enhancement.

I prefer that we start with a simple API that doesn't involve streams. As I mentioned, I suspect that they will be too resource intensive to be useful on headsets.

alcooper91 commented 2 years ago

> We don't support DOM Overlay but are planning to implement DOM Layers. We wouldn't want to strip out that content because it's part of the experience. An argument could be made that we shouldn't capture system UI though.

I should mention, my question was aimed at whether it would even be possible for you all to implement a capture type that didn't expose DOM content and returned a stream. I think it's probably okay, in the opaque recording case, to capture system UI or at the very least to just mimic/directly interface with whatever the system recording capabilities can do; but I think it's also not something we'd want to strictly spec.

cabanier commented 2 years ago

> > We don't support DOM Overlay but are planning to implement DOM Layers. We wouldn't want to strip out that content because it's part of the experience. An argument could be made that we shouldn't capture system UI though.

> I should mention, my question was aimed at whether it would even be possible for you all to implement a capture type that didn't expose DOM content and returned a stream. I think it's probably okay, in the opaque recording case, to capture system UI or at the very least to just mimic/directly interface with whatever the system recording capabilities can do; but I think it's also not something we'd want to strictly spec.

To capture everything the user sees with an in-browser stream would be extremely hard.

alcooper91 commented 2 years ago

I think that we have enough agreement that this is something we should spec and most of the remaining disagreement is about the shape, which may be better served by being in its own repo/addressing specific parts of the spec?

@cwilso @AdaRoseCannon or @Yonet can we move my readme/explainer from the initial post into a new immersive web repo? Or is there further agreement we think we need to solicit here?

alcooper91 commented 2 years ago

Friendly ping on this after the short week for @cwilso @AdaRoseCannon or @Yonet , what are the next steps here?

Yonet commented 2 years ago

Hi @alcooper91, thank you for the friendly ping. I'll agenda to make sure there are no objections. Also we need In the meantime, I can create the repo. Should we call it XRCaptureModule? Are you planning to be one of the contact people? We need two people. Thanks!

/agenda XR Capture Module to move into new repo, who would like to be the contact person?

alcooper91 commented 2 years ago

Yes, I can certainly be one of the contact people; XRCapture Module seems reasonable, but if we want to mention in a meeting we can also defer creation of the repo until then to see if anyone has a better idea.

AdaRoseCannon commented 2 years ago

/agenda let's take another look at making a repo for this; people have been asking about it a lot recently.

tangobravo commented 2 years ago

Hi all - just seen this pop up on the call agenda, and about to hop on my flight to SF for AWE, so can't make this call - hopefully see some of you in person there.

@nbutko has covered this already, but coming from a mobile-first point-of-view, the web already has performant hardware-accelerated canvas capture, and that's one of the things you lose when switching from a gUM-based "WebAR" setup to a WebXR one. I understand this particular request is about capturing more than just the canvas, and so perhaps is more of a headset-targeted API at this point.

Raw camera access + existing canvas capture would get to parity with WebAR capabilities; perhaps nice to opt-in to camera-access after the session has started to limit the permissions to only when the user taps to record.

alcooper91 commented 2 years ago

I wouldn't say that this API is meant to be only headset focused; but it's meant to provide consistency for a capture experience that can be used both on mobile or in a headset. It's further designed so that you can do this capture without needing to request raw camera access, and thus be a bit more privacy preserving; since the only camera stream you would receive with this would be for the duration of the recording. (Though I do acknowledge that yes, for mobile, Raw Camera Access should enable recording, since you already control everything else being composited.)

cabanier commented 2 years ago

+1 on the need for this API. Taking a screenshot today in WebXR will cause a noticeable drop in framerate and there is no easy way for the site to save it to the screenshots folder. This feature would solve both these problems (although I suspect that it still might cause some temporary pressure on the system).

AdaRoseCannon commented 2 years ago

Discussed in today's meeting, resolution to make repo:

name: webxr-capture
description: Capture composited content, layers and real world through a privacy preserving high level API
points of contact: @alcooper91 @cabanier

Note: If anyone has a better name or description please add it below

himorin commented 2 years ago

@AdaRoseCannon @alcooper91 @cabanier https://github.com/immersive-web/capture https://immersive-web.github.io/capture/

just added skeleton-ish files, as cg-report of IWCG. explainers etc. are blank.