immersive-web / webxr

Repository for the WebXR Device API Specification.
https://immersive-web.github.io/webxr/
Other
2.98k stars 381 forks source link

Garbage collecting objects that are generated every frame #1010

Closed asajeffrey closed 4 years ago

asajeffrey commented 4 years ago

The webxr API generates some objects every frame, e.g. the frame and pose objects, which needs the GC to be able to efficiently free them. This is straightforward in a generational GC, we just hint to the GC to run a minor GC at the end of each rAF. The problem is that any per-frame objects that are still live at the end of the rAF will be tenured, so after enough rAFs there will be a major GC, which causes frame rates to drop while it's running. The worst-case scenario for this is if user code keeps access to some of the previous frame's objects, e.g. keep a pointer to last frame's pose data. Then each minor GC will cause an object to be tenured.

asajeffrey commented 4 years ago

A possible API fix is to make all objects that are generated every frame expire and lose their data at the end of the frame. So user code which wanted last frame's pose data would do so by copying the data out of the pose object and into a long-lived object. But not all per-frame objects expire at the end of the frame.

asajeffrey commented 4 years ago

For three-generation garbage collectors this isn't a problem, since objects only become tenured if they survive moving from G0 to G1 then to G2 (=tenure), so they have to have survived quite a few GCs of G0 before getting to G2. But re-engineering a GC to support three generations is non-trivial!

MortimerGoro commented 4 years ago

We are hitting some stutter frames due of this in Firefox Reality on Quest. I captured some profiles using the immersive WebXR sample.

gc_before_graph

gc_before_fps

gc_before_12ms

All the peaks are related to GC.

After implementing some pooling strategy in a local branch graph looks much better:

gc_after_graph

gc_after_fps

asajeffrey commented 4 years ago

Unfortunately object pooling doesn't help with the problem of per-frame object getting tenured (unless you are recycling objects from the pool while they're still live in JS, in which case their properties will change nondeterministically).

MortimerGoro commented 4 years ago

yes, the pooling strategy is not totally spec compliant (but I think it's pretty safe for some specific short lived objects, which are not supposed to be held for a high number of frames).

Expiring all the perf-frame objects at the end of the frame would make things easier.

asajeffrey commented 4 years ago

It's, er, suboptimal that the best way to implement the spec efficiently is to ignore it :) I can imagine it being a bit of a problem for testing too, pose data that is held on to for N frames is still correct, but at N+1 frames it nondeterministically changes!

cabanier commented 4 years ago

Is there a way to "return" poses and frames to the XRSession?

For instance, what if we create a function on session that takes a pose or frame. The session can then take it as a hint that it can re-use them.

It's not very elegant but it would get the job done... Has anyone checked if other UA's have a similar problem?

asajeffrey commented 4 years ago

Yes, we could do that, though it does mean that well end up with an API where some objects are recyclable, and some expire. This is more complicated than it needs to be, and will make tracking down space leaks quite tricky. There aren't many other APIs with explicit recycling, the nearest I can think of is WebGL.

blairmacintyre commented 4 years ago

when I was playing with an API for exposing camera frames via webxr, I used this approach (explicit return) for the frames themselves, and in the end, although those are much bigger objects; I needed to do that because the frames might be needed arbitrarily long (since I was processing them in a worker using a WASM opencv blob, something I'd expect to be common).

For data that isn't needed as long, and isn't expensive to copy, I'd prefer to see us simply say "these values are only valid for the frame" and then reuse them. So, avoid GC and return. I realize this exposes a bunch of other potential bugs and issues.

asajeffrey commented 4 years ago

For objects that really are long-lived, it's fine for them to be tenured, it's the short-lived ones that are generated every frame but live for two frames that are the problem, because they'll end up being tenured and triggering a major GC.

toji commented 4 years ago

A possible API fix is to make all objects that are generated every frame expire and lose their data at the end of the frame.

This is already the case for XRFrames, which become inactive at the end of a frame and will throw exceptions if you call any methods on a held reference. Does that type of pattern allow for the GC behavior you're talking about, or is something more needed?

asajeffrey commented 4 years ago

Yes, getting all of the short-lived objects to follow the pattern of XRFrame would make it a lot less likely that user code would hold on to objects and cause major GCs.

toji commented 4 years ago

In terms of other short-lived (roughly per-frame) objects, the primary thing we're talking about is XRPoses, right? That feels tricky, as anything we do to cause them to invalidate outside the scope of a frame seems like a pretty big change. Plus there are plenty of cases where devs may legitimately want to hold onto a pose between frames for things like velocity calculation.

I'm worried about backwards compatibility when talking about any changes like this, but I would be interested in knowing if there's any existing precedent with web APIs for a pattern where you have to either do an explicit copy of a piece of data or otherwise "pin" it or else it will self destruct outside the current scope.

I guess the flip side to that is having a destroy() method that at least lets the developer opt-in to better memory behavior, but I've only seen those in cases where the underlying resources are significant and the stuff we're talking about here is small but numerous, so it doesn't strike me as the right pattern.

cabanier commented 4 years ago

Will destroy() help? The JS object itself will still be there and ref counted so GC will still happen.

Maybe we can pass the previous pose in then if the function can recycle it, it could repopulate it with new data and return it.

asajeffrey commented 4 years ago

@cabanier destroy will help if the spec allows destroyed objects to be recycled. It's still not ideal, better would be to get the GC able to do its job.

@toji I think the way that other APIs handle this is by not generating objects every frame, e.g. the window rAF use integers for everything, and (IIRC) WebGL doesn't require object creation during the event loop. IIRC WebVR was designed to not generate objects every frame as well.

I do agree with you re backwards compatibility and not breaking existing content. Perhaps the spec can ask UAs to issue a warning to the console log if XRPose objects are used outside the frame they were created in? Plus a non-normative note about the GC issues? That would at least give developers a clue why they are encountering GC issues.

cabanier commented 4 years ago

@cabanier destroy will help if the spec allows destroyed objects to be recycled. It's still not ideal, better would be to get the GC able to do its job.

I think we'll get pushback if we create a method and spec text specifically to deal with GC behavior. Can weak references solve this problem more elegantly? From the proposal:

The WeakRef class is tailored for the web’s event-loop based programming. A dereferenced WeakRef referent is guaranteed to be alive until the end of the current turn of the event loop.

Maybe it's too late for WebXR to change this, but we can add it to the Layers spec since it creates even more objects in each rAF.

asajeffrey commented 4 years ago

Weak references would indeed do the job, but there's a long history with weak references (and anything else which is about exposing GC semantics to JS) at TC39. Weak references are a draft, so I'm not sure we should be using them in specs yet?

For the layers spec, we can just make all the per-frame objects expire at the end of the frame, which will strongly discourage users from hanging on to them!

There shouldn't be pushback about adding non-normative text about GC should there? I can imagine getting pushback if we had a normative requirement that the UA MUST log a warning about long-lived XRPose objects. It's a tricky trade-off though, if users only test their content on one browser, and that browser doesn't warn about GC (e.g. because it uses a 3-generation scheme) then user content may space leak on other browsers.

cabanier commented 4 years ago

Weak references would indeed do the job, but there's a long history with weak references (and anything else which is about exposing GC semantics to JS) at TC39. Weak references are a draft, so I'm not sure we should be using them in specs yet?

It's about to ship in Chrome

asajeffrey commented 4 years ago

The FF issue for it: https://bugzilla.mozilla.org/show_bug.cgi?id=1561074

I asked in https://github.com/tc39/proposal-weakrefs/issues/202 whether we should be making use of weak references.

asajeffrey commented 4 years ago

Over in https://github.com/tc39/proposal-weakrefs/issues/202#issuecomment-622487764, @littledan said

I'd recommend avoiding depending on the observability of GC in general, and especially in other specifications, per https://w3ctag.github.io/design-principles/#js-gc

The W3C TAG API guidelines explicitly call out returning weak refs as an anti-pattern.

littledan commented 4 years ago

I'm still confused about how you think WeakRef or FinalizationRegistry would help. They will also be influenced by generational GC, and not get triggered immediately.

Have you discussed this with Mozillians who have been involved in the WeakRef design process? Maybe they could help you better than I could. cc @codehag @hotsphink @tschneidereit @jonco3 @dbaron @annevk

asajeffrey commented 4 years ago

(@cabanier is at Oculus/Facebook, not Mozilla)

hotsphink commented 4 years ago

Can weak references solve this problem more elegantly?

No, they can't. See below.

From the proposal:

The WeakRef class is tailored for the web’s event-loop based programming. A dereferenced WeakRef referent is guaranteed to be alive until the end of the current turn of the event loop.

I think you're misunderstanding this text. The referent is guaranteed to be alive until the end of the turn. Nothing guarantees that anything will be dead. Restated, it is "The referent is guaranteed to be artificially kept alive until at least the end of the current turn." It is almost the opposite of what you want.

And in practice, no implementations that I know of will kill things off sooner due to the existence of weakrefs to them. (It's not totally out of the question, but it wouldn't be what you want anyway -- it would be done with major GCs, so would probably result in more pauses, not less.)

asajeffrey commented 4 years ago

it would be done with major GCs, so would probably result in more pauses, not less.)

Ah, do minor GCs not collect weak refs? In that case, I retract my comment that weak refs would solve the problem. My reluctance to use weak refs in webxr specs still stands.

hotsphink commented 4 years ago

it would be done with major GCs, so would probably result in more pauses, not less.)

Ah, do minor GCs not collect weak refs?

Maybe I'm misunderstanding, because I don't see how this matters. How would you use weakrefs here?

hotsphink commented 4 years ago

I may have confused things with that parenthetical note about killing things off sooner. That was for a hypothetical situation where with the SpiderMonkey GC we have weakrefs with cross-compartment targets. It is theoretically possible that if all allocation happens in compartments other than those of the targets (eg the compartment with the weakrefs), then we would never collect the targets. That might cause cross-browser compatibility issues (despite being spec-compatible), so we could pessimize by choosing to collect nonallocating compartments if they contain weakref targets. Slower GCs, but more prompt finalization. But this is all hypothetical, is unrelated to what's going on here, and would be more likely to make your situation here worse rather than better. I was being pedantic, because I was claiming weakrefs never result in more prompt finalization, and that bothered me because it's not 100% true. Ignore me.

asajeffrey commented 4 years ago

@hotsphink the goal is to ensure that the per-frame objects generated by webxr don't trigger a major GC. The problem is that a common use case is that user code hangs on to those objects for 2 frames, so it can compare last frame's data with this frame's. This is fine in a three-generation GC, but is pretty much the nightmare scenario for a two-generation GC, since every minor GC will end up tenuring an object, eventually triggering a major GC.

There are a number of proposals for how to fix this, e.g. expiring all the data in the object (which makes the object useless after a frame, so user code is unlikely to keep it live), or issue a console warning if a per-frame object gets tenured, or @cabanier's suggestion to use weak refs.

littledan commented 4 years ago

(In general, we sort of expect all JavaScript implementations to have leaks/situations where they never collect something that feels logically dead. These are expected to differ across engines. Adding WeakRefs to JS included making the tradeoff that we're willing to eat this interop risk.)

asajeffrey commented 4 years ago

@littledan yeah, this is a hard problem, especially if we add a constraint of not breaking existing webxr content :(

hotsphink commented 4 years ago

@hotsphink the goal is to ensure that the per-frame objects generated by webxr don't trigger a major GC. ...etc...

Right, that's the part I understand. The problem makes sense to me.

or @cabanier's suggestion to use weak refs.

That's the part I don't understand. How would you use weak refs to help?

cabanier commented 4 years ago

or @cabanier's suggestion to use weak refs.

That's the part I don't understand. How would you use weak refs to help?

We want to make clear to the author that they are making a mistake by holding onto these objects. We're planning on adding console warnings if they hold onto these temporary objects but they might ignore them. But if holding onto weak refs will actually cause errors in their implementation (by random GC's), they will have to fix their code.

hotsphink commented 4 years ago

So the idea is create a weak ref to these objects during the same frame that they were created, and then check in the next frame if they have been collected?

cabanier commented 4 years ago

So the idea is create a weak ref to these objects during the same frame that they were created, and then check in the next frame if they have been collected?

No. We just want to pop up a warning (which we can do regardless of weak refs) and set it up so people get an error if they do hold onto them because the object will randomly expire.

Manishearth commented 4 years ago

Reading through the whole discussion, I'm wary of yielding weak refs to the user here.

XRFrames are useless outside of a frame, and it feels okay to allow UAs to pool them. I'm more wary about XRPoses, someone may wish to hold on to them for longer periods of time to get reference points.

Mentioning that it might be good to run GC after the frame might be good, though

RafaelCintron commented 4 years ago

For historical context: Back in the WebVR days, you allocated a frame object and passed it to the browser API to fill in.

The WebVR group received pushback from people outside of the group on this pattern. The crux of the pushback was that this is an anti-pattern on the web and that garbage collectors should be able to handle small amounts of per-frame objects without negatively affecting performance.

This pushback was what ultimately led to the WebXR group re-adding per-frame objects.

Maksims commented 4 years ago

that garbage collectors should be able to handle small amounts of per-frame objects without negatively affecting performance.

Sadly, there are many APIs can do this, and eventually there is a lot of amounts of per-frame objects, which eventually impacts performance.

If it is possible to avoid allocating anything for realtime applications - it is always a best thing to do.

Almost any engine and WebGL app developers will pre-allocate, and re-use objects where it is possible. It might be not web pattern. But it is definitely a realtime applications pattern, under which WebGL and WebXR falls in.

MortimerGoro commented 4 years ago

that garbage collectors should be able to handle small amounts of per-frame objects without negatively affecting performance.

They mentioned that when we only had one VRFrameData with internal arrays per frame but in WebXR we have many more objects created per frame. e.g. A simple WebXR demo easily creates all these objects per frame:

That's almost 30 new objects per frame. At 72hz in Oculus Quest it's about 2160 objects created per second. I don't think it's a trivial amount of objects created to still choose a "web pattern" over a realtime o graphics application pattern.

IMO the easiest solution would be to get all of the short-lived objects to follow the pattern of XRFrame (expire at end of frame) and make the users make a explicit copy if they want to hold them. Problem is that is not backwards compatible. But I haven't seen held XRViewerPose or XRPose objects in any of the XR demos I have tried, so maybe we are still on time to make such a change.

Manishearth commented 4 years ago

I think at a minimum we can allow the caching of XRViewport and XRFrame. Bit wary of caching XRViewerPose and XRView but it might be okay?

cabanier commented 4 years ago

A UA can also choose to allocate objects lazily so it only creates the ones that are used. That should end up creating far less than 2000 objects per frame.

I think at a minimum we can allow the caching of XRViewport and XRFrame. Bit wary of caching XRViewerPose and XRView but it might be okay?

Yes, if you call it with the exact same arguments on the same object, it might work. Maybe we should make that change in a trial build and run some tests to see if any experiences fail.

Manishearth commented 4 years ago

Yeah, for example an application only using matrices probably won't need DOMPoints to be constructed

RafaelCintron commented 4 years ago

@MortimerGoro and @Maksims I sympathize with your concerns as I was one of the people originally pushing for more object reuse in WebVR.

Adding @bfgeek and @esprehn to the conversation as I know they felt strongly about the issue back then.