mozilla / standards-positions

https://mozilla.github.io/standards-positions/

performance.memory #85

Closed: past closed this issue 5 years ago

past commented 6 years ago

Request for Mozilla Position on an Emerging Web Specification

Other information

The main goal of this specification is to "provide developers a mechanism to identify releases of their site that regress memory usage". Chrome has implemented an older proposal that has significant issues with privacy and accuracy. Some Google properties already use this API to capture in-the-wild memory usage data.
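
For reference, the older Chrome-only API referred to above is the non-standard performance.memory object, which only exposes V8 heap numbers. A minimal sampling sketch:

```js
// Feature-detect first: only Chrome exposes the non-standard object.
if (performance.memory) {
  const { usedJSHeapSize, totalJSHeapSize, jsHeapSizeLimit } = performance.memory;
  console.log(`JS heap: ${usedJSHeapSize} of ${totalJSHeapSize} bytes (limit ${jsHeapSizeLimit})`);
}
```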

There has been some related discussion recently in blink-dev.

Open questions

  1. Do people want fast measurements or detailed measurements (that may jank the browser)?
  2. How can we achieve all three of: zero space/time overhead when not queried, O(1) time (in the amount of allocated memory) when queried, and accurate measurements?

Some notes about a potential implementation:

One explicit non-goal for this specification is comparing memory measurements between different browsers.

marcoscaceres commented 6 years ago

@past, anyone in particular you think would be good to get input from? Do you have a personal opinion about this API?

froydnj commented 6 years ago

@glandium, @nnethercote, or @EricRahm may have comments surrounding platform specifics.

Regarding open questions, it looks like the API (at least for memory.privateMemoryFootprint) was designed for zero overhead when not being queried, and constant-time performance when queried. The API also looks like it was designed for fast measurement; detailed reporting appears to have been discarded as too difficult to implement cross-platform. So jankiness shouldn't be a problem.

Regarding implementation notes, we could probably reuse some of the infrastructure that about:memory or Telemetry's memory reports rest on, but having to cache the memory reports shouldn't be necessary.

glandium commented 6 years ago

The API pretty much relies on the fact that Chrome uses one process per site.

EricRahm commented 6 years ago
  1. The language in Appendix D is definitely Chrome-dependent.
  2. I'm also not convinced that memory should be reported per frame. It seems like you'd want the memory usage for your overall page, including any iframes, but that's debatable.
  3. Roughly equating overall process memory with overall page memory is a bit sketchy. For instance, some Firefox content processes will contain Activity Stream info, but not all; that memory has nothing to do with the page.
  4. Excluding all shared memory might be a mistake. I agree that ignoring bits shared by the OS makes sense, but mmappings used for graphics maybe shouldn't be ignored. A possibly flawed example: a site that makes heavy use of canvases; you'd probably want that memory overhead reported even if it's backed by shared memory.

@glandium do you have any feedback on the proposed implementation, as far as terminology and the portions they plan on measuring (including the pseudo-code)? For example, I'm not sure using PrivateUsage is the right thing on Windows (it's commit charge rather than resident-unique).

glandium commented 6 years ago

I don't think the details of what they want to measure matter much, because well before that, the assumption that the process<->site model is that of Chrome pretty much kills it as a useful cross-browser standard from my POV.

bzbarsky commented 6 years ago

I would like to urge everyone to read the thread at https://groups.google.com/a/chromium.org/forum/#!searchin/blink-dev/memory$20api/blink-dev/no00RdMnGio/hA9SZ3kmBAAJ

The basic problem with the initial proposals there was that they were basically tied to V8's memory stuff.

The current proposal is sort of an attempt to define an API that satisfies the three properties I listed at https://groups.google.com/a/chromium.org/d/msg/blink-dev/no00RdMnGio/b1Wvn2_uBwAJ. You're right that it assumes a particular process model; a process model that no browser actually ships right now. This is somewhat better for interop than assuming a particular JS implementation, especially if browsers were to converge on that process model.

That said, I think the Google folks would welcome suggestions for ways to improve this stuff and get it to the point where it could be standardized. If we have concrete suggestions we should offer them...

annevk commented 6 years ago

Given https://github.com/mozilla/standards-positions/issues/81#issuecomment-380328972 I suspect at least some Mozillians would object to making (parts of) the web platform dependent on a particular architecture. It doesn't seem ideal to me either.

EricRahm commented 6 years ago

@annevk I don't see the link with CORB. Finding a memory measurement that is useful cross-browser/cross-platform is pretty much a lost cause. Each browser has a different engine, each browser defines a different algorithm for how many sites are in a content process (if at all), each OS defines different rules for how memory is allocated, freed, and cached.

Defining a consistent way for sites to perform measurement, one that is useful even if it's only good for comparisons within a given browser on a given platform, seems like an okay goal. This would be a huge win: it shifts the triage of memory regressions from browser developers (read: me) to the sites themselves.

glandium commented 6 years ago

I would like to urge everyone to read the thread at https://groups.google.com/a/chromium.org/forum/#!searchin/blink-dev/memory$20api/blink-dev/no00RdMnGio/hA9SZ3kmBAAJ

That thread was instructive, because it wasn't clear to me what this spec was meant to achieve.

The current proposal is sort of an attempt to define an API that satisfies the three properties I listed at https://groups.google.com/a/chromium.org/d/msg/blink-dev/no00RdMnGio/b1Wvn2_uBwAJ.

AIUI, those properties are what prompted the API to rely on system metrics. OTOH, using system metrics breaks most of the interesting points in the requirements:

Context free: Actions that a user takes on other sites or on other applications should not affect the memory footprint of the site calling the API.

Without the Chrome process model, that can't be fulfilled with system metrics.

Comprehensive: Should account for all memory allocated on behalf of the web page, including memory required by the browser's internal structures.

System metrics can't give you this information for internal structures used on behalf of the web page in other processes.

Actionable: (...)

With "Context free" being out of the picture outside Chrome, I'm afraid getting a signal from the aggregate is only possible for sites the size of Facebook or Gmail (both mentioned in the thread). That's good for such large sites, but less so for the web at large.

Definition consistent on all supported platforms.

I'm not sure system metrics actually allow consistency here.

Now, the problem is that, if not system metrics, then what? I'm afraid the answer is that you can only pick one of the properties you listed. Maybe two if you're lenient. I guess the question becomes what are the possible tradeoffs to make this useful to more than Facebook and Gmail in Chrome. I don't have an answer to that, but I'm not convinced this spec does that.

bzbarsky commented 6 years ago

Finding a memory measurement that is useful cross-browser/cross-platform is pretty much a lost cause.

Part of the problem is that any API like this that exists across browsers will get used for cross-browser comparisons. :(

I'm afraid the answer is that you can only pick one of the properties you listed. Maybe two if you're lenient.

Yeah, that's about where I was...

bzbarsky commented 6 years ago

System metrics can't give you this information for internal structures used on behalf of the web page in other processes.

Oh, and this is an important point, and I think we should raise it with the Chrome folks now. I hadn't thought of this, and it does make the entire "use the system metrics for the process" approach not really right even in the best case.

glandium commented 6 years ago

Some more thoughts. It seems this (old) blog post covers the use cases for this: http://www.html5rocks.com/en/tutorials/memory/effectivemanagement/. There are essentially two things in there:

The latter, as hinted in my previous comment, may require a lot of data, even if the measures are perfect.

Part of the problem is that any API like this that exists across browsers will get used for cross-browser comparisons. :(

All of the above tempts me to say, half-jokingly, that an API returning a "bag of JSON, consult your browser vendor" might actually be a better option.
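
(To illustrate the half-joke, such a vendor-specific payload might look like the sketch below; every field name here is hypothetical.)

```js
// Hypothetical "bag of JSON": every key is vendor-defined and only
// comparable within a single browser and version.
const report = {
  vendor: "ExampleBrowser 123.0",    // consult your browser vendor
  schemaVersion: "example-memory-v1",
  measurements: {
    "js-heap": 12345678,             // bytes, by this vendor's definition
    "dom": 2345678,
    "other": 345678,
  },
};
```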

amccreight commented 6 years ago

froydnj kind of alluded to this, but I thought I'd point out that in Firefox we could actually report a good chunk of per-site memory, including JS and DOM (including things like memory entrained by arrays and the like), with little additional overhead, and without assuming that a process only contains a single website. This is because we're periodically running our memory reporter for telemetry purposes anyways, and it gathers that information. (If you look at about:memory in Firefox, you can see what is being gathered, though for telemetry purposes the site names are removed, which I guess would be an issue if we wanted to use the data for another purpose that does care about the particular site.)

amccreight commented 6 years ago

https://github.com/WICG/performance-memory/issues/7 and https://github.com/WICG/performance-memory/issues/8 are interesting reading about what two sites are interested in knowing via a memory API.

glandium commented 6 years ago

This is because we're periodically running our memory reporter for telemetry purposes anyways, and it gathers that information

I wonder how long that takes on my profile, because about:memory takes multiple seconds. But I don't have telemetry enabled, so I've never noticed we did that. I sure hope most of the time spent in about:memory is on the JS side of things.

amccreight commented 6 years ago

The telemetry version skips some things that make it slow, like hashing strings.

EricRahm commented 6 years ago

FWIW, our memory measurement is mostly asynchronous, so it shouldn't really be causing jank. Perhaps it makes sense to declare it asynchronous in the spec?

tschneidereit commented 6 years ago

[I]n Firefox we could actually report a good chunk of per-site memory, including JS and DOM

Note that we share some JS data across domains, notably functions' bytecode. Hopefully we'll be able to transfer some or all of that into a post-Fission world. I'm not sure if that makes a difference for this use case, as long as we include the shared memory in the reported numbers, but it's something to keep in mind.

dbaron commented 6 years ago

One other thought about how to deal with the fact that many of these measurements may be browser-specific and may change over time within a given browser:

past commented 6 years ago

We discussed this proposal at the web-perf call yesterday and, based on Apple's feedback and ours (I summarized this thread), there is going to be another iteration to try to smooth things out. Specifically, the spec will be less prescriptive about process models and system metrics, and the asynchronicity aspect will be added. I will update the thread when there is a new proposal to consider.

One point in this thread that I'm not sure I understood well enough is the one @glandium raised about this API being useful only to large sites. Is this about sites that require large amounts of memory, or about sites that users spend a lot of time on (hence more opportunities for signal collection)? If it's the latter, a point that was raised is that PWAs will let long-tail web sites gather enough signal, too.

Given the difficulty of reconciling web sites' hunger for more information with browsers' reluctance to dole it out, I suggested that it may be useful to consider whether an MVP-1 spec would be implementable. That is, if even a very limited spec, one that web sites would find too constrained to be useful, failed to get browser consensus on implementability, then perhaps this effort is not going to succeed after all.

wanderview commented 6 years ago

@past Does the spec handle resources that might be shared by multiple open tabs for a single origin? For example, SharedWorker can be attached to multiple tabs simultaneously. Do they both count that memory or is it divided or is it ignored? What about a service worker that is not explicitly attached to a tab, but might be controlling zero or more tabs of the origin?

(Sorry if this was covered already. I've only been skimming this thread.)

Edit: these are also recurring issues we run into with devtools when trying to map resources to a single tab.

glandium commented 6 years ago

@past what I was saying is that the spec as described, using system metrics without the Chrome process model, would yield a lot of noise in the data due to other sites, browser internals, etc. (mostly the former). For the API to be at all useful in that case, you need large amounts of data to compensate for the large noise.

dbaron commented 6 years ago

I think it's more than just noise; it's signal: if the spec requires that we include everything in the process even if we know it's for a different site, then a memory regression in Gmail or Facebook will show up as a memory regression in every site's metrics for Firefox users, since some significant portion of Firefox users will have Gmail or Facebook in the same process.

I think it's important that the spec lets us give sites metrics that are useful for them to be looking at given our process model.

past commented 6 years ago

There have been spec updates in the last few days based on the feedback from the WG. They are very much in line with what we have been discussing here, so I would encourage folks to take another look and see how it holds up now.

@wanderview the spec mentions shared memory, but not workers explicitly. It recommends that all shared memory be attributed to a single owner. The service worker point that you made sounds like something we should raise an issue for.

past commented 6 years ago

The updated spec suggests at least 30-second time quantization. How frequently do we update the telemetry data we collect around memory usage?
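
(My reading of the quantization requirement, as a sketch: the UA refreshes the value it hands out at most once per 30-second quantum, so polling faster just returns a cached figure. Both function names below are hypothetical.)

```js
// Sketch: 30 s time quantization as a UA might implement it.
// getMemoryEstimate() and uaInternalMeasure() are hypothetical names.
const QUANTUM_MS = 30_000;
let cachedEstimate = null;
let lastRefresh = -Infinity;

async function getMemoryEstimate() {
  if (performance.now() - lastRefresh >= QUANTUM_MS) {
    cachedEstimate = await uaInternalMeasure(); // hypothetical internal hook
    lastRefresh = performance.now();
  }
  return cachedEstimate;
}
```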

wanderview commented 6 years ago

I filed https://github.com/WICG/performance-memory/issues/17 for the SharedWorker/ServiceWorker issue. I recommended possibly reporting memory for each client or environment independently.

wanderview commented 6 years ago

Note, the discussion in https://github.com/WICG/performance-memory/issues/17 suggests that perhaps the proposal does not clearly define how this API should work across origin/site/domain boundaries. I think that must be defined in order for us to consider prototyping, etc.

pweis8 commented 6 years ago

Google G Suite engineer here. We're very excited to see this proposal take shape. To help motivate our use cases for this API, we've compiled a document summarizing our existing uses of the Chrome-only API, its shortcomings, and how a standardized API would provide significant value to us and across the web (already referenced earlier here via WICG/performance-memory#8).

Quick summary of our main use cases: A/B testing, detection of regressions in our apps, and better understanding of memory usage from real users.
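
As a sketch of that last use case, our in-the-wild collection with today's Chrome-only API boils down to something like the following; the endpoint, sampling rate, and release tag are made-up examples:

```js
// Sketch: sample real-user memory with the legacy Chrome-only API and
// beacon it home. The endpoint, 1% sampling, and release tag are made up.
function sampleMemory() {
  if (!performance.memory || Math.random() >= 0.01) return;
  navigator.sendBeacon("/metrics/memory", JSON.stringify({
    usedJSHeapSize: performance.memory.usedJSHeapSize,
    release: "site-build-1234", // lets releases and A/B arms be compared
  }));
}
setInterval(sampleMemory, 60_000);
```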

A couple of brief comments from our perspective on two issues raised in this thread:

erikchen commented 6 years ago

Hi!

I worked with wanderview@ to clean up the proposal. Please take another look.

Question: What would be an appropriate communication medium for me to follow up on comments and feedback from this group?

Aside: All of your feedback has been insightful and helpful. As someone new to the web-spec space, I've been heartened by this friendly community. Thanks again for the support! =D

past commented 5 years ago

Could people please take another look at the updated proposal and see if there is anything objectionable there?

bzbarsky commented 5 years ago

I just took a look; it looks like the updated proposal is trying to take the privacy issues quite seriously, which I appreciate.

I have two questions about the spec idl:

  1. Why a callback instead of a promise return value?

  2. Why null as the "no info" value instead of 0? They seem equally easy to test, but null has the problem that it wouldn't be too hard to accidentally write code that works as long as the value is not null but throws when it is, and then only test it in cases where null is not returned.

Apart from that, the main question is how implementable the must-level requirements about including everything are....

erikchen commented 5 years ago

Why a callback instead of a promise return value?

I'm insufficiently familiar with standards best practices. I'll switch it to a promise.

Why null as the "no info" value instead of 0?

Since we're adding normalized noise, it's possible for the return value to be 0 [or even negative] and to have different semantics from null.
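
In code, the intended distinction would look something like this [a sketch; the promise-returning getMemoryEstimate() and the logging helper are assumed names]:

```js
// null means "the UA declined to measure"; 0 or even a small negative
// number is a real measurement plus noise.
async function recordMemory() {
  const estimate = await performance.getMemoryEstimate();
  if (estimate === null) return;        // no info: skip the data point
  logDataPoint(Math.max(0, estimate));  // clamp noise-induced negatives
}
```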

Apart from that, the main question is how implementable the must-level requirements about including everything are....

For Chrome's implementation, I was intending to rely on the assumption of site-per-process isolation, at which point we can mostly just use process memory metrics, with some touch-ups. We'll still need to return "null" for certain edge cases.

I'm insufficiently familiar with Firefox's implementation to know whether the current form of the spec is possible to implement.

If it helps -- achieving a "perfect" measurement of memory has never been the goal [and quickly devolves into semantics discussions]. I mostly want Web Developers to be able to catch large, obvious regressions [e.g. accidentally retaining 100 canvases, unconditionally allocating a large array buffer on startup].

The one case I do think we need to get right is including memory for related similar-origin browsing contexts [e.g. same origin subframe], which supports synchronous scripting and therefore can participate in retain cycles, and not including memory for different-origin browsing contexts [e.g. ads], which would add too much noise.

We should probably have a conformance test suite that checks all the cases we care about, e.g.:

  1. Start with an empty site. Add a million DOM nodes. Memory estimate should go up.
  2. Start with an empty site. Add 100 canvases [1000x1000, draw into them to make sure the backing buffer is dirty]. Memory estimate should go up.
  3. Start with an empty site. Add a different-origin iframe. In that iframe, make a million DOM nodes. Memory estimate should not change.

Etc.
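
A rough sketch of the first case [assertGrows is a hypothetical harness helper, and a real test would need tolerances for noise and quantization]:

```js
// Case 1: adding a million DOM nodes should raise the estimate.
// assertGrows() is a hypothetical test-harness helper.
async function assertGrows(mutate) {
  const before = await performance.getMemoryEstimate();
  mutate();
  const after = await performance.getMemoryEstimate();
  console.assert(after > before, "memory estimate should go up");
}

assertGrows(() => {
  for (let i = 0; i < 1_000_000; i++) {
    document.body.appendChild(document.createElement("div"));
  }
});
```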

bzbarsky commented 5 years ago

I was intending to rely on the assumption of site-per-process isolation, at which point we can mostly just use process memory metrics

I guess I'm not sure how that goes along with the "Memory estimate should be directly attributable to the web developer" guideline. There's a bunch of process memory that web developers don't really have much control over, if any. I guess that's a guideline, not a normative requirement...

Also not sure how this would handle things like GPU memory (relevant to the canvases example).

I agree that there might not be perfect answers here; just want to make sure the spec doesn't paint browsers into corners, mostly. I think that it's in pretty decent shape at this point, for what it's worth.

erikchen commented 5 years ago

I guess I'm not sure how that goes along with the "Memory estimate should be directly attributable to the web developer" guideline. There's a bunch of process memory that web developers don't really have much control over, if any. I guess that's a guideline, not a normative requirement...

Good point. I've been implicitly relying on the assumption that the absolute value does not matter, so any fixed process overhead that gets included is not relevant. We could theoretically measure process memory right before loading any third-party code, and use that as a baseline [subtracting it from future measurements]. Then there's the matter of non-fixed process memory not directly attributable to the web developer. I think that mostly becomes a semantics discussion about the definition of "attributable".
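
As a sketch of that baseline idea [getMemoryEstimate() is again the assumed promise-based shape]:

```js
// Record a baseline before third-party code loads, then report deltas
// so fixed process overhead cancels out of every later measurement.
let baseline = null;

async function captureBaseline() {
  baseline = await performance.getMemoryEstimate();
}

async function attributableEstimate() {
  const current = await performance.getMemoryEstimate();
  if (baseline === null || current === null) return null;
  return current - baseline;
}
```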

Also not sure how this would handle things like GPU memory (relevant to the canvases example).

In Chrome, we have pretty good tracking of memory that crosses process boundaries [canvas, WebGL] and I was planning on adding those onto the process memory measurement to get the final result.

I agree that there might not be perfect answers here; just want to make sure the spec doesn't paint browsers into corners, mostly. I think that it's in pretty decent shape at this point, for what it's worth.

Thanks!

I guess for most specs, we want there to be a lot of normative text so that browsers have the same behavior. Maybe this API is an exception? As long as the implementation is internally consistent and comprehensive [and helps web developers find bugs!], the exact behavior is less important. Differences between browser vendors' implementations are expected, since memory is an implementation detail.

Maybe we should add a disclaimer like this to the proposal?

bzbarsky commented 5 years ago

There's non-fixed stuff that may still not really be under the control of the web page (thread stacks in thread pools, font caches, etc.).

I think that mostly becomes a semantics discussion about the definition of "attributable".

Right.

Maybe we should add a disclaimer like this to the proposal?

Makes sense to me.

overholt commented 5 years ago

One thing @smaug---- brought up in conversation with me is that getMemoryEstimate doesn't imply the estimate is very UA- (and UA build-) specific. I should open this on the other repo ...

erikchen commented 5 years ago

Disclaimer added: https://github.com/WICG/performance-memory/pull/18/files

getMemoryEstimate doesn't imply the estimate is very UA- (and UA build-) specific.

The current proposal states this several times. Did you mean the name of the API itself? What about getMemoryEstimateUASpecific()?

overholt commented 5 years ago

Yes, I meant the name of the method :) What you propose is fine but I'm sure others more knowledgeable will have input (sorry to have asked for bikeshedding).

erikchen commented 5 years ago

overholt: No one else has voiced opinions, so I've gone ahead and changed to getMemoryEstimateUASpecific.

wanderview commented 5 years ago

Not to continue bikeshedding, but I think adding "UASpecific" makes the name excessively long. Not a major issue, I guess.

Another way to make it clear this is UA specific is to take the absolute units off. Instead of reporting MB, simply report a unitless UA-specific "memory load" value. It would be useful for comparing within a single browser, but not across browsers.

smaug---- commented 5 years ago

How is that clear? People would still assume the number means something reasonable and compare between UAs. If the API returns something very UA- and even UA-version-specific, it is better to say so.

In other specs where the spec says some value is UA-specific and arbitrary, we've still seen web devs complain when the API doesn't return the same or similar values in all UAs (PointerEvents' pointerId, for example).

wanderview commented 5 years ago

Ok. Happy to see this go forward without my comments.

birtles commented 5 years ago

One very naive question: is there any need for this API to be available in regular browsing sessions? For the purpose of memory regression testing, would it be sufficient to enable it only when, for example, a certain pref / command-line flag is set?

marcoscaceres commented 5 years ago

One very naive question: is there any need for this API to be available in regular browsing sessions?

In such a case, we wouldn't need a standard. We can just build our own custom thing.

birtles commented 5 years ago

One very naive question: is there any need for this API to be available in regular browsing sessions?

In such a case, we wouldn't need a standard. We can just build our own custom thing.

By that logic we wouldn't need standards such as WebDriver either.

marcoscaceres commented 5 years ago

By that logic we wouldn't need standards such as WebDriver either.

I believe that was debated :) But seriously, is the value here for web developers or browser developers? And if it's for browser developers, what's the value of having shared tests across browsers related to memory?

erikchen commented 5 years ago

This is intended for web developers. Facebook, Gmail, and G Suite have expressed interest in using the API. Memory usage of these sites depends on the content being displayed, and local testing may not be representative of real usage in the wild.

erikchen commented 5 years ago

Status update: I am pausing work on the API proposal.

In the blink-dev intent-to-implement thread, the concern was raised that the API would leak information about the size of cross-origin resources.

One proposal to mitigate this is to gate the API behind a feature policy that turns all resource requests into CORS requests. This seems generally desirable anyway, but it is a long way off from being possible on the majority of websites.

Another possibility would be to manually track the size of cross-origin resources [and associated metadata] and exclude them from the total. That is prone to error and edge cases -- I don't want potentially easy-to-commit errors to have severe privacy implications.

Figuring out how to help Web Developers reduce the memory usage of their sites still seems like a high priority, but this doesn't seem to be the right approach.

Thank you all for the support during this process, and the many iterations on the proposal. Hopefully we'll get another chance to work together in the future. :)

marcoscaceres commented 5 years ago

Closing, as incubation of this proposal was halted.