JS Self-Profiling API - Githubissues

acomminos commented 3 years ago

Request for Mozilla Position on an Emerging Web Specification

Specification Title: JS Self-Profiling API
Specification or proposal URL:
- Draft spec: https://wicg.github.io/js-self-profiling/
- Explainer: https://github.com/WICG/js-self-profiling/
Mozillians who can provide input: @mstange
Bugzilla URL: https://bugzilla.mozilla.org/show_bug.cgi?id=1393617

Other information

TAG review: https://github.com/w3ctag/design-reviews/issues/366
This API is implemented in Blink, and was run in an origin trial.
~Due to concerns about exposing sensitive timing information, we've decided from discussions in the Web Perf Working Group to require cross-origin isolation for the API.~ We believe this API exposes no more signal than is already possible through manual instrumentation -- see the following security review from the Edge team.
@mstange and I spoke a while back regarding how we could avoid introducing profiling-related overhead on pages that wouldn't use the API. We settled on requiring a header, which is exposed via the document policy mechanism -- hopefully this should address concerns related to any additional up-front work Gecko may have to do to enable this API at the SpiderMonkey level.

Thank you!

annevk commented 3 years ago

Has this been reviewed by TC39? cc @codehag

From a quick review it seems this proposal lacks a processing model. I filed a number of issues around that and other things I spotted.

codehag commented 3 years ago

This proposal wasn't reviewed by TC39, afaik and in relation to what I could find in the notes. I took a quick look at the proposal itself, but will probably need a bit more time to think about it.

It looks like we still have an open issue about this same topic: https://github.com/mozilla/standards-positions/issues/92, @martinthomson should probably take a look at it since then.

martinthomson commented 3 years ago

This is a question that @annevk is best suited to answer as the response to side-channel attacks is process isolation. That's the same mitigation we are using for SharedArrayBuffer and finer-grained high resolution timers. Given the nature of the proposal, I would also like to understand what @codehag concludes also (or anyone who might be more familiar with SpiderMonkey and profiler).

annevk commented 3 years ago

I filed https://github.com/WICG/js-self-profiling/issues/38 on the security question as I don't really understand the security section.

sefeng211 commented 3 years ago

Side note: Apple was strongly against this API during TPAC 2020.

acomminos commented 3 years ago

Thanks for the feedback provided thus far, folks. I'll chip away at responding to the issues filed.

It looks like we still have an open issue about this same topic: #92

Ah, missed this. Apologies!

Side note: Apple was strongly against this API during TPAC 2020.

It's worth noting that their primary objection was reducing performance for users that were profiled (edit: see @rniwa's comment below). This seems like a UA-specific position that likely depends on VM characteristics.

mstange commented 3 years ago

Last time I discussed this with other members of the performance team (jesup being one of them), the general feedback I got was "maybe, but let's ship Fission first" - i.e. we really don't want to enable this as long as code from other web pages can run in the same process.

My other concerns were:

Consistency of stack frame names (especially anonymous functions) across browsers.
Performance impact from profiling, possibly even varying depending on what JIT tier the profiled code runs in.
Robustness / stability concerns of the current Firefox implementation.

For 1, I think Andrew and Benoit assured me that sites would need browser-specific pipeline for this data anyway, and profiles from different browsers wouldn't be mixed, so different stack frame names would not be an isue. For 2, I think the argument was "it's not going to be worse than our polyfill". 3 is a problem with our implementation - we've never fuzzed our profiler and there might be security bugs or other stability issues in our implementation that we would need to fix before exposing it to the web.

rniwa commented 3 years ago

FWIW, Apple's WebKit team does not support this API due to including but not limited to the following reasons:

Enabling our sampling profiler incurs the performance cost of ~5% due to the added runtime cost of gathering backtrace samples and due to having to disable certain optimizations like inline. We don't want websites to be slowing down even a subset of users' device by enabling this feature. Even if some websites are already doing it, we don't want to make it easier by adding Web API.
The use of this API will result in a large memory usage for having to gather backtraces and summarize the result.
Native apps don't have a mechanism like this yet they're capable of optimizing their apps. There were a number of discussions in Web Perf WG about long tasks but many long tasks are attributable to cross-origin content (e.g. ad). We clearly can't allow this feature to sample scripts from other origins.
There are concerns regarding potential disclosure of security sensitive timing information such as how long a given code took to compile, how compiler optimized given code, etc... which are a lot harder to infer today.

shhnjk commented 3 years ago

4. There are concerns regarding potential disclosure of security sensitive timing information such as how long a given code took to compile, how compiler optimized given code, etc... which are a lot harder to infer today.

I did a security review of this API, and I think this is hard today, because the sampling interval is implementation dependent (i.e. minimum 10 milliseconds for Chromium).

rniwa commented 3 years ago

There are concerns regarding potential disclosure of security sensitive timing information such as how long a given code took to compile, how compiler optimized given code, etc... which are a lot harder to infer today.

I did a security review of this API, and I think this is hard today, because the sampling interval is implementation dependent (i.e. minimum 10 milliseconds for Chromium).

I'm a bit confused about this commentary. Surely, an attacker will target a specific engine & attacker controlled called, in which case, they might be able to infer a lot of side channel information about the state of the compiler or the compiled code. To be clear, I'm not talking about inferring information about the contents of cross site/origin content and such. I'm specifically talking about attacking user's device.

shhnjk commented 3 years ago

There are concerns regarding potential disclosure of security sensitive timing information such as how long a given code took to compile, how compiler optimized given code, etc... which are a lot harder to infer today.

I did a security review of this API, and I think this is hard today, because the sampling interval is implementation dependent (i.e. minimum 10 milliseconds for Chromium).

I'm a bit confused about this commentary. Surely, an attacker will target a specific engine & attacker controlled called, in which case, they might be able to infer a lot of side channel information about the state of the compiler or the compiled code. To be clear, I'm not talking about inferring information about the contents of cross site/origin content and such. I'm specifically talking about attacking user's device.

The API gives you timing information of which function are executing at the time of sampling. If an attacker wants to know that information, it's a lot simpler to add performance.now to beginning and the end of each function, which will provide much more precise timing information.

Basically, the limitation of sampleInterval makes it less interesting for attackers.

rniwa commented 3 years ago

I'm a bit confused about this commentary. Surely, an attacker will target a specific engine & attacker controlled called, in which case, they might be able to infer a lot of side channel information about the state of the compiler or the compiled code. To be clear, I'm not talking about inferring information about the contents of cross site/origin content and such. I'm specifically talking about attacking user's device.

The API gives you timing information of which function are executing at the time of sampling. If an attacker wants to know that information, it's a lot simpler to add performance.now to beginning and the end of each function, which will provide much more precise timing information.

Basically, the limitation of sampleInterval makes it less interesting for attackers.

This isn't necessary the case for JSC / WebKit but I don't want to pollute Mozila's issue tracker with specific details of how JSC / WebKit's sampling profiler or our JIT compilers work so I'd leave it at that. All I can say is that in our engine, these are legitimate concerns.

mozilla / standards-positions

JS Self-Profiling API #477

Request for Mozilla Position on an Emerging Web Specification

Other information