mozilla / standards-positions

https://mozilla.github.io/standards-positions/
Mozilla Public License 2.0
611 stars 69 forks source link

JS Self-Profiling API #477

Open acomminos opened 3 years ago

acomminos commented 3 years ago

Request for Mozilla Position on an Emerging Web Specification

Other information

Thank you!

annevk commented 3 years ago

Has this been reviewed by TC39? cc @codehag

From a quick review it seems this proposal lacks a processing model. I filed a number of issues around that and other things I spotted.

codehag commented 3 years ago

This proposal wasn't reviewed by TC39, afaik and in relation to what I could find in the notes. I took a quick look at the proposal itself, but will probably need a bit more time to think about it.

It looks like we still have an open issue about this same topic: https://github.com/mozilla/standards-positions/issues/92, @martinthomson should probably take a look at it since then.

martinthomson commented 3 years ago

This is a question that @annevk is best suited to answer as the response to side-channel attacks is process isolation. That's the same mitigation we are using for SharedArrayBuffer and finer-grained high resolution timers. Given the nature of the proposal, I would also like to understand what @codehag concludes also (or anyone who might be more familiar with SpiderMonkey and profiler).

annevk commented 3 years ago

I filed https://github.com/WICG/js-self-profiling/issues/38 on the security question as I don't really understand the security section.

sefeng211 commented 3 years ago

Side note: Apple was strongly against this API during TPAC 2020.

acomminos commented 3 years ago

Thanks for the feedback provided thus far, folks. I'll chip away at responding to the issues filed.

It looks like we still have an open issue about this same topic: #92

Ah, missed this. Apologies!

Side note: Apple was strongly against this API during TPAC 2020.

It's worth noting that their primary objection was reducing performance for users that were profiled (edit: see @rniwa's comment below). This seems like a UA-specific position that likely depends on VM characteristics.

mstange commented 3 years ago

Last time I discussed this with other members of the performance team (jesup being one of them), the general feedback I got was "maybe, but let's ship Fission first" - i.e. we really don't want to enable this as long as code from other web pages can run in the same process.

My other concerns were:

  1. Consistency of stack frame names (especially anonymous functions) across browsers.
  2. Performance impact from profiling, possibly even varying depending on what JIT tier the profiled code runs in.
  3. Robustness / stability concerns of the current Firefox implementation.

For 1, I think Andrew and Benoit assured me that sites would need browser-specific pipeline for this data anyway, and profiles from different browsers wouldn't be mixed, so different stack frame names would not be an isue. For 2, I think the argument was "it's not going to be worse than our polyfill". 3 is a problem with our implementation - we've never fuzzed our profiler and there might be security bugs or other stability issues in our implementation that we would need to fix before exposing it to the web.

rniwa commented 3 years ago

FWIW, Apple's WebKit team does not support this API due to including but not limited to the following reasons:

  1. Enabling our sampling profiler incurs the performance cost of ~5% due to the added runtime cost of gathering backtrace samples and due to having to disable certain optimizations like inline. We don't want websites to be slowing down even a subset of users' device by enabling this feature. Even if some websites are already doing it, we don't want to make it easier by adding Web API.
  2. The use of this API will result in a large memory usage for having to gather backtraces and summarize the result.
  3. Native apps don't have a mechanism like this yet they're capable of optimizing their apps. There were a number of discussions in Web Perf WG about long tasks but many long tasks are attributable to cross-origin content (e.g. ad). We clearly can't allow this feature to sample scripts from other origins.
  4. There are concerns regarding potential disclosure of security sensitive timing information such as how long a given code took to compile, how compiler optimized given code, etc... which are a lot harder to infer today.
shhnjk commented 3 years ago

4. There are concerns regarding potential disclosure of security sensitive timing information such as how long a given code took to compile, how compiler optimized given code, etc... which are a lot harder to infer today.

I did a security review of this API, and I think this is hard today, because the sampling interval is implementation dependent (i.e. minimum 10 milliseconds for Chromium).

rniwa commented 3 years ago
  1. There are concerns regarding potential disclosure of security sensitive timing information such as how long a given code took to compile, how compiler optimized given code, etc... which are a lot harder to infer today.

I did a security review of this API, and I think this is hard today, because the sampling interval is implementation dependent (i.e. minimum 10 milliseconds for Chromium).

I'm a bit confused about this commentary. Surely, an attacker will target a specific engine & attacker controlled called, in which case, they might be able to infer a lot of side channel information about the state of the compiler or the compiled code. To be clear, I'm not talking about inferring information about the contents of cross site/origin content and such. I'm specifically talking about attacking user's device.

shhnjk commented 3 years ago
  1. There are concerns regarding potential disclosure of security sensitive timing information such as how long a given code took to compile, how compiler optimized given code, etc... which are a lot harder to infer today.

I did a security review of this API, and I think this is hard today, because the sampling interval is implementation dependent (i.e. minimum 10 milliseconds for Chromium).

I'm a bit confused about this commentary. Surely, an attacker will target a specific engine & attacker controlled called, in which case, they might be able to infer a lot of side channel information about the state of the compiler or the compiled code. To be clear, I'm not talking about inferring information about the contents of cross site/origin content and such. I'm specifically talking about attacking user's device.

The API gives you timing information of which function are executing at the time of sampling. If an attacker wants to know that information, it's a lot simpler to add performance.now to beginning and the end of each function, which will provide much more precise timing information.

Basically, the limitation of sampleInterval makes it less interesting for attackers.

rniwa commented 3 years ago

I'm a bit confused about this commentary. Surely, an attacker will target a specific engine & attacker controlled called, in which case, they might be able to infer a lot of side channel information about the state of the compiler or the compiled code. To be clear, I'm not talking about inferring information about the contents of cross site/origin content and such. I'm specifically talking about attacking user's device.

The API gives you timing information of which function are executing at the time of sampling. If an attacker wants to know that information, it's a lot simpler to add performance.now to beginning and the end of each function, which will provide much more precise timing information.

Basically, the limitation of sampleInterval makes it less interesting for attackers.

This isn't necessary the case for JSC / WebKit but I don't want to pollute Mozila's issue tracker with specific details of how JSC / WebKit's sampling profiler or our JIT compilers work so I'd leave it at that. All I can say is that in our engine, these are legitimate concerns.