w3c / sensors

Generic Sensor API
https://www.w3.org/TR/generic-sensor/

JavaScript 120 Hz devicemotion events for high-end inertial applications #98

Open lukes3315 opened 8 years ago

lukes3315 commented 8 years ago

Browsers only sample sensor data at a varying 67 Hz (sometimes dropping to 1 Hz), which makes the readings unusable for high-end use cases... We need a robust, steady 120 Hz sampling rate for our human inertial navigation technology. iOS and Android offer this natively, but current browser technology apparently does not match the quality of native sampling... Sadly :(

tobie commented 8 years ago

Thanks for your input. This is addressed by the spec for sensors which support periodic reporting mode.

Which particular sensors would you need to poll at 120 Hz?

Why 120 Hz and not 1 kHz or 67 Hz?

What are you aiming to achieve with this? E.g., is this a latency issue? Are you filtering the data and need a certain number of data points to do so? Etc.

What are the precise use cases you're targeting?

Thanks for your input.

lukes3315 commented 8 years ago

We would need:

And if possible:

120 Hz is sufficient granularity for our application. 67 Hz is not granular enough, and iOS and Android will not let you poll at a rate as high as 1 kHz. That said, a 1 kHz sample rate wouldn't be an issue for us, as long as the rate is adjustable.

We are working on an indoor location technology based on inertial navigation. The current issue is the inconsistency of the data points; for example, the timing between samples seems to be inconsistent and inaccurate.

Check us out: www.navisens.com

tobie commented 8 years ago

Thanks for this added information.

tobie commented 7 years ago

@lukes3315

I have a couple of options for this; please LMK which are acceptable.

  1. polling at 120 Hz, getting roughly two data points together each time requestAnimationFrame is called?
  2. polling at 120 Hz, getting only the latest data point when requestAnimationFrame is called (this is good for latency, not so much if you want all data points, e.g. for filtering)?
  3. polling at 120 Hz, receiving updates at 120 Hz via event observers?
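
[Ed: a minimal sketch of what option 2 could look like with the Generic Sensor API's Accelerometer. processSample is a placeholder, and whether the platform actually honors frequency: 120 is implementation-dependent.]

```js
// Option 2 sketch: poll at 120 Hz, but consume only the latest reading
// once per animation frame; samples produced between frames are dropped.
const sensor = new Accelerometer({ frequency: 120 });
sensor.start();

function onFrame() {
  if (sensor.activated) {
    // processSample is a hypothetical consumer, not a real API.
    processSample(sensor.timestamp, sensor.x, sensor.y, sensor.z);
  }
  requestAnimationFrame(onFrame);
}
requestAnimationFrame(onFrame);
```
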
tobie commented 7 years ago

Expecting feedback from Navisens shortly, see: https://twitter.com/navisens/status/824627997797801985.

lukes3315 commented 7 years ago

Hi Tobie,

We think there should be two acquisition methods supported:

  1. polling, where you request 1 to N samples which are then received on each polling loop;
  2. passing a callback function that receives 1 to N samples (avoiding a poll for each individual sample).

Having the option to batch N samples would allow for a tradeoff between low latency and greater efficiency depending on the application.
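
[Ed: neither method exists in the spec today; a hypothetical sketch of what the two acquisition methods might look like. read(), batchSize, onbatch, and sendToServer are all made up for illustration.]

```js
// Method 1: pull-based polling -- request up to N samples per polling loop.
const sensor = new Accelerometer({ frequency: 120 });
sensor.start();
setInterval(() => {
  sendToServer(sensor.read(16)); // hypothetical read(n): drains up to 16 samples
}, 125); // ~15 samples accumulate between polls at 120 Hz

// Method 2: push-based callback -- receive N samples at a time instead.
const batched = new Accelerometer({ frequency: 120, batchSize: 16 }); // batchSize: hypothetical
batched.onbatch = (event) => sendToServer(event.samples); // fires when the batch is full
batched.start();
```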

lukes3315 commented 7 years ago

Please feel free to reach out if you need more information.

tobie commented 7 years ago

Thanks, @lukes3315.

What do you mean by "polling loop"? I'm not sure what you mean by option 2 either. :-/

Maybe code examples or f2f would be helpful?

tobie commented 7 years ago

Had a meeting with Luke (@lukes3315) and Ash (@adonikian) of Navisens. Navisens does indoor positioning without touching the GPS, relying instead on motion sensors in users' mobile phones. It collects readings on the device and sends them to remote servers for analysis, where it notably corrects for drift using machine learning.

Navisens currently ships as native apps. The max sampling rate of the DeviceOrientation API (~60 Hz) isn't enough for their requirements.

They'd ideally need 120 Hz, but could work with 100 Hz.

The sampling intervals need to be consistent. And it's critical that no samples are lost. [Ed: I'd love to understand more about why that's important.]

Latency, on the other hand, is less of an issue, as the readings are sent to a server for analysis anyway.

The kind of API they have in mind would allow them to set the size of the sample batch they want to work with and get an event every time the batch is full, so they can send the data over to their servers for analysis.

[Ed: it would be useful to get a better understanding of the kind of analysis that's being done on the readings, to justify the high sampling-rate requirements and the necessity of not dropping samples, and to have an idea of the batch sizes.]

tobie commented 7 years ago

Hearing similar use cases from @padenot:

for example, a gyroscope at 120 Hz sending control information via WebSocket to a Web Audio API program, controlling various parameters of a synthesizer

@padenot: would be amazingly helpful if you could provide requirements around:

  1. latency (from time the data point is collected to time you get to use it),
  2. frequency (how many samples per second you need), and
  3. whether it's OK to batch the samples together (and if so, how often you need these batches sent).

padenot commented 7 years ago

latency (from time the data point is collected to time you get to use it),

This is very dependent on your application. It is generally considered that a normal human (i.e. not a professional drummer, for example) perceives a sound as happening "immediately" if the delay between the action that makes your sensor react (be it a gyroscope, a fader, a knob, a keyboard press, etc.) and the resulting sound is less than 20 ms.

Now, you should experiment. Depending on the interaction, the type of sound, the parameter being modulated, etc., you may well find that 100 ms is acceptable. Keep in mind that the output latency of a normal computer using the Web Audio API is between 10 ms and 40 ms. It's possible to bring this down with special hardware and browser builds.

frequency (how many samples per second you need), and

For controlling parameters, it depends on the use case. Any rate is manageable with the Web Audio API, since you're simply making JS calls. You can schedule a bunch of things in advance, react just in time, etc.

whether it's OK to batch the samples together (and if so, how often you need these batches sent).

This is OK, iff you can take the latency hit caused by packetizing the data.
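
[Ed: a sketch of the kind of setup described above, using the Generic Sensor API's Gyroscope to modulate a Web Audio parameter directly. The WebSocket hop is omitted, and the mapping constants are illustrative.]

```js
// Gyroscope angular velocity modulating a synth's filter cutoff.
const audioCtx = new AudioContext();
const osc = new OscillatorNode(audioCtx, { type: 'sawtooth' });
const filter = new BiquadFilterNode(audioCtx, { type: 'lowpass' });
osc.connect(filter).connect(audioCtx.destination);
osc.start();

const gyro = new Gyroscope({ frequency: 120 }); // 120 Hz, if the UA allows it
gyro.onreading = () => {
  // Map |angular velocity| around x to the cutoff; smooth the parameter
  // change to avoid zipper noise.
  const cutoff = 500 + Math.abs(gyro.x) * 1000;
  filter.frequency.setTargetAtTime(cutoff, audioCtx.currentTime, 0.01);
};
gyro.start();
```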

tobie commented 7 years ago

@padenot thanks for your comments.

(For the fun story, I was a professional drummer in a previous life, dabbled with controlling early versions of Live and NI software with a drumKAT, and experienced sub-20 ms issues first hand when studios moved from analog to digital recording.)

To give you a bit more context: there are security concerns about providing frequencies above 60 Hz, and even if we managed to work around those, there are perf concerns around providing data samples faster than the animation frame rate.

If this is a concern for some use cases, I'd like to hear about it, to see if those are use cases we want to fight for or knowingly exclude.

padenot commented 7 years ago

Not for music in the normal sense, but critical for VR and related technologies. You can't really do anything competitive in VR with 60 Hz data.

tobie commented 7 years ago

@padenot OK, cool. (I'm aware of the VR requirements: they generally want higher frequency to reduce latency, but don't really care about more than one data point per animation frame. When they do, they're fine getting them batched.)

tobie commented 7 years ago

From the feedback gathered so far here, it seems a reporting frequency whose max is tied to the animation frame rate is sufficient.

On the other hand, a faster polling frequency seems necessary to either:

  1. gather more data points (indoor navigation use case), or
  2. decrease latency (VR use case).

There are privacy concerns with sampling above 60 Hz, though my understanding is that research by @maryammjd seems to indicate this is an issue at much lower frequencies already. (@maryammjd, can you please confirm, and point to relevant papers?)

When the polling frequency is higher than the reporting frequency and we don't want to lose samples (so not the VR case), we need to buffer samples. How best to achieve this needs discussion. BYOB? A dedicated API? Etc.
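
[Ed: one possible shape for such buffering, purely hypothetical: the UA polls at 120 Hz and buffers readings internally, and script drains the buffer once per animation frame, MutationObserver-style, so no samples are lost even though reporting is capped at the frame rate.]

```js
// Hypothetical "dedicated API" sketch -- takeRecords() is not specced,
// and handleBatch is a placeholder consumer.
const sensor = new Accelerometer({ frequency: 120 });
sensor.start();

function drain() {
  const batch = sensor.takeRecords(); // all readings since the last drain
  if (batch.length) handleBatch(batch);
  requestAnimationFrame(drain);
}
requestAnimationFrame(drain);
```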

adonikian commented 7 years ago

@tobie my response to:

The sampling intervals need to be consistent. And it's critical that no samples are lost. [Ed: I'd love to understand more about why that's important.]

This is very important, because most applications using inertial sensors use incremental estimation, where the current state (e.g. position, velocity, attitude) is composed of a set of successive relative state estimates. For inertial sensors, this depends not just on the sensor readings but also on correct timing information, since the readings are often integrated. Thus, if a sample is lost, it is generally unrecoverable (other than by trying to compensate with global information such as the magnetometer).
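
[Ed: a toy 1-D illustration of the point above (not Navisens's actual algorithm): dead reckoning is a running integral, so a dropped sample leaves a permanent, growing error.]

```js
// Velocity and position are running integrals over (acceleration, dt) pairs.
let v = 0, p = 0, lastT = null;

function onSample(t, a) { // t: seconds, a: linear acceleration (1-D)
  if (lastT !== null) {
    const dt = t - lastT;
    v += a * dt; // integrate acceleration into velocity
    p += v * dt; // integrate velocity into position
    // If a sample is dropped, dt doubles and the missed acceleration is
    // never accounted for: the velocity error persists and keeps growing
    // in p. Only an absolute reference (e.g. the magnetometer) can fix it.
  }
  lastT = t;
}
```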

maryammjd commented 7 years ago

Hi all, may I please ask you to clarify what the exact problem is? In terms of sampling rate, we have never experienced 1 Hz with motion and orientation (we have been testing different browsers for years). Is this browser-related or network-related (e.g. connection problems)?

From the security point of view, as long as the right permission is given to the eligible code, we don't mind the sampling rate. In fact, we believe increasing the sampling rate, both in native apps and in JS, is inevitable.

tobie commented 7 years ago

@maryammjd, thanks for your reply. A couple of comments:

Hi all, may I please ask you to clarify what the exact problem is?

Security and privacy concerns of using sensors at frequencies above 60 Hz.

In terms of sampling rate, we have never experienced 1 Hz with motion and orientation (we have been testing different browsers for years). Is this browser-related or network-related (e.g. connection problems)?

Not sure what you mean. I don't see any reference to 1 Hz in the whole thread. Did you misread 1 kHz way above?

From the security point of view, as long as the right permission is given to the eligible code, we don't mind the sampling rate. In fact, we believe increasing the sampling rate, both in native apps and in JS, is inevitable.

Vendor security teams do care, however. I'm interested in determining whether they are right to care. That is, does increasing the frequency create new attack scenarios (e.g. key logging of a nearby keyboard, PIN logging on touch phones, crude voice recording, etc.)? And if so, at what frequency do they appear? What mitigation strategies exist beyond limiting the frequency (e.g. making sure the sensor doesn't report readings while the user is entering their PIN)?

maryammjd commented 7 years ago

The very first comment of this thread says: "Browsers only sample sensor data at a varying 67 Hz (sometimes dropping to 1 Hz)".

Research-wise, sensors at any frequency (even 20 Hz) are able to reveal something about the user's activity (touch actions, PINs, etc.) or their surrounding environment. We can't say whether 5 Hz or 60 Hz or 200 Hz is safe or not. However, the higher the frequency, the better the attack results; the lower the frequency, the more expensive the machine-learning techniques. The attackers will find their way through anyway. That is why we advise design-level security models rather than implementation-level limitations, i.e. playing with the sampling rates.

tobie commented 7 years ago

The very first comment of this thread says: "Browsers only sample sensor data at a varying 67 Hz (sometimes dropping to 1 Hz)".

Indeed. Sorry. :)

Research-wise, sensors at any frequency (even 20 Hz) are able to reveal something about the user's activity (touch actions, PINs, etc.) or their surrounding environment. We can't say whether 5 Hz or 60 Hz or 200 Hz is safe or not. However, the higher the frequency, the better the attack results; the lower the frequency, the more expensive the machine-learning techniques. The attackers will find their way through anyway. That is why we advise design-level security models rather than implementation-level limitations, i.e. playing with the sampling rates.

Do you have pointers to papers that discuss this? That would be super helpful!

maryammjd commented 7 years ago

TouchSignatures: Identification of User Touch Actions and PINs Based on Mobile Sensor Data via JavaScript

Results are presented for Touch Actions on iPhone (20 Hz), and PIN digits on iPhone (20 Hz) and Android (<60 Hz)

tobie commented 7 years ago

Thanks. Are you aware of other papers that show similar patterns for other privacy/security concerns?

maryammjd commented 7 years ago

Systematic Classification of Side-Channel Attacks: A Case Study for Mobile Devices

A very good paper on all the research in this area.

tobie commented 7 years ago

Systematic Classification of Side-Channel Attacks: A Case Study for Mobile Devices A very good paper on all the research in this area.

Thanks, @maryammjd, that's useful background info indeed, but it doesn't really address in more detail the question of increased risk with increased sensor frequency.

maryammjd commented 7 years ago

As far as I am aware, that question has not been explored specifically. The idea of finding a 'safe zone' for sensor sampling rates that provides both functionality/usability and security is a research topic of its own.

tobie commented 7 years ago

The idea of finding a 'safe zone' for sensor sampling rates that provides both functionality/usability and security is a research topic of its own.

Precisely!

lknik commented 7 years ago

I agree that it's difficult to decide on "optimal" frequencies. First you have security/privacy and leak risks. Then you have usability. Even certain data at 0.1 Hz can still provide sensitive information. Of course, the higher the frequency, the more sophisticated the models that can be devised. I don't think it's possible to find a "safe zone".

It all depends on the risk/attack type and what type of "information" is to be inferred/extracted. So some attacks will work better at 120, 60, 20 Hz, etc. In some cases, even if you have only a few readout values available, you can "connect the dots" and infer useful information. Obviously, the more dots, the more reliable the attacks. Each sensor has its own issues, and it's unlikely we can find a catch-all solution.

Let's take a possible risk of inferring keystrokes from this paper. See Figure 5: the accuracy stabilizes at some point (but that's a model-dependent feature!).

"We observe that as the sampling rate increases to 100Hz, we obtain significantly higher individual key accuracies. This find- ing informs a potential resolution to the security concerns raised by our findings: by enforcing a low sampling rate on the accelerom-eter for untrusted applications or background applications, one can substantially mitigate the predictive accuracy of keystroke inference models."

In this particular risk scenario using this particular model, limiting frequency could help.

I think we've already mentioned the possibility of asking for permission for high frequencies in another thread. Not sure how feasible it would be technically to ask a user for permission to use a particular frequency?

tobie commented 7 years ago

I agree that it's difficult to decide on "optimal" frequencies. First you have security/privacy and leak risks. Then you have usability. Even certain data at 0.1 Hz can still provide sensitive information. Of course, the higher the frequency, the more sophisticated the models that can be devised. I don't think it's possible to find a "safe zone".

Plus, much like with crypto, I imagine such a safe zone would continuously shift.

It all depends on the risk/attack type and what type of "information" is to be inferred/extracted. So some attacks will work better at 120, 60, 20 Hz, etc. In some cases, even if you have only a few readout values available, you can "connect the dots" and infer useful information. Obviously, the more dots, the more reliable the attacks. Each sensor has its own issues, and it's unlikely we can find a catch-all solution.

Fair point.

I think we've already mentioned the possibility of asking for permission for high frequencies in another thread. Not sure how feasible it would be technically to ask a user for permission to use a particular frequency?

It's technically bordering on the trivial. From a usability point of view, however, it's disastrous. How would you expect a user who has probably never heard of a gyroscope, and might not know what hertz are, to decide at what frequency a sensor should be polled?

tobie commented 7 years ago

hey @tobie - look at how I've updated my previous comment :)

Arg. Don't do that. :D

lknik commented 7 years ago

It's technically bordering on the trivial. From a usability point of view, however, it's disastrous. How would you expect a user who has probably never heard of a gyroscope, and might not know what hertz are, to decide at what frequency a sensor should be polled?

We can't expect any user to know what a sensor is, or even to care about it. Maybe decide on "Do you allow this site to access your movement with [very high|high|basic] precision?", but still... not so clear. We definitely can't ask "do you allow this site to access the turboencabulator at 207 kHz"...
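
[Ed: what such a coarse-grained prompt could back onto, purely hypothetically: a precision tier in the permission descriptor rather than a raw frequency. Only the basic permissions.query() call and the 'gyroscope' permission name exist; the precision member is made up.]

```js
// Hypothetical: request sensor access at a coarse precision tier instead of
// exposing raw frequencies to users. 'precision' is NOT part of any spec.
const status = await navigator.permissions.query({
  name: 'gyroscope',
  precision: 'high', // hypothetical: 'basic' | 'high' | 'very-high'
});
if (status.state === 'granted') {
  new Gyroscope({ frequency: 120 }).start();
}
```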

tobie commented 7 years ago

Let's take a possible risk of inferring keystrokes from this paper. See Figure 5: the accuracy stabilizes at some point (but that's a model-dependent feature!).

"We observe that as the sampling rate increases to 100Hz, we obtain significantly higher individual key accuracies. This find- ing informs a potential resolution to the security concerns raised by our findings: by enforcing a low sampling rate on the accelerometer for untrusted applications or background applications, one can substantially mitigate the predictive accuracy of keystroke inference models."

In this particular risk scenario using this particular model, limiting frequency could help.

Thanks for the link to the paper. I remember reading it a while back. I'm super happy to now be able to link to it. This is exactly the kind of reference I was looking for.

The difference in accuracy between 60 Hz and 200 Hz is minute, which is precisely what I wanted to demonstrate. It makes enabling use cases such as this one a no-brainer imho.

lknik commented 7 years ago

Just be careful with extrapolating Figure 5 to all the sensors out there, or even to a particular one: that would be misleading (as I wrote). It also applies to a particular risk scenario and a particular model. It's in no way generic, though the general shape might be similar in some respects: accuracy improves, then gradually reaches a point where more data no longer makes much difference. Still, concluding "OK, so 10 Hz is fine" based on Figure 5 would not be quite proper, in general.

tobie commented 7 years ago

Just be careful with extrapolating Figure 5 to all the sensors out there, or even to a particular one: that would be misleading (as I wrote). It also applies to a particular risk scenario and a particular model. It's in no way generic, though the general shape might be similar in some respects: accuracy improves, then gradually reaches a point where more data no longer makes much difference. Still, concluding "OK, so 10 Hz is fine" based on Figure 5 would not be quite proper, in general.

100% with you. The current position of implementers is that there's a cut-off point at 60 Hz beyond which they're not willing to go, for security reasons.

Both your data and the data provided by @maryammjd above indicate not only that this cut-off point is arbitrary, but also that it's too high to protect against the threats for which we have data.

Thus we should get rid of it.

tobie commented 7 years ago

Chrome bug on the topic with references to two other papers: https://bugs.chromium.org/p/chromium/issues/detail?id=421691

maryammjd commented 7 years ago

As far as I know, Chrome is aware of this issue. As you can see, the bug was reported in 2014.

tobie commented 7 years ago

As far as I know, Chrome is aware of this issue. As you can see, the bug was reported in 2014.

Yes they are indeed. :)

timvolodine commented 7 years ago

To chime in here, I remember two issues regarding increased frequency:

IIRC, in crbug.com/421691 one of the papers talks about audio attacks, and increasing the sampling rate naturally increases the risk and accuracy of such attacks. As was already mentioned, though, it's not exactly clear what a reasonable threshold would be, nor what the actual impact of a marginal increase is.

To reduce potential risks we could restrict the usage, e.g. to the top-level main frame only, and/or require a user gesture. Additionally, we could make the usage opt-in, e.g.