ooni / probe

OONI Probe network measurement tool for detecting internet censorship
https://ooni.org/install
BSD 3-Clause "New" or "Revised" License

Add support for outage detection #894

Open hellais opened 4 years ago

hellais commented 4 years ago

It was brought up on the #ooni channel by @carrotcypher that we may want to have some sort of very minimal probe which does some low throughput operation over time and sends a signal in case it's unable to do it.
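A minimal sketch of such a probe, in Python, might look like the following. Everything here is an assumption for illustration: the names `fetch_small_document` and `LowThroughputProbe`, the threshold of three consecutive failures, and the idea of passing the "signal" as a callback are not taken from OONI Probe.

```python
import http.client
import urllib.error
import urllib.request
from typing import Callable


def fetch_small_document(url: str, timeout: float = 10.0) -> bool:
    """Return True if the network let us reach `url` (hypothetical endpoint)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read(64)  # read a few bytes only; this is a liveness check
        return True
    except urllib.error.HTTPError:
        return True  # the server answered, so the network path works
    except (OSError, http.client.HTTPException):
        return False  # DNS, TCP, TLS or timeout failure


class LowThroughputProbe:
    """Runs one cheap check per tick and signals on a run of failures."""

    def __init__(self, check: Callable[[], bool],
                 on_signal: Callable[[int], None],
                 failures_before_signal: int = 3) -> None:
        self.check = check
        self.on_signal = on_signal
        self.failures_before_signal = failures_before_signal
        self.consecutive_failures = 0

    def tick(self) -> bool:
        """Run one check; return True if this tick raised the signal."""
        if self.check():
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        # Signal once, exactly when the (assumed) threshold is crossed.
        if self.consecutive_failures == self.failures_before_signal:
            self.on_signal(self.consecutive_failures)
            return True
        return False


# Deterministic demo with a scripted check instead of real network traffic.
scripted = iter([True, False, False, False, False, True])
signals = []
probe = LowThroughputProbe(lambda: next(scripted), signals.append)
results = [probe.tick() for _ in range(6)]
print(results)  # [False, False, False, True, False, False]
print(signals)  # [3]
```

In a real deployment `check` could be something like `lambda: fetch_small_document("https://example.org/")`, with `tick()` driven by whatever scheduler the platform offers.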

This is similar to what we were discussing in "Investigating Internet Blackouts from the Edge of the Network", where we said:

Outage detection
As a first step we need to have some form of heuristic that allows us to understand that a particular device is experiencing some form of network outage. This can be used as an indicator to then trigger more fine-grained and in-depth measurements.
Since this needs to be done with a fairly high frequency, it’s crucial that what we do to detect an outage consumes minimal amounts of network bandwidth and that we reserve the most bandwidth-intensive measurements for the follow-up stage.
Each attempt to fetch some minimal document from an HTTPS server is ~6KiB of data sent over the wire: DNS for A and AAAA, TLS handshake and teardown. That’s ~17MiB a month if the test is done every ~15 minutes. The value of 15min comes from the minimal inexact interval supported by Android’s AlarmManager.
Failures should trigger follow-up measurements to ensure that it’s something that looks like a blackout and not just a temporary OONI service failure or blockage, last-mile failure (heavy wifi interference, a broken LAN switch, CPE failure), ISP subscription termination (e.g. quota depletion) or network glitch.
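The ~17MiB figure in the quoted passage can be checked with a few lines of Python. The function name and the 30-day month are assumptions made for this check; the 6KiB per probe and the 15-minute interval come from the quote.

```python
KIB = 1024


def monthly_probe_traffic_mib(bytes_per_probe: int = 6 * KIB,
                              interval_minutes: int = 15,
                              days: int = 30) -> float:
    """Traffic in MiB for one probe every `interval_minutes` over `days` days."""
    probes = days * 24 * 60 // interval_minutes  # 2880 probes in 30 days
    return probes * bytes_per_probe / (KIB * KIB)


print(monthly_probe_traffic_mib())  # 16.875
```

That is 2880 probes at 6KiB each, or just under 17MiB, which matches the estimate above.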

It is quite challenging to do this on mobile, as there are battery consumption constraints to take into account. However, with the new desktop app, maybe we can do something useful there.

carrotcypher commented 4 years ago

The idea I have is something I'll be referring to as a signal canary, where rather than intermittent pings (which act more like an echo beacon), it would mimic a warrant canary — where a change in state is considered to be information itself. This means rather than incremental connections to test connectivity, the signal canary would maintain a constant connection to other signal canaries over a p2p network.

In this model, we can imagine canaries in each country, run by individuals and institutions, all connecting to each other to establish themselves as valid canaries. Once a significant number of established signal canaries in a specific country disconnect in a time frame and pattern consistent with network outages and blocks, the rest of the signal canaries would sound an alarm, optionally then publishing that alarm to OONI somehow for further investigation. It is important to note that this would produce some false positives, since a disconnection does not automatically signify blocks, outages, or censorship. However, because the signal is an existing, established connection dropping rather than a new connection failing to be established, the signal-to-noise ratio would be much higher in this model.
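To make the alarm condition concrete, here is a rough Python sketch of one way the remaining canaries could decide when to sound it. The class name, the 5-minute sliding window, the 50% threshold and the minimum of four canaries per country are all assumptions chosen for illustration; the proposal itself does not fix any of these values.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Set, Tuple


class CountryDisconnectMonitor:
    """Alarms when enough established canaries in a country drop in a window."""

    def __init__(self, window_seconds: float = 300.0,
                 alarm_fraction: float = 0.5, min_canaries: int = 4) -> None:
        self.window_seconds = window_seconds
        self.alarm_fraction = alarm_fraction
        self.min_canaries = min_canaries
        self.connected: Dict[str, Set[str]] = defaultdict(set)
        self.recent_drops: Dict[str, Deque[Tuple[float, str]]] = defaultdict(deque)

    def on_connect(self, canary_id: str, country: str) -> None:
        self.connected[country].add(canary_id)
        # A canary that came back no longer counts as dropped.
        drops = self.recent_drops[country]
        self.recent_drops[country] = deque(d for d in drops if d[1] != canary_id)

    def on_disconnect(self, canary_id: str, country: str, now: float) -> bool:
        """Record a dropped connection; return True if the alarm should sound.

        Assumes `now` never decreases between calls.
        """
        if canary_id not in self.connected[country]:
            return False  # never established, so the drop carries no signal
        self.connected[country].remove(canary_id)
        drops = self.recent_drops[country]
        drops.append((now, canary_id))
        # Forget drops that fell out of the sliding window.
        while drops and now - drops[0][0] > self.window_seconds:
            drops.popleft()
        population = len(self.connected[country]) + len(drops)
        if population < self.min_canaries:
            return False  # too few canaries to say anything about the country
        return len(drops) / population >= self.alarm_fraction


monitor = CountryDisconnectMonitor()
for cid in ("a", "b", "c", "d"):
    monitor.on_connect(cid, "XX")
print(monitor.on_disconnect("a", "XX", now=0.0))   # False (1 of 4 dropped)
print(monitor.on_disconnect("b", "XX", now=10.0))  # True (2 of 4 within 5 min)
```

A single fraction-over-window rule like this ignores the "pattern" part of the proposal; a real detector would likely need to compare against each country's normal churn.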

As this model functions by requiring solid connections and potentially reading disconnections as data itself, it is not ideal for a mobile platform implementation. Desktop users (running it either as a daemon or a browser plugin), on the other hand, might be ideal. Additionally, while it is designed with scalability through community support in mind, the bootstrap connections could always be the usual institutional supporters in different countries, providing a strong level of actionable data without any additional community participation necessary.

Comments and criticisms to this idea are welcome.

bassosimone commented 4 years ago

I think it’s a very good idea 👍

carrotcypher commented 4 years ago

I'm looking at Noise [manual] [code] right now for a proof-of-concept. It already has all the p2p functionality a signal canary would need and is basically a p2p chatroom (as seen in the GIF below), so each node could use that channel to intermittently broadcast its signed proof-of-freshness and, optionally, its identity too.

[GIF: 1_pnGLLKHJnM8ObccwnrkRDg]

As all participants in the channel would effectively see JOIN/QUIT events, you could simply store (and optionally routinely prune) a db of active connections with their geolocation data (either provided, sourced from a geolocation service, or redundantly both), then have code that triggers events based on specific criteria such as:

I'll try to put together a proof-of-concept and share it here.
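For anyone picking this up, the JOIN/QUIT bookkeeping described above could be sketched in plain Python roughly as follows. This is not based on the Noise library's API, which is not used here at all; `PeerRegistry`, its method names and the staleness timeout are hypothetical, and verifying the signed proof-of-freshness is left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PeerRecord:
    peer_id: str
    country: Optional[str]  # self-reported or from a geolocation lookup
    last_seen: float        # time of JOIN or latest proof-of-freshness


class PeerRegistry:
    """In-memory 'db' of active connections, keyed by peer id."""

    def __init__(self, stale_after: float = 3600.0) -> None:
        self.stale_after = stale_after
        self.peers: Dict[str, PeerRecord] = {}

    def on_join(self, peer_id: str, country: Optional[str], now: float) -> None:
        self.peers[peer_id] = PeerRecord(peer_id, country, now)

    def on_freshness(self, peer_id: str, now: float) -> bool:
        """Update last_seen for a known peer; signature checking is omitted."""
        record = self.peers.get(peer_id)
        if record is None:
            return False
        record.last_seen = now
        return True

    def on_quit(self, peer_id: str) -> Optional[PeerRecord]:
        """Remove the peer and return its record so callers can react to it."""
        return self.peers.pop(peer_id, None)

    def prune(self, now: float) -> List[str]:
        """Drop peers that have not shown freshness within `stale_after`."""
        stale = [pid for pid, rec in self.peers.items()
                 if now - rec.last_seen > self.stale_after]
        for pid in stale:
            del self.peers[pid]
        return stale

    def count_by_country(self) -> Dict[str, int]:
        counts: Dict[str, int] = {}
        for rec in self.peers.values():
            key = rec.country or "unknown"
            counts[key] = counts.get(key, 0) + 1
        return counts


registry = PeerRegistry(stale_after=60.0)
registry.on_join("p1", "XX", now=0.0)
registry.on_join("p2", "XX", now=0.0)
registry.on_join("p3", "YY", now=0.0)
registry.on_freshness("p1", now=50.0)
gone = registry.on_quit("p3")
print(gone.country)                 # YY
print(registry.prune(now=100.0))    # ['p2']
print(registry.count_by_country())  # {'XX': 1}
```

The record returned by `on_quit` carries the peer's country, which is the piece of information any per-country trigger would need.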

carrotcypher commented 4 years ago

Problems encountered going this direction:

Will be taking this middle ground approach of 8 static and 8 cycling for further testing. Comments and criticisms welcome.

carrotcypher commented 4 years ago

As of posting this (and unless otherwise updated below), I've put development of the proof-of-concept on hold for the time being as I don't have the time to contribute to it at the moment. If anyone else would like to take a stab at it, I'd be happy to share notes and discuss what I've already come to understand about the limitations, issues, and functionality ideas.