ooni / probe-engine

Semi-automatic export of https://github.com/ooni/probe-cli internals
https://ooni.org
GNU General Public License v3.0

Improve basic on-device data analysis #87

Closed: bassosimone closed this issue 4 years ago

bassosimone commented 4 years ago

I am going to mark this issue as complete. The https://github.com/ooni/netx library is the place where I have implemented the following functionality:

  1. extraction of low-level events that are "rooted" in a specific HTTP request, a dial for a cleartext or TLS connection, or a system/UDP/TCP/DoH/DoT domain name resolution

  2. the above has been implemented by mocking several Go standard library interfaces and standard HTTP-observing callbacks to obtain as many events as possible without adding artificial restrictions (e.g. not running concurrent measurements)

  3. easy binding of such events when post-processing data, because there are a bunch of consistent IDs that allow joining data collected at the DNS level with data collected at the network level and with data collected at the HTTP level

  4. easy attribution of errors to the major operation that failed (DNS resolution, TCP connect, TLS handshake, HTTP protocol interaction) by returning an implementation of the error interface that contains this information (this is less easy than it seems; when a DoH resolve fails because of a TLS handshake error, you want the failed operation to be the TLS handshake, not the parent DNS lookup)

  5. functionality called "scoreboard" to record odd stuff that we've noticed and we want to investigate later

  6. implementation of an automatic SNI blocking follow-up mini-experiment based on the "scoreboard", which has been discussed with @fortuna as part of a broader scope design document

  7. automatic detection of "bogon" replies (e.g. 10.4.4.17) and fallback to preconfigured DoT or DoH resolvers in such a case, thus also performing an opportunistic DoT/DoH mini-experiment
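
For illustration, here is a minimal sketch of the idea behind point 4: attributing a failure to the innermost operation that failed. This is not the netx API. The `OpError` type, the `Wrap` and `FailedOperation` helpers, the operation names, and `ErrReset` are all hypothetical names made up for this example. The only assumption is that each layer wraps the errors it returns and declines to re-label an error that an inner layer has already attributed.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical operation labels for this sketch (not netx constants).
const (
	OpResolve       = "resolve"
	OpConnect       = "connect"
	OpTLSHandshake  = "tls_handshake"
	OpHTTPRoundTrip = "http_round_trip"
)

// ErrReset is a made-up low-level failure used by the example.
var ErrReset = errors.New("connection reset by peer")

// OpError is a hypothetical error wrapper recording which operation failed.
type OpError struct {
	Operation string
	Err       error
}

func (e *OpError) Error() string { return e.Operation + ": " + e.Err.Error() }

// Unwrap lets errors.Is and errors.As see the underlying error.
func (e *OpError) Unwrap() error { return e.Err }

// Wrap attributes err to op, unless an inner layer already attributed it.
// Keeping the innermost label is what makes a DoH lookup that failed during
// the TLS handshake report "tls_handshake" rather than "resolve".
func Wrap(op string, err error) error {
	if err == nil {
		return nil
	}
	var inner *OpError
	if errors.As(err, &inner) {
		return err // already attributed by a lower layer
	}
	return &OpError{Operation: op, Err: err}
}

// FailedOperation returns the recorded operation, or "" if there is none.
func FailedOperation(err error) string {
	var oe *OpError
	if errors.As(err, &oe) {
		return oe.Operation
	}
	return ""
}

func main() {
	// Simulate a DoH lookup whose underlying TLS handshake failed.
	tlsErr := Wrap(OpTLSHandshake, ErrReset)
	dohErr := Wrap(OpResolve, tlsErr)
	fmt.Println(FailedOperation(dohErr)) // prints "tls_handshake"
}
```

With this shape, post-processing code only needs to call `FailedOperation` on the final error to learn which step failed, regardless of how many layers the error crossed.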

To implement all of this, I needed to deeply refactor the library. I have tagged this issue in a bunch of refactoring PRs that are mostly relevant to it; however, I'd say that all commits from https://github.com/ooni/netx/commit/556048a6d9062ff3ab84c410b4bdef19095b2360 to https://github.com/ooni/netx/commit/aa7ac6327b11545103def2a4b38807dfa8c18c32 have been guided by the objective of implementing all the above.

On this note, I also wrote a tool, https://github.com/ooni/jafar, to simulate several kinds of censorship and of unreliable networks (for https://github.com/ooni/probe-engine/issues/88), so that I could test whether the code I was writing was correctly measuring specific scenarios and was able to correctly bypass SNI blocking and react to returned bogon addresses.
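
As a rough sketch of the "react to returned bogon addresses" behavior (point 7 above), the check and the fallback could look like the following. This is not the netx implementation: `IsBogon`, `Resolver`, and `LookupWithFallback` are hypothetical names, the CIDR list is deliberately partial, and the resolvers in `main` are stubs standing in for the system resolver and a preconfigured DoT/DoH resolver.

```go
package main

import (
	"fmt"
	"net"
)

// bogonCIDRs is a partial, illustrative list of non-routable ranges.
var bogonCIDRs = []string{
	"0.0.0.0/8", "10.0.0.0/8", "100.64.0.0/10", "127.0.0.0/8",
	"169.254.0.0/16", "172.16.0.0/12", "192.168.0.0/16",
	"::1/128", "fc00::/7", "fe80::/10",
}

var bogonNets = mustParse(bogonCIDRs)

// mustParse converts CIDR strings to networks, panicking on a bad literal.
func mustParse(cidrs []string) []*net.IPNet {
	out := make([]*net.IPNet, 0, len(cidrs))
	for _, c := range cidrs {
		_, n, err := net.ParseCIDR(c)
		if err != nil {
			panic(err)
		}
		out = append(out, n)
	}
	return out
}

// IsBogon reports whether addr falls in one of the listed ranges.
// Unparsable input is treated as suspicious and reported as a bogon.
func IsBogon(addr string) bool {
	ip := net.ParseIP(addr)
	if ip == nil {
		return true
	}
	for _, n := range bogonNets {
		if n.Contains(ip) {
			return true
		}
	}
	return false
}

// Resolver is a hypothetical lookup function (system, DoT, DoH, ...).
type Resolver func(host string) ([]string, error)

// LookupWithFallback retries with the fallback resolver when the primary
// resolver returns at least one bogon address.
func LookupWithFallback(host string, primary, fallback Resolver) ([]string, bool, error) {
	addrs, err := primary(host)
	if err != nil {
		return nil, false, err
	}
	for _, a := range addrs {
		if IsBogon(a) {
			addrs, err = fallback(host)
			return addrs, true, err
		}
	}
	return addrs, false, nil
}

func main() {
	// Stub resolvers: the first simulates an injected bogon reply.
	censored := func(string) ([]string, error) { return []string{"10.4.4.17"}, nil }
	trusted := func(string) ([]string, error) { return []string{"8.8.8.8"}, nil }
	addrs, usedFallback, err := LookupWithFallback("example.org", censored, trusted)
	fmt.Println(addrs, usedFallback, err) // prints "[8.8.8.8] true <nil>"
}
```

Returning the `usedFallback` flag alongside the addresses is one way to record that the opportunistic DoT/DoH lookup happened, so the two answers can be compared later.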