ooni / data

OONI Data CLI and Pipeline v5
https://docs.ooni.org/data
Initial plots of blocking type breakdowns #22

Closed hellais closed 1 year ago

hellais commented 1 year ago

The goal of this issue is to start thinking about how we are going to plot the various blocking methods, and also to evaluate the accuracy of the analysis.

hellais commented 1 year ago

This first chart looks at the blocking types, focusing on a single ASN on a given day (in this case 2022-11-01).

While it's very large vertically (though be prepared for the next one being even bigger), I find that by applying some reasonable sorting of the x and y axes, it does turn out to be fairly legible.

At the top of the chart you will find all the domains that are likely to be blocked and towards the end you will find all the ones that are not blocked.

I like how with this kind of visualisation it's possible to visually tell which sites are blocked using the same technique (the marks are aligned vertically) and check for consistency in blocking methods.

As per the ooni/data model, we map the outcome into one of 3 broad classes: "blocked" (the site is not accessible from the measured vantage point and that is not due to the site being down), "down" (the site is not accessible, but it's also not accessible elsewhere, i.e. it's down), and "ok" (nothing to see here, carry on).
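To make the three classes concrete, here is a minimal sketch of the decision described above. The function name and the two boolean inputs are made up for illustration and are not part of ooni/data; the real analysis works from many observations and produces a score rather than a hard yes/no.

```python
def classify_outcome(reachable_from_vantage: bool, reachable_elsewhere: bool) -> str:
    """Map two (hypothetical) reachability signals onto the three broad classes."""
    if reachable_from_vantage:
        # Nothing to see here, carry on.
        return "ok"
    if reachable_elsewhere:
        # Not accessible from the vantage point, but the site itself is up.
        return "blocked"
    # Not accessible from the vantage point and not accessible elsewhere either.
    return "down"
```

For example, `classify_outcome(False, True)` gives "blocked", while `classify_outcome(False, False)` gives "down".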

The "outcome_score" indicates how confident we are about the result, with a score of 1.0 indicating absolute certainty (i.e. confirmed). Note that, as I am currently testing the analysis engine, I don't use the blockpage signature to compute the score, since that would not tell us how good the engine is in the absence of any existing information.

The "metric_count" indicates how many distinct metrics were used to compute the score.

Note: this count is different from the measurement count, since we are breaking down a measurement into many distinct observations that are then re-analysed to produce "experiment_results" (see: https://github.com/ooni/data#architecture-overview for more details on this).
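A toy illustration of why the two counts differ. Everything here is hypothetical: the tuple layout, the metric names, and the assumption that each observation contributes exactly one metric. It is not the ooni/data schema.

```python
# Hypothetical observations as (measurement_id, metric_name) pairs.
# Assumption for this sketch: each observation contributes one metric.
observations = [
    ("m1", "dns"),
    ("m1", "tcp"),
    ("m1", "tls"),
    ("m2", "dns"),
    ("m2", "tcp"),
]


def count_measurements_and_metrics(observations):
    """Return (measurement_count, metric_count) for a list of observations."""
    measurement_count = len({measurement_id for measurement_id, _ in observations})
    metric_count = len(observations)
    return measurement_count, metric_count


measurement_count, metric_count = count_measurements_and_metrics(observations)
```

With this sample, two measurements yield five metrics, so the count in the chart can be much larger than the number of measurements behind it.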

[Chart: Website Blocking Russia AS50716]

hellais commented 1 year ago

This chart explores the blocking of sites across all ASNs.

Again it can be read similarly to the previous one, where you find towards the top sites that are most likely to be blocked, followed by sites that are likely down, followed by sites which are OK.
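The row ordering could be sketched roughly like this. It assumes each row carries a class and an "outcome_score"; the "outcome_category" key, the sample domains, and the tie-breaking rule are illustrative choices, not necessarily what the charts actually use.

```python
# Blocked first, then down, then ok.
CLASS_ORDER = {"blocked": 0, "down": 1, "ok": 2}

# Hypothetical rows, one per domain.
rows = [
    {"domain": "a.example", "outcome_category": "ok", "outcome_score": 0.9},
    {"domain": "b.example", "outcome_category": "blocked", "outcome_score": 0.6},
    {"domain": "c.example", "outcome_category": "down", "outcome_score": 0.8},
    {"domain": "d.example", "outcome_category": "blocked", "outcome_score": 0.95},
]


def sort_rows(rows):
    """Order rows by class, then by descending score, then by domain name."""
    return sorted(
        rows,
        key=lambda r: (
            CLASS_ORDER[r["outcome_category"]],
            -r["outcome_score"],
            r["domain"],
        ),
    )


ordered_domains = [r["domain"] for r in sort_rows(rows)]
```

Sorting by class first keeps the three bands contiguous on the y axis; the score then puts the most confident results at the top of each band.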

Given the wide variety of blocking methods, the chart tends to become quite wide. We might decide to collapse some of these outcome_labels into one class in cases where they are equivalent.
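Collapsing could be as simple as a lookup table applied before plotting. The label names below are invented placeholders, and which labels really count as equivalent is still an open question; this only shows the mechanics.

```python
# Hypothetical outcome_label values mapped to a collapsed class.
# Labels not listed here are left unchanged.
EQUIVALENT_LABELS = {
    "dns.nxdomain": "dns",
    "dns.bogon": "dns",
    "tcp.timeout": "tcp",
    "tcp.connection_reset": "tcp",
}


def collapse_label(outcome_label: str) -> str:
    """Return the collapsed class for a label, or the label itself if unmapped."""
    return EQUIVALENT_LABELS.get(outcome_label, outcome_label)


def collapsed_columns(outcome_labels):
    """Return the distinct collapsed labels, preserving first-seen order."""
    seen = []
    for label in outcome_labels:
        collapsed = collapse_label(label)
        if collapsed not in seen:
            seen.append(collapsed)
    return seen
```

With the placeholder mapping, `collapsed_columns(["dns.nxdomain", "dns.bogon", "tls.timeout"])` reduces three columns to two.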

[Chart: Website Blocking Russia]

hellais commented 1 year ago

Progress on this has been made as part of several research reports. I'm not sure what more is needed to consider this issue done, so I am going to close it in favour of creating more specific ones in the future.

It is also related to the data model we decide to use for the experiment results tables, see: https://github.com/ooni/ooni.org/issues/1282