ooni / backend

Everything related to OONI backend infrastructure: ooni/api, ooni/pipeline, ooni/sysadmin, collector, bouncers and test-helpers
BSD 3-Clause "New" or "Revised" License

Implement test_coverage and country_overview in the new pattern #835

Open hellais opened 5 months ago

hellais commented 5 months ago

While we are at it: we only ever need to fetch data one year at a time, so we should be able to add a constraint enforcing that.

hellais commented 5 months ago

So I was looking into this and thinking about this in the context of also improving how the lookup works in the network and domain pages.

From the looks of it, we basically have 4 bits of information that we need for a particular country, network or domain:

  1. The overall measurement_count
  2. The date of the first measurement
  3. The date of the last measurement
  4. For each calendar year, we need a timeseries of count of measurements per day

The first 3 items only need to be loaded once, on first render, while for item 4 we have to make a new fetch every time the user clicks on a different calendar year.

At the moment the network pages do this in a suboptimal way: on first render we load the whole aggregation result for the 12 years preceding the current date: https://github.com/ooni/explorer/blob/master/pages/as/%5Bprobe_asn%5D.js#L221.

This is problematic for 2 reasons:

  1. Once we reach December 2024, this will no longer be accurate, since we would be missing the first OONI measurements based on this filter: https://github.com/ooni/explorer/blob/master/pages/as/%5Bprobe_asn%5D.js#L226
  2. We are delaying the first render by requesting all the calendar datapoints even for years which we have no plan on rendering

My proposal is therefore that we discontinue the existing test_coverage and country_overview endpoints and implement 1 new endpoint that returns:

  1. The overall measurement_count
  2. The date of the first measurement
  3. The date of the last measurement

We can call it measurement_overview. It would take probe_cc, probe_asn, test_name and domain as parameters and return total_measurements, first_measurement_date and last_measurement_date.
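
To make the proposed contract concrete, here is a minimal sketch of what measurement_overview could compute. It is not the backend implementation: the `DailyCount` row shape and the in-memory filtering are assumptions for illustration only, and a real endpoint would presumably run an equivalent aggregate query against the database. Only the parameter and return field names come from the proposal above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class DailyCount:
    # Hypothetical row shape: one pre-aggregated row per day and filter combination.
    day: date
    probe_cc: str
    probe_asn: int
    test_name: str
    domain: Optional[str]
    measurement_count: int


@dataclass(frozen=True)
class MeasurementOverview:
    # Field names follow the proposal in this issue.
    total_measurements: int
    first_measurement_date: Optional[date]
    last_measurement_date: Optional[date]


def measurement_overview(
    rows: Iterable[DailyCount],
    probe_cc: Optional[str] = None,
    probe_asn: Optional[int] = None,
    test_name: Optional[str] = None,
    domain: Optional[str] = None,
) -> MeasurementOverview:
    """Return total count plus first and last measurement dates for the given filters."""
    filters = {
        "probe_cc": probe_cc,
        "probe_asn": probe_asn,
        "test_name": test_name,
        "domain": domain,
    }
    # Only filters that were actually supplied constrain the result.
    wanted = {key: value for key, value in filters.items() if value is not None}

    total = 0
    first: Optional[date] = None
    last: Optional[date] = None
    for row in rows:
        if any(getattr(row, key) != value for key, value in wanted.items()):
            continue
        total += row.measurement_count
        if first is None or row.day < first:
            first = row.day
        if last is None or row.day > last:
            last = row.day
    return MeasurementOverview(total, first, last)
```

With no matching rows the sketch returns a count of 0 and `None` for both dates, which the frontend could use to skip rendering the calendar entirely.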

For the counts per year, we can continue using the aggregation endpoint, since it already supports fetching data one year at a time.
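
As a rough sketch of the per-year fetch, the helper below builds an aggregation request for a single calendar year. The base URL, the `since`, `until` and `axis_x` parameter names, and the assumption that `until` is exclusive reflect my understanding of the public aggregation API and should be checked against the current API documentation; `year_aggregation_url` itself is a hypothetical helper.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode

# Assumed base URL of the aggregation endpoint.
AGGREGATION_URL = "https://api.ooni.io/api/v1/aggregation"


def year_aggregation_url(
    year: int,
    probe_cc: Optional[str] = None,
    probe_asn: Optional[str] = None,
    test_name: Optional[str] = None,
    domain: Optional[str] = None,
) -> str:
    """Build a URL requesting per-day measurement counts for one calendar year."""
    params = {
        # Assumes `until` is exclusive, so the range covers exactly one year.
        "since": date(year, 1, 1).isoformat(),
        "until": date(year + 1, 1, 1).isoformat(),
        # Assumed axis name for a one-row-per-day timeseries.
        "axis_x": "measurement_start_day",
    }
    optional = {
        "probe_cc": probe_cc,
        "probe_asn": probe_asn,
        "test_name": test_name,
        "domain": domain,
    }
    # Only include the filters that were supplied.
    params.update({key: value for key, value in optional.items() if value is not None})
    return f"{AGGREGATION_URL}?{urlencode(params)}"
```

The frontend would call this once for the initially selected year and again each time the user switches years, so the first render never waits on years that are not displayed.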

@majakomel how does this sound?