m-lab / etl-gardener

Gardener provides services for maintaining and reprocessing mlab data.
Apache License 2.0
13 stars 5 forks source link

NUTS geographies for annotations of tests originating in the EU #322

Open critzo opened 3 years ago

critzo commented 3 years ago

The NUTS classification is a standard geographic information source developed by the European Commission, representing EU country geographies at three levels:

NUTS 0 is also provided, which is equivalent to a country level geography for EU member countries.

The European Commission publishes shapefiles for NUTS levels at various resolutions, and versioned releases that cover specific years. The two most recent releases of NUTS shapefiles at 1:1 Million resolution have been loaded into the following tables:

2018-01-01 -- 2020-12-31:

2021-01-01 -- :

Joining with M-Lab data

Individual NDT test results may be annotated with metadata from the NUTS geography tables using BigQuery's GIS functions. See truncated query below:


WITH
EU_NUTS_3 AS (
  SELECT * FROM `measurement-lab.geographies.EU_NUTS_3_2016_01m`
),
dl AS (
  SELECT
    date,
    EU_NUTS_3.NUTS_ID AS NUTS_ID,
    EU_NUTS_3.LEVL_CODE AS LEVL_CODE,
    EU_NUTS_3.CNTR_CODE AS CNTR_CODE,
    EU_NUTS_3.NUTS_NAME AS NUTS_NAME,
    NET.SAFE_IP_FROM_STRING(client.IP) AS ip,
    a.MeanThroughputMbps AS mbps
  FROM
    `measurement-lab.ndt.unified_downloads` tests, EU_NUTS_3
  WHERE
    date BETWEEN "2018-01-01" AND "2018-12-31"
    AND client.Geo.country_name IS NOT NULL
    AND client.Geo.country_code IS NOT NULL
    AND client.Geo.country_code != ""
    AND ST_WITHIN(
      ST_GeogPoint(
        client.Geo.longitude,
        client.Geo.latitude
      ), EU_NUTS_3.WKT
    )
),
...

## Suggested NUTS Fields for New Annotations
* NUTS_ID - Nuts1ID, Nuts2ID, Nuts3ID
* NAME_LATN - Nuts1NameLatn, etc.
* NUTS_NAME - Nuts1Name, etc.