HTTPArchive / tech-report-apis

APIs for the HTTP Archive Technology Report
Apache License 2.0

Core Web Vitals Report #3

Open maceto opened 1 year ago

maceto commented 1 year ago

Could you describe the origin/source of this data?

[
    {
        date: '2020-01-01',
        technology: 'WordPress',
        vitals: [
            {
                name: 'overall',
                tested: 10,
                desktop: {
                    good_number: 1,
                    good_percentage: 0.1, // Can be calculated in FE too, don't think it's super necessary that the % comes from the api as long as we have the eligible for testing nr to divide it with
                },
                mobile: {
                    good_number: 3,
                    // good_percentage: 0.3,
                },
                across_dataset: {
                    good_number: 3,
                },
            },
            {
                name: 'FID',
                tested: 9,
                desktop: {
                    good_number: 3,
                    // good_percentage: 0.33, 
                },
                mobile: {
                    good_number: 2,
                    // good_percentage: 0.22,
                },
            },
            ...
        ],
    }
]

The goal is to create a script that queries this data from BigQuery (BQ), transforms it, and saves it in Firestore.
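A minimal sketch of that query, transform, save flow might look like the following. `fetchRows` and `saveDocument` are hypothetical stand-ins for a BigQuery query call and a Firestore write, and the document ID scheme is an assumption for illustration; none of these names come from the repository.

```javascript
// Sketch only: fetchRows and saveDocument are hypothetical, injected
// functions standing in for a BigQuery query and a Firestore write.

// Build a deterministic document ID from the fields that identify a row.
// The date_technology scheme is an assumption, not the project's convention.
function documentId(row) {
  return [row.date, row.technology]
    .map(part => encodeURIComponent(String(part)))
    .join('_');
}

// Shape one query row into the document that would be stored.
function toDocument(row) {
  return {
    date: row.date,
    technology: row.technology,
    vitals: row.vitals,
  };
}

// Query, transform, and save each row; returns the number of rows written.
async function syncVitals(fetchRows, saveDocument) {
  const rows = await fetchRows();
  for (const row of rows) {
    await saveDocument(documentId(row), toDocument(row));
  }
  return rows.length;
}
```

Injecting the two service calls keeps the transform step testable without access to either BigQuery or Firestore.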

sarahfossheim commented 1 year ago

Side note about the example structure used: after working with the current JSON files (https://cdn.httparchive.org/reports/cwvtech/ALL/ALL/jQuery.json), I realize the tested/eligible value should be on the client level, so for example:

{
    name: 'overall',
    desktop: {
        good_values: 1,
        tested: 10,
    },
    mobile: {
        good_values: 3,
        tested: 9,
    },
    ...
},

rviscomi commented 1 year ago

Query

CREATE TEMPORARY FUNCTION GET_VITALS(
  records ARRAY<STRUCT<
      client STRING,
      origins_with_good_fid INT64,
      origins_with_good_cls INT64,
      origins_with_good_lcp INT64,
      origins_with_good_fcp INT64,
      origins_with_good_ttfb INT64,
      origins_with_good_inp INT64,
      origins_with_any_fid INT64,
      origins_with_any_cls INT64,
      origins_with_any_lcp INT64,
      origins_with_any_fcp INT64,
      origins_with_any_ttfb INT64,
      origins_with_any_inp INT64,
      origins_with_good_cwv INT64,
      origins_eligible_for_cwv INT64
  >>
) RETURNS ARRAY<STRUCT<
  name STRING,
  desktop STRUCT<
    good_number INT64,
    tested INT64
  >,
  mobile STRUCT<
    good_number INT64,
    tested INT64
  >
>> LANGUAGE js AS '''
const METRIC_MAP = {
  overall: ['origins_with_good_cwv', 'origins_eligible_for_cwv'],
  LCP: ['origins_with_good_lcp', 'origins_with_any_lcp'],
  CLS: ['origins_with_good_cls', 'origins_with_any_cls'],
  FID: ['origins_with_good_fid', 'origins_with_any_fid'],
  FCP: ['origins_with_good_fcp', 'origins_with_any_fcp'],
  TTFB: ['origins_with_good_ttfb', 'origins_with_any_ttfb'],
  INP: ['origins_with_good_inp', 'origins_with_any_inp']
};

// Initialize the vitals map.
const vitals = Object.fromEntries(Object.keys(METRIC_MAP).map(metricName => {
  return [metricName, {name: metricName}];
}));

// Populate each client record.
records.forEach(record => {
  Object.entries(METRIC_MAP).forEach(([metricName, [good_number, tested]]) => {
    vitals[metricName][record.client] = {good_number: record[good_number], tested: record[tested]};
  });
});

return Object.values(vitals);
''';

SELECT
  date,
  app AS technology,
  rank,
  geo,
  GET_VITALS(ARRAY_AGG(STRUCT(
    client,
    origins_with_good_fid,
    origins_with_good_cls,
    origins_with_good_lcp,
    origins_with_good_fcp,
    origins_with_good_ttfb,
    origins_with_good_inp,
    origins_with_any_fid,
    origins_with_any_cls,
    origins_with_any_lcp,
    origins_with_any_fcp,
    origins_with_any_ttfb,
    origins_with_any_inp,
    origins_with_good_cwv,
    origins_eligible_for_cwv
  ))) AS vitals
FROM
  `httparchive.core_web_vitals.technologies`
WHERE
  date = '2023-07-01'
GROUP BY
  date,
  app,
  rank,
  geo
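
The pivot performed inside GET_VITALS can also be tried outside BigQuery. The sketch below wraps the same logic in a plain function for Node; the function name `getVitals`, the trimmed two-metric map, and the input numbers are illustrative assumptions, not real data.

```javascript
// Field names per metric: [count of good origins, count of origins with data].
// Trimmed to two metrics to keep the sketch short.
const METRIC_MAP = {
  overall: ['origins_with_good_cwv', 'origins_eligible_for_cwv'],
  LCP: ['origins_with_good_lcp', 'origins_with_any_lcp'],
};

// Same pivot as the UDF body: one input row per client becomes one output
// object per metric, with the client name used as a key.
function getVitals(records) {
  const vitals = Object.fromEntries(
    Object.keys(METRIC_MAP).map(name => [name, {name}])
  );
  records.forEach(record => {
    Object.entries(METRIC_MAP).forEach(([name, [goodField, testedField]]) => {
      vitals[name][record.client] = {
        good_number: record[goodField],
        tested: record[testedField],
      };
    });
  });
  return Object.values(vitals);
}

// Made-up input rows, one per client.
const sample = [
  {client: 'desktop', origins_with_good_cwv: 1, origins_eligible_for_cwv: 10,
   origins_with_good_lcp: 4, origins_with_any_lcp: 10},
  {client: 'mobile', origins_with_good_cwv: 3, origins_eligible_for_cwv: 9,
   origins_with_good_lcp: 5, origins_with_any_lcp: 9},
];
console.log(JSON.stringify(getVitals(sample), null, 2));
```

Because the client name becomes a key, a row with a client value other than desktop or mobile would add a key that the declared return type does not list.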

Example record

{
  "date": "2023-07-01",
  "technology": "WordPress",
  "rank": "ALL",
  "geo": "ALL",
  "vitals": [{
    "name": "overall",
    "desktop": {
      "good_number": "739420",
      "tested": "1971445"
    },
    "mobile": {
      "good_number": "1070555",
      "tested": "3292209"
    }
  }, {
    "name": "LCP",
    "desktop": {
      "good_number": "1078119",
      "tested": "1972903"
    },
    "mobile": {
      "good_number": "1295035",
      "tested": "3300870"
    }
  }, {
    "name": "CLS",
    "desktop": {
      "good_number": "1297985",
      "tested": "2032713"
    },
    "mobile": {
      "good_number": "2602281",
      "tested": "3399202"
    }
  }, {
    "name": "FID",
    "desktop": {
      "good_number": "1440310",
      "tested": "1440661"
    },
    "mobile": {
      "good_number": "2145622",
      "tested": "2210823"
    }
  }, {
    "name": "FCP",
    "desktop": {
      "good_number": "970150",
      "tested": "2040628"
    },
    "mobile": {
      "good_number": "1037134",
      "tested": "3422903"
    }
  }, {
    "name": "TTFB",
    "desktop": {
      "good_number": "705015",
      "tested": "2005349"
    },
    "mobile": {
      "good_number": "690877",
      "tested": "3229472"
    }
  }, {
    "name": "INP",
    "desktop": {
      "good_number": "1589573",
      "tested": "1614185"
    },
    "mobile": {
      "good_number": "1691019",
      "tested": "2434145"
    }
  }]
}

rviscomi commented 1 year ago

Note that across_dataset is omitted as we only have data at desktop/mobile granularity.
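
The counts in the example record are serialized as strings, and the percentage is left for the consumer to derive. A rough sketch of that step follows; the helper name `withPercentage` is hypothetical, and `good_percentage` borrows the field name from the original proposal.

```javascript
// Convert the string counts of one client entry into numbers and add the
// share of good origins. The share is null when nothing was tested.
function withPercentage(clientEntry) {
  const good = Number(clientEntry.good_number);
  const tested = Number(clientEntry.tested);
  return {
    good_number: good,
    tested,
    good_percentage: tested > 0 ? good / tested : null,
  };
}

// Values copied from the "overall" entry of the example record.
const overall = {
  name: 'overall',
  desktop: {good_number: '739420', tested: '1971445'},
  mobile: {good_number: '1070555', tested: '3292209'},
};

const desktop = withPercentage(overall.desktop);
const mobile = withPercentage(overall.mobile);
console.log(desktop.good_percentage.toFixed(3));
console.log(mobile.good_percentage.toFixed(3));
```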

maceto commented 1 year ago

@rviscomi, should we have any mandatory param for this endpoint?

rviscomi commented 1 year ago

@sarahfossheim

sarahfossheim commented 1 year ago

Yes, those make sense 👍🏻

maceto commented 1 year ago

Example of how to consume this endpoint

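# --globoff stops curl from treating the [ ] in the technology value as a URL glob range.
curl --globoff --request GET \
  --url 'https://dev-gw-2vzgiib6.ue.gateway.dev/v1/cwv?geo=Uruguay&technology=["DomainFactory"]&rank=ALL'

maceto commented 1 year ago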

@rviscomi @sarahfossheim, all the changes discussed are already deployed.

New URL: https://dev-gw-2vzgiib6.uk.gateway.dev/v1/cwv

Documentation: https://github.com/HTTPArchive/tech-report-apis#get-cwv
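
For callers using JavaScript, the request could be built along these lines. This is a sketch that assumes the redeployed endpoint still accepts the `geo`, `technology`, and `rank` parameters from the earlier curl example; the linked documentation is the authority on the current parameters. `buildCwvUrl` and `getCwv` are hypothetical helper names.

```javascript
// Base URL taken from the comment above; parameter names from the curl example.
const BASE_URL = 'https://dev-gw-2vzgiib6.uk.gateway.dev/v1/cwv';

// Build the request URL. URLSearchParams percent-encodes the brackets and
// quotes of the JSON array, so no manual escaping is needed.
function buildCwvUrl({geo, technology, rank}) {
  const url = new URL(BASE_URL);
  url.searchParams.set('geo', geo);
  // The curl example passes technology as a JSON array literal.
  url.searchParams.set('technology', JSON.stringify(technology));
  url.searchParams.set('rank', rank);
  return url.toString();
}

// Hypothetical helper: fetch and decode the response (global fetch, Node 18+).
async function getCwv(params) {
  const response = await fetch(buildCwvUrl(params));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

A call such as `buildCwvUrl({geo: 'Uruguay', technology: ['DomainFactory'], rank: 'ALL'})` reproduces the query from the curl example in encoded form.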