HTTPArchive / tech-report-apis

APIs for the HTTP Archive Technology Report
Apache License 2.0

Page Weight API #10

Open rviscomi opened 1 year ago

rviscomi commented 1 year ago

Query

CREATE TEMPORARY FUNCTION GET_PAGE_WEIGHT(
  records ARRAY<STRUCT<
    client STRING,
    total INT64,
    js INT64,
    images INT64
  >>
) RETURNS ARRAY<STRUCT<
  name STRING,
  mobile STRUCT<
    median_bytes INT64
  >,
  desktop STRUCT<
    median_bytes INT64
  >
>> LANGUAGE js AS '''
const METRICS = ['total', 'js', 'images'];

// Initialize the page weight map.
const pageWeight = Object.fromEntries(METRICS.map(metricName => {
  return [metricName, {name: metricName}];
}));

// Populate each client record.
records.forEach(record => {
  METRICS.forEach(metricName => {
    pageWeight[metricName][record.client] = {median_bytes: record[metricName]};
  });
});

return Object.values(pageWeight);
''';

SELECT
  date,
  app AS technology,
  rank,
  geo,
  GET_PAGE_WEIGHT(ARRAY_AGG(STRUCT(
    client,
    median_bytes_total,
    median_bytes_js,
    median_bytes_image
  ))) AS pageWeight
FROM
  `httparchive.core_web_vitals.technologies`
WHERE
  date = '2023-07-01'
GROUP BY
  date,
  app,
  rank,
  geo

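The reshaping step inside GET_PAGE_WEIGHT can be tried outside BigQuery. The following is a minimal sketch in plain JavaScript that wraps the same UDF body in a function. The wrapper name `getPageWeight` and the `sampleRecords` input are illustrative only, with byte values copied from the WordPress record in this issue. Note that the UDF body reads each field by the name declared in its signature (`total`, `js`, `images`).

```javascript
// Standalone copy of the GET_PAGE_WEIGHT UDF body.
// The wrapper function name is illustrative, not part of the query.
const METRICS = ['total', 'js', 'images'];

function getPageWeight(records) {
  // Start with one entry per metric, e.g. {name: 'total'}.
  const pageWeight = Object.fromEntries(
    METRICS.map(metricName => [metricName, {name: metricName}])
  );

  // Attach each client's median under its client name ('mobile' or 'desktop').
  records.forEach(record => {
    METRICS.forEach(metricName => {
      pageWeight[metricName][record.client] = {median_bytes: record[metricName]};
    });
  });

  return Object.values(pageWeight);
}

// Sample input (assumed shape): one record per client for a single group.
const sampleRecords = [
  {client: 'mobile', total: 2231859, js: 563278, images: 890780},
  {client: 'desktop', total: 2600099, js: 652651, images: 1048110},
];

const result = getPageWeight(sampleRecords);
console.log(JSON.stringify(result, null, 2));
```

Each element of `result` carries a metric `name` plus one `median_bytes` object per client, which is the shape of the `pageWeight` array in the example record.
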
Example record

[{
  "date": "2023-07-01",
  "technology": "WordPress",
  "rank": "ALL",
  "geo": "ALL",
  "pageWeight": [{
    "name": "total",
    "mobile": {
      "median_bytes": "2231859"
    },
    "desktop": {
      "median_bytes": "2600099"
    }
  }, {
    "name": "js",
    "mobile": {
      "median_bytes": "563278"
    },
    "desktop": {
      "median_bytes": "652651"
    }
  }, {
    "name": "images",
    "mobile": {
      "median_bytes": "890780"
    },
    "desktop": {
      "median_bytes": "1048110"
    }
  }]
}]

maceto commented 1 year ago

The endpoint was added to the API; see the docs: https://github.com/HTTPArchive/tech-report-apis/blob/main/README.md#get-page-weight

Working on the data script.

sarahfossheim commented 1 year ago

@maceto could you format the data the same as the other endpoints?

Current behavior

Currently it returns the data in the same flat, per-client format as the JSON files:

[{
  "date": "2023-07-01",
  "technology": "WordPress",
  "rank": "ALL",
  "geo": "ALL",
  "client": "desktop",
  "median_bytes_total": "2600099",
  "median_bytes_js": "652651",
  "median_bytes_image": "1048110"
}]

Desired behavior

Something similar to the other endpoints, see for example: https://github.com/HTTPArchive/tech-report-apis/issues/4

Example format:

[{
  "date": "2023-07-01",
  "technology": "WordPress",
  "rank": "ALL",
  "geo": "ALL",
  "pageWeight": [{ // Or "page-weight"
    "name": "total", // Total, JS, Image
    "mobile": {
      "median_bytes": 1234
    },
    "desktop": {
      "median_bytes": 567
    }
  }]
}]
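
One way the flat rows could be reshaped into the requested format is sketched below in plain JavaScript. This is not the endpoint's actual implementation: the function name `nestPageWeight`, the grouping key, and the conversion of byte strings to numbers are all assumptions made for illustration. The metric names follow the query at the top of this issue.

```javascript
// Hypothetical helper: groups flat per-client rows into one record per
// date/technology/rank/geo, each with a nested pageWeight array.
const METRIC_FIELDS = {
  total: 'median_bytes_total',
  js: 'median_bytes_js',
  images: 'median_bytes_image',
};

function nestPageWeight(rows) {
  const grouped = new Map();

  for (const row of rows) {
    // Rows that share these four fields belong to the same output record.
    const key = JSON.stringify([row.date, row.technology, row.rank, row.geo]);
    if (!grouped.has(key)) {
      grouped.set(key, {
        date: row.date,
        technology: row.technology,
        rank: row.rank,
        geo: row.geo,
        pageWeight: Object.keys(METRIC_FIELDS).map(name => ({name})),
      });
    }

    const record = grouped.get(key);
    for (const metric of record.pageWeight) {
      // Assumption: byte counts arrive as strings and are returned as numbers.
      metric[row.client] = {
        median_bytes: Number(row[METRIC_FIELDS[metric.name]]),
      };
    }
  }

  return [...grouped.values()];
}

// Flat rows in the current format, using the WordPress values from this issue.
const flatRows = [
  {date: '2023-07-01', technology: 'WordPress', rank: 'ALL', geo: 'ALL',
   client: 'desktop', median_bytes_total: '2600099',
   median_bytes_js: '652651', median_bytes_image: '1048110'},
  {date: '2023-07-01', technology: 'WordPress', rank: 'ALL', geo: 'ALL',
   client: 'mobile', median_bytes_total: '2231859',
   median_bytes_js: '563278', median_bytes_image: '890780'},
];

const nested = nestPageWeight(flatRows);
console.log(JSON.stringify(nested, null, 2));
```

Grouping on the API side like this would keep the stored JSON files unchanged; the alternative is to nest the data at query time, as the GET_PAGE_WEIGHT function in the original query does.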