HTTPArchive / wptagent

Cross-platform WebPageTest agent

Add `rank` to `parsed_css` #14

Closed. tunetheweb closed this 2 months ago

tunetheweb commented 2 months ago

@pmeenan I know we looked at adding `rank` to `all.requests`, and that's going to take a bit more prep work, but I think we can add `rank` to the `all.parsed_css` table this month. I just discovered that the table isn't clustered at all, so it needs recreating anyway, and we might as well add `rank` at the same time.

I already have the historical data prepared in `httparchive.scratchspace.parsed_css_bk_2024_09_07`, so to recreate the production table I just need to run the SQL below.

I think these are the changes required on the agent side, but you'd know that better than me.

What do you think? Or is that too short a time to do it this month, and should we just wait for next month?

```sql
CREATE OR REPLACE TABLE
`httparchive.all.parsed_css`
PARTITION BY date
CLUSTER BY client, rank, is_root_page, page
OPTIONS(require_partition_filter = TRUE)
AS
SELECT *
FROM `httparchive.scratchspace.parsed_css_bk_2024_09_07`
WHERE date = "2024-08-01";

INSERT INTO `httparchive.all.parsed_css`
SELECT *
FROM `httparchive.scratchspace.parsed_css_bk_2024_09_07`
WHERE date = "2024-07-01";

INSERT INTO `httparchive.all.parsed_css`
SELECT *
FROM `httparchive.scratchspace.parsed_css_bk_2024_09_07`
WHERE date = "2024-06-01";

INSERT INTO `httparchive.all.parsed_css`
SELECT *
FROM `httparchive.scratchspace.parsed_css_bk_2024_09_07`
WHERE date = "2024-05-01";

INSERT INTO `httparchive.all.parsed_css`
SELECT *
FROM `httparchive.scratchspace.parsed_css_bk_2024_09_07`
WHERE date = "2024-04-01";

INSERT INTO `httparchive.all.parsed_css`
SELECT *
FROM `httparchive.scratchspace.parsed_css_bk_2024_09_07`
WHERE date = "2024-03-01";

INSERT INTO `httparchive.all.parsed_css`
SELECT *
FROM `httparchive.scratchspace.parsed_css_bk_2024_09_07`
WHERE date = "2024-02-01";

INSERT INTO `httparchive.all.parsed_css`
SELECT *
FROM `httparchive.scratchspace.parsed_css_bk_2024_09_07`
WHERE date = "2024-01-01";

INSERT INTO `httparchive.all.parsed_css`
SELECT *
FROM `httparchive.scratchspace.parsed_css_bk_2024_09_07`
WHERE date = "2023-12-01";

INSERT INTO `httparchive.all.parsed_css`
SELECT *
FROM `httparchive.scratchspace.parsed_css_bk_2024_09_07`
WHERE date = "2023-11-01";

INSERT INTO `httparchive.all.parsed_css`
SELECT *
FROM `httparchive.scratchspace.parsed_css_bk_2024_09_07`
WHERE date = "2023-10-01";

INSERT INTO `httparchive.all.parsed_css`
SELECT *
FROM `httparchive.scratchspace.parsed_css_bk_2024_09_07`
WHERE date = "2023-09-01";
```
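
Since the statements above differ only in the partition date, they could be generated instead of written by hand. The following is a minimal sketch in Python that only builds the SQL text and does not talk to BigQuery; the helper names `month_starts` and `build_statements` are hypothetical and not part of any existing tooling, and it assumes both dates are the first day of a month.

```python
from datetime import date

# Table names taken from the SQL in this issue.
SOURCE = "httparchive.scratchspace.parsed_css_bk_2024_09_07"
TARGET = "httparchive.all.parsed_css"


def month_starts(newest: date, oldest: date) -> list[date]:
    """Return the first day of each month from newest back to oldest, inclusive."""
    months = []
    current = newest
    while current >= oldest:
        months.append(current)
        # Step back one month, rolling the year over at January.
        if current.month == 1:
            current = date(current.year - 1, 12, 1)
        else:
            current = date(current.year, current.month - 1, 1)
    return months


def build_statements(newest: date, oldest: date) -> list[str]:
    """Build one CREATE OR REPLACE for the newest month, then one INSERT per older month."""
    statements = []
    for index, day in enumerate(month_starts(newest, oldest)):
        select = f'SELECT *\nFROM `{SOURCE}`\nWHERE date = "{day.isoformat()}";'
        if index == 0:
            # The first statement recreates the table with partitioning and clustering.
            header = (
                f"CREATE OR REPLACE TABLE\n`{TARGET}`\n"
                "PARTITION BY date\n"
                "CLUSTER BY client, rank, is_root_page, page\n"
                "OPTIONS(require_partition_filter = TRUE)\n"
                "AS"
            )
        else:
            # Every later statement appends one more monthly partition.
            header = f"INSERT INTO `{TARGET}`"
        statements.append(f"{header}\n{select}")
    return statements


if __name__ == "__main__":
    print("\n\n".join(build_statements(date(2024, 8, 1), date(2023, 9, 1))))
```

Keeping one statement per month, as in the hand-written version, means a failed month can be rerun on its own without redoing the whole copy.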

pmeenan commented 2 months ago

LGTM. Once the `all` table is ready I'll verify the proto and merge. FWIW, the JSON schema also needs to be updated and the build needs to be run to generate the Python bindings, but I can take care of that after it is merged.
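
For illustration only, here is a rough sketch of what the JSON schema update might look like, assuming the schema is a BigQuery-style list of `name`/`type`/`mode` field objects. The actual schema file in the agent repo, its layout, and the type chosen for `rank` are not shown in this thread, so the `EXAMPLE_SCHEMA` contents, the `INTEGER` type, and the `add_rank_field` helper are all assumptions.

```python
import json

# Hypothetical schema excerpt; column names come from the CLUSTER BY clause
# in this issue, but the types and modes are assumed.
EXAMPLE_SCHEMA = [
    {"name": "date", "type": "DATE", "mode": "REQUIRED"},
    {"name": "client", "type": "STRING", "mode": "REQUIRED"},
    {"name": "page", "type": "STRING", "mode": "REQUIRED"},
    {"name": "is_root_page", "type": "BOOLEAN", "mode": "NULLABLE"},
]


def add_rank_field(schema: list[dict]) -> list[dict]:
    """Return a copy of the schema with a `rank` field appended, if it is missing."""
    if any(field.get("name") == "rank" for field in schema):
        # Already present: leave the schema unchanged.
        return schema
    # Assumed type and mode for the new column.
    return schema + [{"name": "rank", "type": "INTEGER", "mode": "NULLABLE"}]


if __name__ == "__main__":
    print(json.dumps(add_rank_field(EXAMPLE_SCHEMA), indent=2))
```

The helper returns a new list rather than mutating its input, so it is safe to run more than once.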