nextstrain / status

Nextstrain status pages
https://nextstrain.github.io/status/
1 stars 0 forks source link

Pathogen workflow status page includes non-default branch runs #8

Closed joverlee521 closed 2 months ago

joverlee521 commented 2 months ago

The shown run is a test run in the zika repo that was not run on the main branch.

Screenshot 2024-04-10 at 10 14 41 AM

Looks like we should only be including runs that are from the default branch

https://github.com/nextstrain/status/blob/507429a7195f1d52eb6d32343016115851f02afa/pathogen-workflows.sql#L122-L124

joverlee521 commented 2 months ago

Huh, the execution plan does not even include the head_branch filter

Execution plan ``` Aggregate (cost=20001330465662.70..20001330465662.71 rows=1 width=32) CTE repository -> Nested Loop (cost=0.01..20000002500000.01 rows=100000000 width=64) -> Foreign Scan on net_http_request (cost=0.00..20000000000000.00 rows=1000000 width=200) Filter: (url = 'https://api.github.com/orgs/nextstrain/repos?per_page=100&sort=full_name'::text) -> Function Scan on jsonb_array_elements r (cost=0.01..1.01 rows=100 width=32) CTE workflow_id -> HashSetOp Except (cost=0.00..3890006.12 rows=40000 width=44) -> Append (cost=0.00..3885006.12 rows=1000001 width=44) -> Subquery Scan on "*SELECT* 1" (cost=0.00..3880006.09 rows=1000000 width=44) -> Nested Loop (cost=0.00..3870006.09 rows=1000000 width=40) -> Subquery Scan on "ANY_subquery" (cost=0.00..3750002.09 rows=200 width=32) -> HashSetOp Except (cost=0.00..3750000.09 rows=200 width=36) -> Append (cost=0.00..3500000.08 rows=100000003 width=36) -> Subquery Scan on "*SELECT* 1_1" (cost=0.00..3000000.00 rows=100000000 width=36) -> CTE Scan on repository (cost=0.00..2000000.00 rows=100000000 width=32) -> Subquery Scan on "*SELECT* 2_1" (cost=0.00..0.07 rows=3 width=36) -> Values Scan on "*VALUES*" (cost=0.00..0.04 rows=3 width=32) -> Foreign Scan on github_workflow (cost=0.00..600.00 rows=2 width=300) Filter: ((path ~~ '.github/workflows/%'::text) AND (repository_full_name = "ANY_subquery".repository_full_name)) -> Subquery Scan on "*SELECT* 2" (cost=0.00..0.02 rows=1 width=44) -> Result (cost=0.00..0.01 rows=1 width=36) CTE workflow -> Nested Loop (cost=0.00..24006200.00 rows=1000000 width=160) -> CTE Scan on workflow_id (cost=0.00..800.00 rows=40000 width=40) -> Foreign Scan on github_workflow github_workflow_1 (cost=0.00..600.00 rows=1 width=600) Filter: ((workflow_file_content_json @@ '($."jobs".*."uses" starts with "nextstrain/.github/.github/workflows/pathogen-repo-build.yaml@")'::jsonpath) AND (name <> 'CI'::text) AND (workflow_id.id = id) AND (workflow_id.repository_full_name = repository_full_name)) CTE run -> WindowAgg (cost=1300062373.23..1300066435.73 rows=125000 width=408) -> Sort (cost=1300062373.23..1300062685.73 rows=125000 width=384) Sort Key: workflow.repository_full_name, workflow.workflow_id, run_1.run_number DESC -> Nested Loop (cost=0.00..1300030000.00 rows=125000 width=384) -> CTE Scan on workflow (cost=0.00..20000.00 rows=1000000 width=160) -> Foreign Scan on github_actions_repository_workflow_run run_1 (cost=0.00..1300.00 rows=1 width=1300) Filter: ((created_at >= '2024-02-13 21:50:28+00'::timestamp with time zone) AND (workflow.workflow_id = workflow_id) AND (workflow.repository_full_name = repository_full_name) AND (age((CURRENT_DATE)::timestamp with time zone, created_at) <= 'P90D'::interval)) -> CTE Scan on run (cost=0.00..2812.50 rows=41667 width=24) Filter: (relative_workflow_run_number <= 30) ```
joverlee521 commented 2 months ago

Hmm, when I try to add a join with the repository CTE, I get an error

ERROR:  rpc error: code = Internal desc = github: rpc error: code = Internal desc = 'List' call for table 'github_actions_repository_workflow_run' is missing 1 required qual: column:'repository_full_name' operator: =

I don't understand what's going on under the hood in steampipe, so I'm just going to work around it with an additional CTE that filters to the default branch.

tsibley commented 2 months ago

Odd! It definitely worked as a filter when I committed it. I'd bet some Steampipe change between versions broke it. It was treading on unstable ground a bit as a hack to work around a joining issue (which I never followed up with and wrote up…). I wonder if the join approach would work now. Hmm.

tsibley commented 2 months ago

Maybe also a reason to pin the Steampipe version.