abcxyz / github-metrics-aggregator

Apache License 2.0

Leech Pipeline - FAILURE status handling #58

Open bradegler opened 1 year ago

bradegler commented 1 year ago

TL;DR

Logs with a FAILURE status are currently filtered out of the query.

We need a way to track how many attempts have been made for a particular delivery ID, preferably without updating the row in BigQuery.

The simplest approach would be to just not write a FAILURE status and let the query try again.

The drawback is that a delivery could stay stuck in that state forever, and we wouldn't want to keep reprocessing it. A secondary FAILURE table might work: we could join it into the main query with a condition along the lines of WHERE count(failures where delivery_id = x) < 10.

This adds complexity to the write operation, though, so it requires some thought.
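A minimal sketch of the secondary-table idea, offered only to make the proposal concrete. It uses Python's built-in sqlite3 as a stand-in for BigQuery, and every name in it (the events, leech_status, and failures tables, their columns, and the helper functions) is hypothetical rather than taken from the repository. Each failed attempt is appended as a new row, so no existing row is updated, and the main query skips any delivery that has reached the attempt limit.

```python
import sqlite3

# Threshold taken from the "< 10 or something" suggestion in the issue.
MAX_ATTEMPTS = 10


def make_demo_db():
    """Build an in-memory database with hypothetical tables and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE events (delivery_id TEXT PRIMARY KEY);
        -- Only successful runs are written here; failures go to their own table.
        CREATE TABLE leech_status (delivery_id TEXT, status TEXT);
        -- Append-only: one row per failed attempt, never updated.
        CREATE TABLE failures (
            delivery_id TEXT,
            error TEXT,
            created TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        );
        """
    )
    conn.executemany("INSERT INTO events VALUES (?)", [("a",), ("b",), ("c",)])
    conn.execute("INSERT INTO leech_status VALUES ('a', 'SUCCESS')")
    for _ in range(10):
        record_failure(conn, "b", "timeout")  # "b" has exhausted its attempts
    for _ in range(2):
        record_failure(conn, "c", "timeout")  # "c" can still be retried
    return conn


def record_failure(conn, delivery_id, error):
    """Append one failed attempt instead of updating a status row."""
    conn.execute(
        "INSERT INTO failures (delivery_id, error) VALUES (?, ?)",
        (delivery_id, error),
    )


def pending_deliveries(conn, max_attempts=MAX_ATTEMPTS):
    """Return delivery ids with no status row and fewer than max_attempts failures."""
    rows = conn.execute(
        """
        SELECT e.delivery_id
        FROM events AS e
        LEFT JOIN leech_status AS s ON s.delivery_id = e.delivery_id
        LEFT JOIN (
            SELECT delivery_id, COUNT(*) AS attempts
            FROM failures
            GROUP BY delivery_id
        ) AS f ON f.delivery_id = e.delivery_id
        WHERE s.delivery_id IS NULL
          AND COALESCE(f.attempts, 0) < ?
        ORDER BY e.delivery_id
        """,
        (max_attempts,),
    ).fetchall()
    return [row[0] for row in rows]
```

With the sample data, pending_deliveries(make_demo_db()) returns only "c": "a" already has a status row and "b" has hit the limit. The extra cost on the write path is one insert per failed attempt, which is the added complexity mentioned above.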

Detailed design

No response

Alternatives considered

No response

Additional information

No response