davidgasquez / gitcoin-grants-data-portal

🌲 Open source, serverless, and local-first data hub for Gitcoin Grants data!
https://grantsdataportal.xyz/
MIT License

Add Run Summaries to CI #16

Closed DistributedDoge closed 8 months ago

DistributedDoge commented 8 months ago

The CI setup allows execution to proceed even when some Dagster jobs have failed.

I would like to use the GitHub Run Summary to make it obvious that an asset failed to materialize, without needing to open the log.

The solution is to modify the equivalent of the make run step so that it produces a log file, and then parse that file in a new CI step, like so.

I will open a PR, but first I want to check whether I can capture STEP_SKIPPED this way.
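
A minimal sketch of what such a log-parsing step could look like. It is not the code from the PR. It assumes each Dagster log line ends with `<step_key> - <EVENT_TYPE> - <message>`, and the function names are hypothetical. `GITHUB_STEP_SUMMARY` is the file path GitHub Actions provides for run summaries.

```python
import os
import re
from pathlib import Path

# Assumed log line shape (not confirmed in this thread):
#   <timestamp> - dagster - ERROR - <job> - <run_id> - <pid> - <step_key> - STEP_FAILURE - <message>
EVENT_RE = re.compile(r" - (?P<step>\S+) - (?P<event>STEP_FAILURE|STEP_SKIPPED) - ")


def collect_step_events(log_text: str) -> list[tuple[str, str]]:
    """Return (step_key, event_type) pairs for failed or skipped steps."""
    events = []
    for line in log_text.splitlines():
        match = EVENT_RE.search(line)
        if match:
            events.append((match.group("step"), match.group("event")))
    return events


def render_summary(events: list[tuple[str, str]]) -> str:
    """Render the events as a Markdown table for the run summary."""
    if not events:
        return "No failed or skipped steps found.\n"
    rows = ["| Step | Event |", "| --- | --- |"]
    rows += [f"| {step} | {event} |" for step, event in events]
    return "\n".join(rows) + "\n"


def write_summary(log_path: str) -> None:
    """Append the table to the summary file, or print it when run locally."""
    events = collect_step_events(Path(log_path).read_text())
    summary_file = os.environ.get("GITHUB_STEP_SUMMARY")
    if summary_file:
        with open(summary_file, "a") as handle:
            handle.write(render_summary(events))
    else:
        print(render_summary(events))
```

If the real log format differs, only `EVENT_RE` should need to change. Matching STEP_SKIPPED the same way as STEP_FAILURE only works if skipped steps are written to the log at the configured log level.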

davidgasquez commented 8 months ago

Woah! Having that would be awesome. I was struggling with a similar thing recently in the Filecoin Data Portal since some steps were always failing.

I think the ideal scenario would be to make Dagster return a non-zero exit status as soon as one of the DAG assets fails. This way we don't run extra computations and GitHub Actions can fail properly. Have you dug into this?
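
One way to approximate fail-fast behaviour from outside Dagster is to wrap the run command and stop at the first failure line. This is only a sketch, not a Dagster feature. The `run_fail_fast` helper is hypothetical, and it assumes failures appear in the output as lines containing `STEP_FAILURE`.

```python
import subprocess


def run_fail_fast(command: list[str], marker: str = "STEP_FAILURE") -> int:
    """Run command, echo its output, and stop at the first line containing marker.

    Returns 1 if the marker was seen, otherwise the command's own exit code.
    """
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so log lines on either stream are seen
        text=True,
    )
    assert process.stdout is not None
    for line in process.stdout:
        print(line, end="")
        if marker in line:
            process.terminate()  # skip the remaining computations
            process.wait()
            return 1
    return process.wait()


# Hypothetical usage in a CI entry point:
#   sys.exit(run_fail_fast(["make", "run"]))
```

Note that `terminate()` only signals the direct child, so a wrapper around `make` might leave grandchild processes running. Treat this as an illustration of the exit-status idea, not a drop-in solution.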

DistributedDoge commented 8 months ago

I think I got the run summary to work, as below. The first table is the list of failed jobs; the second table lists the assets placed in the tables dir.

https://github.com/DistributedDoge/gitcoin-grants-data-portal/actions/runs/7299077671

davidgasquez commented 8 months ago

Beautiful! Love the final tables markdown.

Curious, what do you think about making Dagster fail fast and return a non-zero exit status to GH actions?

DistributedDoge commented 8 months ago

I am a bit torn here. When developing I want to fail fast and make sure everything works.

When fetching data, I am not interested in all the assets, so I would rather tolerate the failure of a table that is non-essential to me than re-run everything.
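
This trade-off could be expressed as an allowlist of essential assets, so that the run only fails when one of those breaks. A small sketch follows. The function name and the `github_repos` asset are hypothetical, and which assets count as essential is a per-user choice.

```python
def exit_code_for(failed_assets: set[str], essential_assets: set[str]) -> int:
    """Return 1 only when at least one essential asset failed, else 0."""
    return 1 if failed_assets & essential_assets else 0


# A failure in a non-essential table is tolerated...
print(exit_code_for({"github_repos"}, {"round_votes"}))
# ...while a failure in an essential one fails the run.
print(exit_code_for({"github_repos", "round_votes"}, {"round_votes"}))
```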

davidgasquez commented 8 months ago

I'm also torn on this one. The main issue right now is that if a table fails silently, it won't get uploaded to IPFS, which impacts all downstream dependents (e.g., no round_votes.parquet).

I think I'd rather not upload anything to IPFS than miss a table.
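
A rough sketch of such a guard: before the upload step, check that every expected table exists and abort otherwise. The `missing_tables` helper and the list of expected names are assumptions for illustration. Only round_votes.parquet is named in this thread.

```python
from pathlib import Path


def missing_tables(tables_dir: str, expected: list[str]) -> list[str]:
    """Return the expected table names that have no .parquet file in tables_dir."""
    present = {path.stem for path in Path(tables_dir).glob("*.parquet")}
    return [name for name in expected if name not in present]


# Hypothetical usage before the IPFS upload step:
#   missing = missing_tables("data/tables", ["round_votes"])
#   if missing:
#       raise SystemExit(f"Refusing to upload, missing tables: {missing}")
```

Exiting non-zero here would stop the workflow before anything reaches IPFS, so downstream consumers never see a partial set of tables.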