ropensci-review-tools / dashboard

Dashboard for rOpenSci software peer-review
https://ropensci-review-tools.github.io/dashboard/
MIT License

Workflow failures #25

Open mpadge opened 1 month ago

mpadge commented 1 month ago

Issue to ping dev team on workflow failures. Note that the workflow needs an external token for the last step, which has write privileges here and full read privileges for the ropensci org (to read team memberships). The current token set in this repo is a personal one from me.

mpadge commented 1 month ago

@maelle @mpadge The dashboard workflow has failed.

maelle commented 1 month ago

should we only be notified when the dashboard deploy fails, as opposed to R CMD check?

mpadge commented 1 month ago

It only pings on deploy failures, and that's the first one that has worked, which is good. The problem is Slack, which I don't think can be resolved. Even the lowest tier should allow at least 1 query per minute, yet this fails intermittently with only one query every day or so. I'm just going to ignore it for now, and manually re-run the workflow each time it fails.

maelle commented 1 month ago

aaah sorry I had missed that failure, I saw the R CMD check one. Could the issue comment link to the log directly?

mpadge commented 1 month ago

Done. Can you please keep an eye on any other repos that may need this action? Adding it just requires these lines: https://github.com/ropensci-review-tools/dashboard/blob/835f304c3b982fa6777b17121c1d1929b9eacfd9/.github/workflows/publish.yaml#L54-L60 (And personal token required, alas.)

mpadge commented 1 month ago

@maelle @mpadge The dashboard workflow has failed at https://github.com/ropensci-review-tools/dashboard/actions/runs/8705894152.

mpadge commented 1 month ago

I'm going to close this issue; pings will still be delivered here regardless.

mpadge commented 1 month ago

@maelle @mpadge The dashboard workflow has failed at https://github.com/ropensci-review-tools/dashboard/actions/runs/8707293964.

mpadge commented 1 month ago

@maelle @mpadge The dashboard workflow has failed at https://github.com/ropensci-review-tools/dashboard/actions/runs/8720665626.

mpadge commented 1 month ago

@maelle @mpadge The dashboard workflow has failed at https://github.com/ropensci-review-tools/dashboard/actions/runs/8739196068.

maelle commented 1 month ago

Should the comment be posted as a bot? It makes me laugh that you're talking to yourself in these comments @mpadge

mpadge commented 1 month ago

Yes, but it would then have to be the review bot. It needs access to ropensci org, to get editor data. But I probably should update to the bot token. We then need to document that somewhere.

mpadge commented 1 month ago

@maelle @mpadge The dashboard workflow has failed at https://github.com/ropensci-review-tools/dashboard/actions/runs/8746611971.

maelle commented 1 month ago

No, it could be GitHub Actions, if you give the workflows write access in this repo.

mpadge commented 1 month ago

That's not enough, alas. The key needs admin access to ropensci org for everything to work. But of course I could just use the standard key for this final comment step, and that would resolve me-talking-to-myself anyway. Yeah, good idea me 😜

mpadge commented 1 month ago

Re-opening because the standard GH workflow key associated with this repo does not have access to run the cli. It errors with:

GraphQL: Resource not accessible by integration (addComment)
Error: Process completed with exit code 1.

So still need an explicit key. I'll use the ropensci-review-bot key, but also need to clearly document that somewhere.

ropensci-review-bot commented 1 month ago

@maelle @mpadge The dashboard workflow has failed at https://github.com/ropensci-review-tools/dashboard/actions/runs/8834664526.

ropensci-review-bot commented 1 month ago

@maelle @mpadge The dashboard workflow has failed at https://github.com/ropensci-review-tools/dashboard/actions/runs/8835165921.

mpadge commented 1 month ago

Reverted to personal token. Tokens need to be from somebody with "owner" roles both here and in ropensci, which even bot doesn't have.

maelle commented 2 weeks ago

Quitting from lines  at lines 373-377 [get-ed-dat-history] (history.qmd)
Error in `httr2::req_perform()`:
! HTTP 429 Too Many Requests.
Backtrace:
 1. dashboard::editor_status(quiet = TRUE, aggregation_period = aggregation_period)
 2. dashboard::editor_vacation_status(airtable_base_id)
 3. dashboard:::get_slack_editors_status()
 4. httr2::req_perform(req)

mpadge commented 2 weeks ago

Yeah, that's always the failure, and it is the worst combination: (1) intermittent; and (2) not reproducible. My current approach is just to look at the logs each time a failure is pinged here, and ignore all instances of that error :frowning:

maelle commented 2 weeks ago

Is there no way to rate limit / retry?

mpadge commented 2 weeks ago

Yeah, I guess that would be a good idea. I'll re-open and catch errors from the Slack call. Feel free to PR if you have time. The offending lines are https://github.com/ropensci-review-tools/dashboard/blob/main/R/editors-slack.R#L32-L33. I guess a req_retry() would be best, plus lines to simply return NULL on any errors.

mpadge commented 2 weeks ago

@maelle 429 errors are now retried up to 3 times via req_retry(), plus additional processing lines for failures. I'll leave this open for a while to make sure no more failures appear, and then (hopefully) close again. Thanks!
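
For anyone finding this thread later, here is a minimal sketch of the retry-then-return-NULL pattern discussed above. It is not the actual code in R/editors-slack.R: the helper name `perform_or_null()` and the `max_tries = 3` default are assumptions made for illustration. It relies on `httr2::req_retry()`, which by default treats HTTP 429 and 503 responses as transient and honours a `Retry-After` header when the server sends one.

```r
library(httr2)

# Hypothetical helper (not the package's own function): perform a request,
# retrying transient failures, and return NULL instead of raising an error.
perform_or_null <- function(req, max_tries = 3) {
    # By default req_retry() retries on HTTP 429 and 503 only.
    req <- req_retry(req, max_tries = max_tries)
    tryCatch(
        req_perform(req),
        error = function(e) {
            # Any remaining error (retries exhausted, connection failure,
            # other HTTP error) is reported and converted to NULL.
            message("Request failed: ", conditionMessage(e))
            NULL
        }
    )
}

# Usage sketch: callers must check for NULL before using the response.
# resp <- perform_or_null(request("https://example.com/api"))
# if (is.null(resp)) message("Skipping this step for this run")
```

Returning NULL keeps one flaky call from failing the whole render, at the cost that every caller has to handle the missing value explicitly.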