astropy / astropy

Astronomy and astrophysics core library
https://www.astropy.org
BSD 3-Clause "New" or "Revised" License

CI: failure of coverage upload doesn't change job status #16379

Open bsipocz opened 6 months ago

bsipocz commented 6 months ago

Recently the coverage status upload ran into a GitHub rate limit, but this didn't change the status of the job from green to red. Instead, it generated a false coverage percentage that sent us on a bit of a wild goose chase. Ideas for how to improve CI:

bsipocz commented 6 months ago

There is a long-standing upstream feature request for this here: https://github.com/codecov/codecov-action/issues/926

It also links to many implemented workarounds, so I strongly suggest adopting one of those (e.g. wrapping it all in a retry action) instead of waiting for an upstream, out-of-the-box fix.
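To make the retry idea concrete, here is a minimal sketch in Python of what such a wrapper does: try the upload a few times with increasing delays, and raise if every attempt fails so that the job turns red instead of silently staying green. This is only an illustration, not how codecov-action or any existing retry Action is implemented; the names `retry`, `UploadFailed`, `attempts`, and `base_delay` are all hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class UploadFailed(RuntimeError):
    """Hypothetical error raised once every attempt has failed."""


def retry(
    action: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action` until it succeeds or `attempts` is exhausted.

    The delay doubles after each failure (base_delay, 2 * base_delay, ...).
    `sleep` is injectable so the function can be exercised without waiting.
    """
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:  # assumption: any exception means "upload failed"
            last_error = exc
            if attempt < attempts:
                sleep(base_delay * 2 ** (attempt - 1))
    # Surface the failure instead of swallowing it, so the caller can exit non-zero.
    raise UploadFailed(f"gave up after {attempts} attempts") from last_error
```

In a CI script, `action` would be whatever performs the upload. Letting `UploadFailed` propagate makes the step exit with a non-zero status, which is what marks the job as failed.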

This should be an easy starter issue for a newcomer.

pllim commented 6 months ago

cc @rosteen since he was just struggling with this same problem over at jdaviz.

pllim commented 6 months ago

Thanks for the info, @bsipocz ! I will investigate.

pllim commented 6 months ago

Looking at the linked issue, I see two possible ways forward. Example implementations of each:

  1. https://github.com/nextcloud/appstore/pull/1294 -- This does not require a third-party Action. I think the maintainer re-triggers only the upload part when the upload fails. There is no automatic retry, but the Actions involved are guaranteed to be supported for a long time, and there is no need to re-run the test suite.
  2. https://github.com/tagatac/bagoup/pull/57 -- This uses a third-party Action. It looks super convenient, but there is no guarantee the upload will succeed within the allotted attempts (which means you have to rerun everything if it still fails), and I am not sure how long this Action is going to be maintained going forward.

I am leaning towards Option 1 for both astropy and jdaviz. What do you think, @bsipocz and @rosteen ?

bsipocz commented 6 months ago

The first one is fine. I think basically anything is fine as long as it avoids rerunning the tests because of an upload error.

pllim commented 6 months ago

Huh, I think for astropy, I actually have to patch the OpenAstronomy workflow (https://github.com/OpenAstronomy/github-actions-workflows/pull/199), but not for jdaviz (https://github.com/spacetelescope/jdaviz/pull/2865).

pllim commented 5 months ago

@astrofrog reverted my attempt to address this at https://github.com/OpenAstronomy/github-actions-workflows/pull/206

Also, it never quite worked because of a token failure that I cannot understand: #16535

So, for now, just check the logs diligently if you want to know if the upload really happened or not.