Open boegel opened 2 years ago
Same problem happened with https://github.com/easybuilders/easybuild-easyconfigs/pull/14700, so the issue is not due to long-running installations.
Maybe it has something to do with also starting to run the bot on the jsc-zen2 virtual cluster, and we're hitting GitHub rate limits by generating too many requests?
cc @SebastianAchilles
I'm seeing this pop up more frequently again on jsc-zen2. Not sure why we're seeing it there more often...
Maybe we should let EasyBuild try multiple times to upload the test report, before giving up?
Something like (in overall_test_report):
    # count attempts from 1, so the first retry waits 5 seconds rather than 0
    for attempt in range(1, 6):
        try:
            self.log.debug("Attempt #%d to post test report...", attempt)
            post_pr_test_report(...)
            break
        except Exception as err:
            self.log.warning("Posting test report failed: %s", err)
            time.sleep(5 * attempt)
Just had this happen on my build of https://github.com/easybuilders/easybuild-easyconfigs/pull/16631
Edit: I got a 500 error instead of a 403, so I'm not sure whether it is the same problem:
Adding comment to easybuild-easyconfigs issue #16631: 'Test report by @branfosj
**SUCCESS**
Build succeeded for 1 out of 1 (1 easyconfigs in total)
bear-pg0105u36b.bear.cluster - Linux RHEL 8.7, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (icelake), Python 3.6.8
See https://gist.github.com/d41ef5a2de2c1db68ec51e24420f0a0a for a full test report.'
Traceback (most recent call last):
File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/rds/projects/2017/branfosj-rse/easybuild/src/easybuild-framework/easybuild/main.py", line 601, in <module>
main()
File "/rds/projects/2017/branfosj-rse/easybuild/src/easybuild-framework/easybuild/main.py", line 580, in main
test_report_msg = overall_test_report(ecs_with_res, len(paths), overall_success, success_msg, init_session_state)
File "/rds/projects/2017/branfosj-rse/easybuild/src/easybuild-framework/easybuild/tools/testing.py", line 378, in overall_test_report
success)
File "/rds/projects/2017/branfosj-rse/easybuild/src/easybuild-framework/easybuild/tools/testing.py", line 340, in post_pr_test_report
post_comment_in_issue(pr_nr, comment, account=pr_target_account, repo=pr_target_repo, github_user=github_user)
File "/rds/projects/2017/branfosj-rse/easybuild/src/easybuild-framework/easybuild/tools/github.py", line 635, in post_comment_in_issue
status, data = pr_url.comments.post(body={'body': txt})
File "/rds/projects/2017/branfosj-rse/easybuild/src/easybuild-framework/easybuild/base/rest.py", line 137, in post
return self.request(self.POST, url, json.dumps(body), headers, content_type='application/json')
File "/rds/projects/2017/branfosj-rse/easybuild/src/easybuild-framework/easybuild/base/rest.py", line 174, in request
conn = self.get_connection(method, url, body, headers)
File "/rds/projects/2017/branfosj-rse/easybuild/src/easybuild-framework/easybuild/base/rest.py", line 215, in get_connection
connection = self.opener.open(request)
File "/usr/lib64/python3.6/urllib/request.py", line 532, in open
response = meth(req, response)
File "/usr/lib64/python3.6/urllib/request.py", line 642, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib64/python3.6/urllib/request.py", line 570, in error
return self._call_chain(*args)
File "/usr/lib64/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/usr/lib64/python3.6/urllib/request.py", line 650, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 500: Internal Server Error
@branfosj That looks like a different problem, but the solution should be similar: just try again, a couple of times, with some delay in between, if uploading the test report failed for whatever reason...
I believe there are two places this can fail and the difference in the error is related:
create_gist - i.e. that we fail to create a gist with the build report information
post_comment_in_issue - i.e. that we fail to add a message to the PR about the build
If we add a loop around calling post_pr_test_report I suspect that we will do some of the work multiple times, so we may end up with several gists or comments created - especially as we can be working with multiple PRs at one time.
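One way to avoid duplicate gists or comments could be to retry each step on its own, rather than looping around the whole of post_pr_test_report. A minimal sketch of that idea follows; `retry_call`, `upload_report`, and the two callables passed in are hypothetical stand-ins for illustration, not the actual EasyBuild functions or their signatures:

```python
import time


def retry_call(func, attempts=3, delay=0.0):
    """Call func() up to `attempts` times; re-raise the last error if all fail."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise
            # wait a bit longer after each failed attempt
            time.sleep(delay * attempt)


def upload_report(make_gist, post_comment, attempts=3, delay=0.0):
    """Retry each step separately, so a gist that was created is not recreated."""
    # step 1: create the gist; retried on its own
    gist_url = retry_call(make_gist, attempts, delay)
    # step 2: post the comment; a failure here retries only the comment
    retry_call(lambda: post_comment(gist_url), attempts, delay)
    return gist_url
```

This only prevents repeating a step that reported success. It does not cover the case where GitHub processed a request but still returned an error, so a duplicate comment would remain possible there.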
We should catch the 403/500, and just try again (a couple of times if needed), while trying to make sure we don't do the same thing twice. This is now happening more often, and it's getting pretty annoying...
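As a rough illustration of catching only the 403/500 responses, the sketch below retries on `urllib.error.HTTPError` with those status codes and lets any other error propagate immediately. The helper name `retry_on_http_error`, the set of retryable codes, and the delays are assumptions for illustration, not existing EasyBuild code:

```python
import time
from urllib.error import HTTPError

# assumption: the two status codes reported in this issue
RETRYABLE_CODES = (403, 500)


def retry_on_http_error(func, attempts=5, base_delay=5, sleep=time.sleep):
    """Call func(); retry with an increasing delay on retryable HTTP errors only."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except HTTPError as err:
            # give up on non-retryable codes, or when out of attempts
            if err.code not in RETRYABLE_CODES or attempt == attempts:
                raise
            sleep(base_delay * attempt)
```

Passing `sleep` as a parameter is just a convenience so the delay can be replaced in tests.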
We're currently seeing trouble with hitting rate limits for the boegelbot GitHub account, with the same symptoms:
An exception occurred when trying to use token for authenticated GitHub access: HTTP Error 403: Forbidden
I've seen this occur several times now when boegelbot is trying to upload a test report for a PR from generoso, in particular for a long-running test like Trilinos in #14547...
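If the 403 really does come from rate limiting (which this thread does not confirm), GitHub's REST API reports the state in response headers such as `x-ratelimit-remaining`, `x-ratelimit-reset`, and sometimes `retry-after`, and those could be used to decide how long to wait before retrying. A sketch, where `seconds_to_wait` is a hypothetical helper and the 60-second fallback is an arbitrary assumption:

```python
import time


def seconds_to_wait(headers, now=None, fallback=60):
    """Estimate how long to wait before retrying, based on response headers."""
    now = time.time() if now is None else now
    # normalise header names, since HTTP header names are case-insensitive
    hdrs = {str(key).lower(): value for key, value in headers.items()}
    if "retry-after" in hdrs:
        return max(0, int(hdrs["retry-after"]))
    if hdrs.get("x-ratelimit-remaining") == "0" and "x-ratelimit-reset" in hdrs:
        # x-ratelimit-reset is a UTC epoch timestamp in seconds
        return max(0, int(hdrs["x-ratelimit-reset"]) - int(now))
    # no rate-limit information available: fall back to a fixed delay
    return fallback
```

With `urllib`, the headers of a failed request are available on the caught `HTTPError` as `err.headers`.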