The algo has somewhat randomly started getting a lot of:
```
Exception: {'statusCode': 500, 'errorCode': 'SERVER_ERROR', 'message': 'Internal server error'}
```
These are raised by the `gql` library that `seer-py` uses internally. It's unclear whether they originate from our GQL server or further downstream, e.g. as rate limiting from S3; opinions vary. See also https://github.com/seermedical/seer-algo/issues/113.
In any case, `seer-py` doesn't appear to be catching or retrying these, so this is a tiny change to add `SERVER_ERROR` to the set of errors it retries on (roughly along the lines of the sketch below). In the best case this largely fixes things; at worst it will probably just drag things out for a few minutes before an instance turns into a zombie. Either way it seems a worthwhile first step before a more drastic measure like switching to direct S3 access.
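For context, a minimal sketch of the kind of retry this enables. This is not `seer-py`'s actual code; `execute_query` is a hypothetical stand-in for whatever callable issues the GraphQL request, and the string matching just targets the error payload shown above.

```python
import time


def execute_with_retry(execute_query, max_attempts=5, base_delay=2.0):
    """Retry a GraphQL call when the server returns the 500 SERVER_ERROR payload.

    `execute_query` is a zero-argument callable, e.g.
    `lambda: gql_client.execute(query)`. Non-retryable errors are re-raised
    immediately; retryable ones back off exponentially before giving up.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return execute_query()
        except Exception as error:  # gql surfaces the server payload in the exception message
            message = str(error)
            retryable = "SERVER_ERROR" in message or "'statusCode': 500" in message
            if not retryable or attempt == max_attempts:
                raise
            # Exponential backoff: 2s, 4s, 8s, ... between attempts.
            time.sleep(base_delay * 2 ** (attempt - 1))
```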
P.S. my `pre-commit` hook YAPFed a bunch of bad indentation that had crept into the file; sorry that this obscures the actual meaningful change.