mozilla-releng / signingscript

Signing script to run in scriptworker.

Do not mark task green if `sign()` fails #128

Closed rail closed 5 years ago

rail commented 5 years ago

I was testing the new GCP workers and hit this issue due to a network timeout to the autograph server (some netflows still need to be opened). The task was marked green even though the logs showed that something went wrong:

2019-08-21 06:04:52,896 - scriptworker.client - DEBUG - Task is validated against this schema: {'title': 'Taskcluster signing task minimal schema', 'type': 'object', 'properties': {'dependencies': {'type': 'array', 'minItems': 1, 'uniqueItems': True, 'items': {'type': 'string'}}, 'scopes': {'type': 'array', 'minItems': 1, 'uniqueItems': True, 'items': {'type': 'string'}}, 'payload': {'type': 'object', 'properties': {'upstreamArtifacts': {'type': 'array', 'items': {'type': 'object', 'properties': {'taskType': {'type': 'string'}, 'taskId': {'type': 'string'}, 'formats': {'type': 'array', 'uniqueItems': True, 'items': {'type': 'string'}}, 'paths': {'type': 'array', 'minItems': 1, 'uniqueItems': True, 'items': {'type': 'string'}}}, 'required': ['taskId', 'taskType', 'paths', 'formats']}, 'minItems': 1, 'uniqueItems': True}}, 'required': ['upstreamArtifacts']}}, 'required': ['scopes', 'payload']}
2019-08-21 06:04:52,919 - asyncio - DEBUG - Using selector: EpollSelector
2019-08-21 06:04:52,923 - signingscript.utils - INFO - Loading signing server config from /app/configs/passwords.json
2019-08-21 06:04:52,924 - signingscript.utils - INFO - Signing server config loaded from /app/configs/passwords.json
2019-08-21 06:04:52,925 - signingscript.utils - INFO - mkdir /app/workdir/public/build
2019-08-21 06:04:52,925 - signingscript.utils - INFO - Copying /app/workdir/cot/YbK52RzITp6VDnh72cIlyQ/public/build/source.tar.xz to /app/workdir/public/build/source.tar.xz
2019-08-21 06:04:54,593 - signingscript.script - INFO - signing public/build/source.tar.xz
2019-08-21 06:04:54,596 - signingscript.task - INFO - sign(): Signing /app/workdir/public/build/source.tar.xz with autograph_gpg...
Automation Error: python exited with signal -9

escapewindow commented 5 years ago

Weird. Do we know which python that is? signingscript?

escapewindow commented 5 years ago

I think I know what this is. `scriptworker.task.worst_level` assumes that we want to exit with the largest of the statuses: 0 if successful, 1 for failure, 2 for worker-shutdown, etc. But the status here is -9, so when we upload artifacts, we compare -9 vs 0, and the worst_level (according to the logic) is 0.
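The comparison described above can be reproduced with a minimal sketch. `naive_worst_level` is a hypothetical stand-in that simply returns the larger of two statuses; it is an assumption for illustration and may not match the actual scriptworker code.

```python
def naive_worst_level(level1: int, level2: int) -> int:
    """Hypothetical stand-in: return the numerically larger status."""
    return level1 if level1 > level2 else level2


# An ordinary failure (1) correctly outranks success (0).
ordinary = naive_worst_level(1, 0)

# A negative status (-9, as in the log above) loses to 0,
# so the combined status looks like success.
killed = naive_worst_level(-9, 0)

print(ordinary, killed)
```

Under this assumption, any negative status is silently masked by a 0 coming from another step, which would explain the green task.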

I think we should either translate -9 to a positive int (is this wrap-around from 255?) or change the worst_level function (maybe compare absolute values of the exit codes?) or make worst_level prefer non-zero, or?

Edit: maybe we should treat a non-zero exit code as a 1 status, and we should either return that, or raise an exception that corresponds to exit code 1.
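One way the idea in the edit could look, as a rough sketch only: collapse every non-zero exit code to status 1 before it is compared with other statuses. Both `normalize_exit_code` and `naive_worst_level` are hypothetical names introduced here for illustration, not scriptworker APIs, and this is not necessarily the fix that was adopted.

```python
def normalize_exit_code(exit_code: int) -> int:
    """Hypothetical helper: treat any non-zero exit code as status 1."""
    return 0 if exit_code == 0 else 1


def naive_worst_level(level1: int, level2: int) -> int:
    """Hypothetical stand-in: return the numerically larger status."""
    return level1 if level1 > level2 else level2


# With normalization, the -9 from the log can no longer be masked by a 0.
status = naive_worst_level(normalize_exit_code(-9), 0)
print(status)
```

Note that this literal reading also collapses other positive codes (such as 2) to 1; a variant that only remaps negative codes would preserve them, if those statuses need to stay distinct.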