I recently introduced a bug such that EE2 would throw an exception from the finish_job method, because catalog_utils.catalog is not an extant field: https://github.com/kbase/execution_engine2/blob/aecddbf3357eba07c2da92a1538387a2779e6ea1/lib/execution_engine2/sdk/EE2Status.py#L549

However, the job runner, when running an app in the narrative, seemed to ignore the exception - it did not print a stack trace, at least. Instead it logged the following:
Job is done
1615180489.855946:ERROR:jr:WARNING: Module DataFileUtil was already used once for this job. Using cached version: url: https://github.com/kbaseapps/DataFileUtil commit: b640d1ce826f46d1bff497a2f475a4c7de016a18 version: 0.1.4 release: release
INFO:root:Running docker container for image: dockerhub-ci.kbase.us/kbase:datafileutil.b640d1ce826f46d1bff497a2f475a4c7de016a18
INFO:root:About to run 34f56f72-7fcd-11eb-849a-02ac072d09e7 dockerhub-ci.kbase.us/kbase:datafileutil.b640d1ce826f46d1bff497a2f475a4c7de016a18
INFO:root:Container id is b450fe482fc4f5315485edb4aa127b6deff68c644002064bc30230132ec880f3
1615180494.837040:ERROR:jr:WARNING: Module DataFileUtil was already used once for this job. Using cached version: url: https://github.com/kbaseapps/DataFileUtil commit: b640d1ce826f46d1bff497a2f475a4c7de016a18 version: 0.1.4 release: release
INFO:root:Running docker container for image: dockerhub-ci.kbase.us/kbase:datafileutil.b640d1ce826f46d1bff497a2f475a4c7de016a18
INFO:root:About to run 37eeefaa-7fcd-11eb-849a-02ac072d09e7 dockerhub-ci.kbase.us/kbase:datafileutil.b640d1ce826f46d1bff497a2f475a4c7de016a18
INFO:root:Container id is 93eb0680402df0c88d3e2066f64efb77a08df1fcde02701ed59ee21767fae635
1615180499.823481:ERROR:jr:WARNING: Module DataFileUtil was already used once for this job. Using cached version: url: https://github.com/kbaseapps/DataFileUtil commit: b640d1ce826f46d1bff497a2f475a4c7de016a18 version: 0.1.4 release: release
INFO:root:Running docker container for image: dockerhub-ci.kbase.us/kbase:datafileutil.b640d1ce826f46d1bff497a2f475a4c7de016a18
INFO:root:About to run 3ae7dc9e-7fcd-11eb-849a-02ac072d09e7 dockerhub-ci.kbase.us/kbase:datafileutil.b640d1ce826f46d1bff497a2f475a4c7de016a18
INFO:root:Container id is 36ea0f819dd8fcb5a2534e5cafd6775db56841afab80d199cd0ef3413100862e
INFO:sanic.root:Stopping worker [11342]
INFO:sanic.root:Server Stopped
INFO:jr:Logger initialized for 6045b269172cf13980dabe9a
The monitor jobrunner script killed your job. It is possible the node got overloaded with too many jobs! Memory={'total': 33729970176, 'available': 31053275136, 'percent': 7.9, 'used': 2393518080, 'free': 8088121344, 'active': 2393497600, 'inactive': 19168612352, 'buffers': 136212480, 'cached': 23112118272, 'shared': 1810432, 'slab': 3632816128} CPU=26.1
After fixing the bug, the logs ended after the "Job is done" line.
I would expect the job runner to print the exception and log some sort of severe warning that it could not complete the job properly.
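The expected behavior above could look something like the following minimal sketch. To be clear, this is not the actual jobrunner code: the function name safe_finish_job, the ee2_client interface, and the log wording are all assumptions made for illustration.

```python
import logging
import traceback

logger = logging.getLogger("jr")


def safe_finish_job(ee2_client, job_id, output):
    """Report job completion to EE2, logging loudly if the call fails.

    Hypothetical sketch: the real jobrunner / EE2 client API may differ.
    """
    try:
        ee2_client.finish_job(job_id, output)
        logger.info("Job is done")
        return True
    except Exception:
        # Surface the failure instead of silently swallowing it: emit the
        # full stack trace plus an unmissable severity marker.
        logger.critical(
            "SEVERE: could not finish job %s cleanly; EE2 raised:\n%s",
            job_id,
            traceback.format_exc(),
        )
        return False
```

With a wrapper like this, a bad field access inside finish_job (e.g. an AttributeError) would show up in the job logs as a stack trace rather than the job simply appearing to hang until the monitor killed it.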