When pyorbital raises an uncaught exception, for example due to https://github.com/pytroll/pyorbital/issues/74, the gatherer stops gathering. The last sign of life in my gatherer logfile is:
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pytroll_collectors/trigger.py", line 397, in run
    self.process(msg)
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pytroll_collectors/trigger.py", line 111, in add_file
    self._do(pathname)
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pytroll_collectors/trigger.py", line 107, in _do
    Trigger._do(self, mda)
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pytroll_collectors/trigger.py", line 86, in _do
    res = collector(metadata.copy())
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pytroll_collectors/region_collector.py", line 65, in __call__
    return self.collect(granule_metadata)
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pytroll_collectors/region_collector.py", line 147, in collect
    granule_pass = Pass(platform, start_time, end_time,
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/trollsched/satpass.py", line 176, in __init__
    self.orb = orbital.Orbital(satellite, line1=tle1, line2=tle2)
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pyorbital/orbital.py", line 164, in __init__
    self.tle = tlefile.read(satellite, tle_file=tle_file,
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pyorbital/tlefile.py", line 106, in read
    return Tle(platform, tle_file=tle_file, line1=line1, line2=line2)
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pyorbital/tlefile.py", line 154, in __init__
    self._read_tle()
  File "/opt/pytroll/pytroll_inst/miniconda3/envs/pytroll-py38/lib/python3.8/site-packages/pyorbital/tlefile.py", line 200, in _read_tle
    urls = (max(glob.glob(os.environ["TLES"]),
ValueError: max() arg is an empty sequence
Despite the exception, the daemon is still running according to supervisorctl.
This is the worst of both worlds: the process keeps running, so Supervisor does not restart it, but it no longer does anything, so production has stopped.
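The traceback starts in `threading.py`, so the exception killed a worker thread rather than the whole process. A minimal, standalone sketch of that failure mode follows; `worker` and `seen` are hypothetical names for illustration and are not part of pytroll_collectors:

```python
import threading

seen = []  # exceptions that escaped a thread (hypothetical helper list)


def record_exception(args):
    # threading.excepthook (Python 3.8+) is called for uncaught thread exceptions.
    seen.append(args.exc_type)


def worker():
    # Stand-in for the thread that processes messages; fails like the traceback above.
    raise ValueError("max() arg is an empty sequence")


threading.excepthook = record_exception

thread = threading.Thread(target=worker)
thread.start()
thread.join()

# The worker thread has died, but the main thread (and so the process) is still alive.
thread_alive = thread.is_alive()
main_alive = threading.main_thread().is_alive()
print(thread_alive, main_alive, seen)
```

From the outside, a process in this state looks healthy to a process supervisor, even though the thread doing the real work is gone.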
The gatherer should catch and log exceptions raised downstream (such as by pyorbital), then try to resume gathering if at all possible.
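A rough sketch of what this could look like, assuming each message can be processed independently. The names `process_all`, `flaky_process`, and the logger name are hypothetical and do not come from pytroll_collectors; the real fix would belong in the thread's `run` loop:

```python
import logging

logger = logging.getLogger("gatherer_sketch")  # hypothetical logger name


def process_all(messages, process):
    """Process every message, logging failures instead of letting them escape."""
    handled, failed = [], []
    for msg in messages:
        try:
            process(msg)
        except Exception:
            # Log the full traceback, then carry on with the next message.
            logger.exception("Failed to process %r, continuing", msg)
            failed.append(msg)
        else:
            handled.append(msg)
    return handled, failed


def flaky_process(msg):
    # Stand-in for a collector call that fails on some inputs.
    if msg == "bad":
        raise ValueError("max() arg is an empty sequence")


handled, failed = process_all(["a", "bad", "c"], flaky_process)
print(handled, failed)
```

Catching `Exception` rather than a bare `except` leaves `KeyboardInterrupt` and `SystemExit` alone, so the daemon can still be shut down normally.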