At the moment the expert is called every ~200 minutes, with a message like this:
Exception while running check ShifterOnShift: (datetime.datetime(2017, 7, 16, 7, 50), <object object at 0x7fd04a087080>, ('db', None))
The complete traceback looks like this:
2017-07-16 10:55:01,688 - custos.checks.FactIntervalCheck - ERROR - Exception while running check
Traceback (most recent call last):
  File "/opt/conda/lib/python3.5/site-packages/custos/checks/__init__.py", line 82, in wrapped_check
    self.check(*args, **kwargs)
  File "/opt/conda/lib/python3.5/site-packages/shifthelper/checks.py", line 38, in check
    if all([f() for f in self.checklist]):
  File "/opt/conda/lib/python3.5/site-packages/shifthelper/checks.py", line 38, in <listcomp>
    if all([f() for f in self.checklist]):
  File "/opt/conda/lib/python3.5/site-packages/wrapt/wrappers.py", line 522, in __call__
    args, kwargs)
  File "/opt/conda/lib/python3.5/site-packages/shifthelper/debug_log_wrapper.py", line 9, in log_call_and_result
    result = wrapped(*args, **kwargs)
  File "/opt/conda/lib/python3.5/site-packages/shifthelper/conditions.py", line 313, in is_nobody_on_shift
    get_current_shifter()
  File "/opt/conda/lib/python3.5/site-packages/shifthelper/tools/shift.py", line 30, in get_current_shifter
    full_shifter_info = retrieve_shifters_from_calendar(db=db)
  File "/opt/conda/lib/python3.5/site-packages/shifthelper/tools/shift.py", line 48, in retrieve_shifters_from_calendar
    calendar_entries = retrieve_calendar_entries(time, db=db)
KeyError: (datetime.datetime(2017, 7, 16, 7, 50), <object object at 0x7fd04a087080>, ('db', None))
The call to `retrieve_calendar_entries` is cached, because I did not want to hit the DB with a few dozen requests per check interval (currently every 2 minutes).
But this optimization was almost certainly premature. The DB is a local copy of the fact DB on the SH node, so there is hardly any network in between. Also, MySQL typically caches recent queries itself (https://dev.mysql.com/doc/refman/5.7/en/query-cache.html), so there is no need to cache this inside the SH.
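For illustration, the caching pattern described above might look roughly like the sketch below. Only the name `retrieve_calendar_entries` and its `time`/`db` arguments are taken from the traceback. The function body, the `query_count` counter, the `maxsize` value and the fake rows are hypothetical stand-ins, and the use of `functools.lru_cache` is an assumption based on the linked Python bug report, not something confirmed by the shifthelper source.

```python
from datetime import datetime
from functools import lru_cache

query_count = 0  # hypothetical counter standing in for real DB traffic


@lru_cache(maxsize=20)  # assumption: the real code may use a different cache
def retrieve_calendar_entries(time, db=None):
    """Hypothetical stand-in for the cached calendar query."""
    global query_count
    query_count += 1
    # Fake rows instead of a real DB result.
    return (("example_shifter", time),)


# Simulate the few dozen identical requests made within one check interval.
now = datetime(2017, 7, 16, 7, 50)
for _ in range(30):
    retrieve_calendar_entries(now, db=None)

info = retrieve_calendar_entries.cache_info()
print(query_count, info.hits, info.misses)
```

In this sketch the body runs once and the remaining 29 calls are served from the cache, which is the saving the cache was meant to provide.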
We do not understand exactly why this cache miss currently happens every ~200 minutes, but @MaxNoe found this related Python bug report: https://bugs.python.org/issue28969
So as a remedy for this behaviour I propose to simply remove this caching.
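A minimal sketch of what the uncached variant could look like is given below. It uses an in-memory SQLite database purely as a stand-in for the local MySQL copy. The table name `calendar`, its columns and the sample row are invented for illustration and do not reflect the real schema.

```python
import sqlite3
from datetime import datetime


def retrieve_calendar_entries(time, db):
    """Uncached: runs one query per call (hypothetical schema)."""
    stamp = time.isoformat()
    cursor = db.execute(
        "SELECT username FROM calendar WHERE start <= ? AND stop > ?",
        (stamp, stamp),
    )
    return [row[0] for row in cursor.fetchall()]


# Stand-in database with a single made-up shift entry.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE calendar (start TEXT, stop TEXT, username TEXT)")
db.execute(
    "INSERT INTO calendar VALUES (?, ?, ?)",
    ("2017-07-16T00:00:00", "2017-07-17T00:00:00", "example_shifter"),
)

# Every call goes straight to the database; no cache key is ever built.
now = datetime(2017, 7, 16, 7, 50)
results = [retrieve_calendar_entries(now, db) for _ in range(30)]
```

Without the decorator there is no cache lookup that could raise, at the cost of one small query per call against a local database.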
For reference, the cached function that raises the `KeyError` is located here: https://github.com/fact-project/shifthelper/blob/7aabafd3b747290a71abba321322e8715012a2f2/shifthelper/tools/shift.py#L61