Closed rennerocha closed 2 years ago
I like the idea a lot. I do have some feedback for the implementation:
BaseStatsMonitor
→ BaseStatMonitor
, since it is meant to monitor a single stat.threshold
, that seems too arbitrary, instead fail if the setting is missing.Thanks for the feedback @Gallaecio !
Do not provide a default threshold, that seems too arbitrary, instead fail if the setting is missing.
I am considering doing something similar that we have in Django Rest Framework to get serializers, and use it for the threshold, so we can have more of flexibility and are able to have dynamic thresholds if the user needs it:
class BaseStatsMonitor(Monitor):
def get_threshold(self):
self.data.spider.settings.getint(self.threshold_setting)
def test_validate_stats(self):
# breakpoint()
threshold = self.get_threshold()
....
That way, we can set the threshold as a setting, or override get_threshold
to have a more dynamic way to get it (for example, get the number of items in the last job).
I agree with the default value, we should raise NotConfigured
if the threshold is not set.
Problem
A huge number of monitors follows the same pattern: we have one spider execution statistic and want to validate against a desired threshold. If we look the code of some built-in monitors we can see that the code of them are almost the same (for example
ErrorCountMonitor
,ItemCountMonitor
,WarningCountMonitor
and some others).Adding a new monitor is not too complicated, but we end with a lot of almost equal code that reduces the readability of the code and make it possible to introduce errors (because sometimes we do a lot of copy-paste to create our monitors).
Solution
Add a new base monitor class that allows us to create new monitors based on a value in the jobs statistics just defining a new class and a few class arguments. Doing that, we can reduce the amount of code we need to create to add a simple monitor like mentioned before.
Draft of solution
Creating a base stats monitor that allows us to create a new custom monitor with 5 lines of code: