scrapinghub / spidermon

Scrapy Extension for monitoring spiders execution.
https://spidermon.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Built-in periodic monitor to check if the amount of items increases #324

Closed curita closed 1 year ago

curita commented 2 years ago

Spidermon could have a built-in periodic monitor that validates that the number of items (tracked in the stat item_scraped_count) increases between consecutive checks.

This would alert us to jobs that are stuck without returning any new items but aren't necessarily raising any errors.

A threshold could specify the percentage by which the item count is expected to increase, which would also alert us to jobs that are running slower than expected. I imagine two possible settings: an absolute one (checking that there are at least x new items after every check) and a relative one (checking that the item count increased by at least y%), with the option of defining either or both.
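
To make the two thresholds concrete, here is a minimal sketch of the comparison such a monitor might run on each check. It is plain Python and does not use Spidermon's API; the function name `item_count_increased` and the parameters `min_absolute_increase` and `min_relative_increase` are hypothetical, and the handling of a previous count of zero is an assumption rather than anything settled here.

```python
from typing import Optional


def item_count_increased(
    previous_count: int,
    current_count: int,
    min_absolute_increase: Optional[int] = None,   # hypothetical setting: at least x new items
    min_relative_increase: Optional[float] = None,  # hypothetical setting: 0.05 means at least 5%
) -> bool:
    """Return True if item_scraped_count grew enough since the previous check."""
    increase = current_count - previous_count

    # With no thresholds configured, only require that the count increased at all.
    if min_absolute_increase is None and min_relative_increase is None:
        return increase > 0

    # Absolute threshold: at least x new items since the previous check.
    if min_absolute_increase is not None and increase < min_absolute_increase:
        return False

    # Relative threshold: growth of at least y relative to the previous count.
    if min_relative_increase is not None:
        if previous_count == 0:
            # Assumption: relative growth is undefined from zero,
            # so fall back to requiring at least one new item.
            if increase <= 0:
                return False
        elif increase / previous_count < min_relative_increase:
            return False

    return True
```

In an actual periodic monitor, `previous_count` would have to be remembered between runs and `current_count` read from the `item_scraped_count` stat; where that state is kept is left open in this sketch.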

We have implemented something similar in one of our projects [private], which could be used as a starting point after stripping out the project-specific code.