scrapinghub / spidermon

Scrapy Extension for monitoring spiders execution.
https://spidermon.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Error when threshold values are set via ScrapyCloud settings #374

Closed marcosmadr closed 1 year ago

marcosmadr commented 1 year ago

The following error occurs when the spider is running on ScrapyCloud and the monitor threshold value is set via ScrapyCloud settings. The threshold retrieved from ScrapyCloud settings is a string, while the value retrieved from the job stats is an integer, so the comparison fails.

======================================================================
ERROR: Extracted Items Monitor/test_stat_monitor
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/app/python/lib/python3.10/site-packages/spidermon/contrib/scrapy/monitors.py", line 144, in test_stat_monitor
    assertion_method(
  File "/usr/local/lib/python3.10/unittest/case.py", line 1248, in assertGreaterEqual
    if not a >= b:
TypeError: '>=' not supported between instances of 'int' and 'str'

If the monitor threshold is set only in the code (and not in ScrapyCloud), the value is retrieved as an integer/float and no error is raised.
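For anyone who wants to see the failure in isolation, here is a minimal sketch that uses only the standard library. It does not use spidermon itself; `item_count` and `check` are hypothetical names standing in for the job stat and the monitor's assertion call.

```python
import unittest

# TestCase can be instantiated directly just to borrow its assert methods.
case = unittest.TestCase()

# Hypothetical stat value, standing in for an integer read from the job stats.
item_count = 120


def check(threshold):
    """Run the same kind of assertion the traceback shows and report the outcome."""
    try:
        case.assertGreaterEqual(item_count, threshold)
    except TypeError as exc:
        return f"TypeError: {exc}"
    return "ok"


# Threshold defined in code: an int, so the comparison works.
int_result = check(100)

# Threshold as it arrives from a string-valued setting: the comparison fails.
str_result = check("100")

print(int_result)
print(str_result)
```

The second call fails with the same `TypeError` as in the traceback, because Python 3 does not order `int` against `str`.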

It seems to have started when this PR was merged.

The threshold value is now read with get(), whereas it was read with getint() before. This affects only monitors that inherit from the BaseStatMonitor class and don't implement a custom get_threshold() method.
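As a possible workaround until a fix lands, a monitor could coerce the value in its own get_threshold(). The sketch below shows only the coercion step and is not spidermon's actual code: `FakeSettings`, `coerce_threshold`, and the setting name `EXAMPLE_THRESHOLD` are all hypothetical, and `FakeSettings` imitates only a plain `get()` lookup.

```python
class FakeSettings:
    """Hypothetical stand-in for a settings object; only get() is imitated."""

    def __init__(self, values):
        self._values = dict(values)

    def get(self, name, default=None):
        # Returns the stored value unchanged, so a string stays a string.
        return self._values.get(name, default)


def coerce_threshold(raw):
    """Convert a threshold that may arrive as a string into an int or float."""
    if isinstance(raw, (int, float)):
        return raw
    text = str(raw).strip()
    try:
        return int(text)
    except ValueError:
        # Not an integer literal; fall back to float (raises if not numeric).
        return float(text)


# Value defined in code versus value as delivered by a string-only settings UI.
code_settings = FakeSettings({"EXAMPLE_THRESHOLD": 100})
cloud_settings = FakeSettings({"EXAMPLE_THRESHOLD": "100"})

from_code = coerce_threshold(code_settings.get("EXAMPLE_THRESHOLD"))
from_cloud = coerce_threshold(cloud_settings.get("EXAMPLE_THRESHOLD"))
ratio = coerce_threshold("0.5")

print(from_code, from_cloud, ratio)
```

Trying `int` first and falling back to `float` keeps integer thresholds as integers while still accepting ratio-style thresholds, which a plain getint()-style conversion would not.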

mrwbarg commented 1 year ago

Looking into this.