scrapinghub / spidermon

Scrapy Extension for monitoring spiders execution.
https://spidermon.readthedocs.io
BSD 3-Clause "New" or "Revised" License
530 stars 96 forks source link

Add tests for item validation pipeline #414

Closed VMRuiz closed 1 year ago

VMRuiz commented 1 year ago

The current spidermon behavior for adding _validation field to items seems to be a bit controversial: https://github.com/scrapinghub/spidermon/issues/379

To make easier to track changes, I have implemented some additional tests that covers the current logic.

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage has no change and project coverage change: +0.12% :tada:

Comparison is base (a75cf97) 78.52% compared to head (1347209) 78.64%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #414 +/- ## ========================================== + Coverage 78.52% 78.64% +0.12% ========================================== Files 73 73 Lines 3152 3152 Branches 528 528 ========================================== + Hits 2475 2479 +4 + Misses 600 598 -2 + Partials 77 75 -2 ``` [see 2 files with indirect coverage changes](https://app.codecov.io/gh/scrapinghub/spidermon/pull/414/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=scrapinghub)

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

VMRuiz commented 1 year ago

The error raised in these lines https://github.com/scrapinghub/spidermon/pull/414/files#diff-877bfbf0203e928b3c50d4dadb782369861cf83240c9d0721ed1f7dd791a6406R140-R145 may be a bug.

Spidermon should allow setting the default value in classes that supports using additional fields like dicts but that doesn't support __setitem__ method. I can fix it by using setattr(item, self.errors_field, defaultdict(list) instead of item[self.errors_field] = defaultdict(list). However I prefer to merge this first and fix it in a new PR.

Also, it should check that data[self.errors_field] is not None to cover the cases where the field was defined but was given a None value as default.