Closed proway2 closed 3 months ago
This is a limitation on the Item definition as it don't support adding new fields.
In theory, using slots=False
should work according to the attr dosc, however it didn't work for me.
Alternatively, adding the field _validation
to the class should remove the error, however, you will get _validation
automatically populated to None on each item which is not the best solution either.
On my experience I had to just convert from Attrs class to dict
before spidermon validaton.
@VMRuiz I had a hope that with this PR we wouldn't need this conversion-to-dict pipeline anymore.
That class is just an interface to simplify read and write access to the different types of Item classes. It doesn't add any additional functionality.
I don't know if we could overcharge it to support additional fields not defined on the original item class.
Maybe we can start by capturing the exception and raising a warning instead.
cc. @Gallaecio @rennerocha
Alternatively, adding the field
_validation
to the class should remove the error
I think this is the way to go. You either disable the creation of this _validation
field, if Spidermon allows for that, or you add the field to your item class.
@Gallaecio Shall we close this ticket as wont fix or similar?
Well, I think there might be an action here even if we go with https://github.com/scrapinghub/spidermon/issues/379#issuecomment-1385349452.
_validation
field cannot be disabled without disabling validation itself (I don’t know if that is the case), then that’s something to support._validation
to an item class that does not support defining unknown fields, and the option to disable the generation of that field instead.This can be turn off with SPIDERMON_VALIDATION_ADD_ERRORS_TO_ITEMS
- https://spidermon.readthedocs.io/en/latest/item-validation.html#spidermon-validation-add-errors-to-items
We may expand the docs to explicitly state that if enabled, it will only work on items that have defined the _validation
field or that support adding undefined fields, like dicts
.
Edit: Additionally, SPIDERMON_VALIDATION_ERRORS_FIELD
can be use to replace the _validation
field with any other field name.
This problem was added by https://github.com/scrapinghub/spidermon/pull/358 and it is a backward incompatible change that we didn't inform our users that it was introduced. So we need to do something about it. Probably one new release that indicates that _validation
field will only work if we are dealing with dict
item.
We already have this situation in Scrapy related to files
and files_urls
field for the media pipelines (https://docs.scrapy.org/en/latest/topics/media-pipeline.html#usage-example). So adding the information that we need to specify _validation
should be enough. However, I believe that we should put as a warning, not just in the middle of the text.
Ubuntu 22.04, Python 3.10.9, Spidermon 1.17.1
If an item defined with
attrs
library and this condition https://github.com/scrapinghub/spidermon/blob/master/spidermon/contrib/scrapy/pipelines.py#L140 isTrue
, then it fails here https://github.com/scrapinghub/spidermon/blob/master/spidermon/contrib/scrapy/pipelines.py#L141 with this message: