scrapy / itemloaders

Library to populate items using XPath and CSS with a convenient API
BSD 3-Clause "New" or "Revised" License

Allow None values in Itemloaders/Items #40

Open nikchha opened 3 years ago

nikchha commented 3 years ago

Summary

I would like to pass None values to ItemLoader() and have them stored in the Item(). Right now, None values are discarded, so fields that were loaded as None are missing from the resulting Item().

Motivation

Some values are not available on every parsed page. When the Selector returns None, the field is dropped, and the database pipeline (Postgres) then fails with a KeyError: 'fieldname'.

I worked around this by filling in a null string that is later changed to None, but this seems like a hacky solution.
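To make the failure mode and the workaround concrete, here is a minimal sketch that uses a plain dict as a stand-in for the loaded item. The real pipeline code is not shown in this issue, so the names NULL_SENTINEL, restore_none, and row_for_insert, and the sentinel value "null", are all assumptions made for illustration.

```python
# Hypothetical placeholder string used to carry "no value" through the loader.
NULL_SENTINEL = "null"


def restore_none(item):
    """Return a copy of the item with sentinel strings turned back into None."""
    return {
        key: (None if value == NULL_SENTINEL else value)
        for key, value in item.items()
    }


def row_for_insert(item, columns):
    """Build the tuple of column values that a database INSERT would need."""
    # Direct indexing raises KeyError when a field is absent from the item.
    return tuple(item[column] for column in columns)


columns = ("title", "price")

# The loader discarded the missing price, so the key is absent entirely.
loaded = {"title": "Widget"}
try:
    row_for_insert(loaded, columns)
except KeyError as exc:
    print("missing field:", exc)

# Workaround: load a sentinel instead of None, then convert it in the pipeline.
loaded_with_sentinel = {"title": "Widget", "price": NULL_SENTINEL}
row = row_for_insert(restore_none(loaded_with_sentinel), columns)
```

For dict-like items, reading with item.get(column) instead of item[column] would also avoid the KeyError without a sentinel, at the cost of no longer distinguishing "field missing" from "field is None".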

nyov commented 3 years ago

Hey, this has been discussed in the past, as I recall. See https://github.com/scrapy/scrapy/pull/556 for that discussion. Ultimately the decision was for None values not to be kept by the item loader. But you can restore that possibility by using a custom loader like this:

https://github.com/nyov/scrapyext/blob/2dd5e0fc03f8e4b8793b808744d4dd6452e5d5b3/scrapyext/loader.py#L19-L27

Beware, this is old code that I have yet to update. All you really need to do is remove the following line in the current codebase:

https://github.com/scrapy/itemloaders/blob/951e9edf2e52620db0338a4edb9015352356abc5/itemloaders/__init__.py#L264

Or we could try to overturn the old decision, now that some water has passed under the bridge (evil laugh).

ejulio commented 3 years ago

Indeed, I'm in favor of having a flag or a specialized ItemLoader for this behavior. I think it's weird to call loader.add_value('field', None) and not have the field in the output. Even though None is the absence of a value, it is still a value itself.

nyov commented 3 years ago

I don't even know why that wasn't considered back then. But that's exactly what we should add, I think. Either a documented NoneValueItemLoader subclass or a flag such as ItemLoader(item, nonevalues=True) should work just fine?
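The two proposals can be compared with a toy model. The sketch below is not the real itemloaders.ItemLoader and does not use its internals: ToyLoader and NoneValueToyLoader are hypothetical stand-ins written with the standard library only, and the nonevalues flag exists only as a proposal in this thread.

```python
class ToyLoader:
    """Toy stand-in for an item loader; NOT the real itemloaders.ItemLoader."""

    def __init__(self, item=None, nonevalues=False):
        self.item = {} if item is None else item
        self.nonevalues = nonevalues  # hypothetical flag from the proposal
        self._values = {}

    def add_value(self, field_name, value):
        # Default behaviour modelled here: None is silently dropped.
        if value is None and not self.nonevalues:
            return
        self._values.setdefault(field_name, []).append(value)

    def load_item(self):
        # Keep the first collected value for each field.
        for field_name, values in self._values.items():
            self.item[field_name] = values[0]
        return self.item


class NoneValueToyLoader(ToyLoader):
    """Subclass variant of the same idea: always keep None."""

    def __init__(self, item=None):
        super().__init__(item, nonevalues=True)


def load(loader):
    """Feed the same two values into any of the toy loaders."""
    loader.add_value("title", "Widget")
    loader.add_value("price", None)
    return loader.load_item()


dropped = load(ToyLoader())                 # "price" key is absent
kept = load(ToyLoader(nonevalues=True))     # "price" key is present with None
kept_subclass = load(NoneValueToyLoader())  # same result via the subclass
```

In this model both variants produce the same item, so the choice between them is mainly about API surface: a flag keeps a single class, while a subclass leaves the default constructor signature untouched.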

arkadybag commented 3 years ago

Any updates on this?

AmericanY commented 3 years ago

I'm struggling with the same issue! Any updates?