scrapy / itemloaders

Library to populate items using XPath and CSS with a convenient API
BSD 3-Clause "New" or "Revised" License

The re-introduction of nested item support caused a significant performance degradation #50

Closed. Gallaecio closed this issue 10 months ago.

Gallaecio commented 2 years ago

I have a CPU-bound Scrapy project that becomes 50% slower after https://github.com/scrapy/itemloaders/pull/29.

I believe the problem is that is_item is not a cheap call, and it will potentially become more expensive as itemadapter extends support to additional types.

I think we may be able to reimplement the function affected by that change so that it does not call is_item at all, but still does not treat item-like objects as sequences.
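
To illustrate the idea, here is a minimal sketch of one way to do it. It is not the fix that was adopted, and the names to_iterable and _MULTI_VALUE_TYPES are made up for this example. It assumes that only a small allowlist of built-in container types should be treated as "many values", so a single isinstance check replaces the is_item call and dicts or other mapping-style items are never unpacked.

```python
from types import GeneratorType

# Hypothetical allowlist: only these types are treated as collections of values.
# Anything else (dict, str, bytes, item-like objects) counts as a single value.
_MULTI_VALUE_TYPES = (list, tuple, set, frozenset, GeneratorType)


def to_iterable(arg):
    """Return an iterable of values without ever calling is_item().

    None becomes an empty list, allowlisted containers are returned as-is,
    and every other object is wrapped in a one-element list.
    """
    if arg is None:
        return []
    if isinstance(arg, _MULTI_VALUE_TYPES):
        return arg
    return [arg]


if __name__ == "__main__":
    print(to_iterable(None))
    print(to_iterable({"name": "x"}))  # a dict-based item stays a single value
    print(to_iterable(["a", "b"]))
```

The trade-off is that a custom iterable outside the allowlist would be wrapped rather than iterated, so whether this is acceptable depends on which input types the loaders need to support.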