scrapinghub / scrapy-poet

Page Object pattern for Scrapy
BSD 3-Clause "New" or "Revised" License

Auto-populate fields on Item Overrides #168

Closed BurnzZ closed 8 months ago

BurnzZ commented 1 year ago

Now that https://github.com/scrapinghub/scrapy-poet/pull/164 has been merged, a Page Object can override items that come from providers (e.g. from scrapy-zyte-api's ZyteApiProvider):

import attrs
from zyte_common_items import Product
from web_poet import handle_urls, WebPage, field

@handle_urls("example.com")
@attrs.define 
class ProductPage(WebPage[Product]):
    product: Product

    @field
    def name(self) -> str:
        return f"(modified) {self.product.name}"

The common use case for this is fixing or modifying fields. However, if the item in question has a lot of fields, the Page Object code can become cluttered:

@handle_urls("example.com")
@attrs.define 
class ProductPage(WebPage[Product]):
    product: Product

    @field
    def name(self) -> str:
        return f"(modified) {self.product.name}"

    @field
    def brand(self) -> str:
        return self.product.brand

    @field
    def color(self) -> str:
        return self.product.color

    # and so on for the rest of the unmodified fields ...

We should provide or document an easy way to do this. It would be great to have a built-in mechanism so that we don't have to repeat every unmodified field with boilerplate code like this.
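To make the request concrete, here is a minimal, library-independent sketch of the desired behaviour: copy every field from the provider's item and recompute only the fields the page explicitly overrides. It uses only standard-library dataclasses. The `Product` class below is a hypothetical stand-in for `zyte_common_items.Product`, and `OverridingPage` is a name invented for illustration; neither is part of web-poet, scrapy-poet, or zyte-common-items.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass
class Product:
    """Hypothetical stand-in for zyte_common_items.Product."""
    name: Optional[str] = None
    brand: Optional[str] = None
    color: Optional[str] = None


class OverridingPage:
    """Hypothetical base class (an assumption, not a real web-poet API).

    Any method on the subclass that is named after an item field is
    treated as an override; all other fields are copied unchanged.
    """

    def __init__(self, product: Product):
        self.product = product

    def to_item(self) -> Product:
        # Collect values only for fields the page class defines a method for.
        overrides = {
            f.name: getattr(self, f.name)()
            for f in fields(self.product)
            if callable(getattr(type(self), f.name, None))
        }
        # replace() copies the original item and swaps in the overrides.
        return replace(self.product, **overrides)


class ProductPage(OverridingPage):
    # Only the modified field is written out; brand and color pass through.
    def name(self) -> str:
        return f"(modified) {self.product.name}"


item = ProductPage(Product(name="Shirt", brand="Acme", color="red")).to_item()
print(item)
```

In this sketch `ProductPage` declares only `name`, and `brand` and `color` are carried over from the input item without any per-field code. A real implementation would have to work with web-poet's `@field` machinery rather than plain methods, which this sketch does not attempt.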

An idea:

kmike commented 1 year ago

It might also belong to zyte-common-items repo.

kmike commented 1 year ago

Or to scrapy-zyte-api :)

Gallaecio commented 8 months ago

Solved by https://github.com/zytedata/zyte-common-items/pull/63.