scrapinghub / web-poet

Web scraping Page Objects core library
https://web-poet.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

Returns doesn't work in some subclasses #182

wRAR closed this issue 1 year ago

wRAR commented 1 year ago

Similar to https://github.com/zytedata/zyte-common-items/issues/49: even though the code is different, the lookup here is also not recursive, and it fails on, e.g., this test:

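from typing import Generic, TypeVar

import attrs
from web_poet import ItemPage, field


def test_returns_inheritance() -> None:
    @attrs.define
    class MyItem:
        name: str

    # The item class is declared on the base page only.
    class BasePage(ItemPage[MyItem]):
        @field
        def name(self):
            return "hello"

    MetadataT = TypeVar("MetadataT")

    # An unrelated generic mixin that does not descend from Returns.
    class HasMetadata(Generic[MetadataT]):
        pass

    class DummyMetadata:
        pass

    # The only parametrized direct base of Page is HasMetadata[DummyMetadata].
    class Page(BasePage, HasMetadata[DummyMetadata]):
        pass

    page = Page()
    assert page.item_cls is MyItem

wRAR commented 1 year ago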

The issues are different, though: the zyte-common-items code looks for HasMetadata and doesn't find it, while the web-poet code just looks for any GenericAlias and, in this test case, finds the wrong one.
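To illustrate the failure mode described above, here is a minimal sketch that needs only the standard library. It is not web-poet's code: Returns, ItemPage and HasMetadata below are simplified stand-ins, and naive_item_cls is a hypothetical helper that takes the first parametrized generic it sees among the direct bases.

```python
from typing import Generic, TypeVar, get_args, get_origin

ItemT = TypeVar("ItemT")
MetadataT = TypeVar("MetadataT")


# Simplified stand-ins for the web-poet classes (assumption, not the real ones).
class Returns(Generic[ItemT]):
    pass


class ItemPage(Returns[ItemT]):
    pass


class HasMetadata(Generic[MetadataT]):
    pass


class MyItem:
    pass


class DummyMetadata:
    pass


class BasePage(ItemPage[MyItem]):
    pass


class Page(BasePage, HasMetadata[DummyMetadata]):
    pass


def naive_item_cls(cls):
    """Hypothetical lookup: first type argument of the first
    parametrized generic among the direct bases, with no recursion
    and no check of which generic class it is."""
    for base in getattr(cls, "__orig_bases__", ()):
        if get_origin(base) is not None:
            return get_args(base)[0]
    return None


# BasePage works because its only direct base is ItemPage[MyItem].
print(naive_item_cls(BasePage))
# Page picks up HasMetadata[DummyMetadata], since BasePage itself is a
# plain class and carries no type arguments in Page.__orig_bases__.
print(naive_item_cls(Page))
```

In this sketch the lookup on Page yields DummyMetadata instead of MyItem, which matches the symptom in the test above.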

wRAR commented 1 year ago

Not sure what the correct implementation is here, since the current code handles all kinds of generic base classes, including ItemPage, WebPage, Returns, Extractor, etc. All of them descend from Returns, though, so maybe we should check the class with isinstance rather than for equality.
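One possible shape of that idea, as a sketch only and not the fix that was actually merged: walk the bases recursively and accept a parametrized generic only when its origin class descends from Returns. Because the origin of a generic alias is itself a class, the sketch uses issubclass for the check. Returns, ItemPage and HasMetadata are again simplified stand-ins, find_item_cls is a hypothetical name, and the sketch assumes the first type argument of any Returns-derived generic is the item type.

```python
from typing import Generic, TypeVar, get_args, get_origin

ItemT = TypeVar("ItemT")
MetadataT = TypeVar("MetadataT")


# Simplified stand-ins for the web-poet classes (assumption, not the real ones).
class Returns(Generic[ItemT]):
    pass


class ItemPage(Returns[ItemT]):
    pass


class HasMetadata(Generic[MetadataT]):
    pass


class MyItem:
    pass


class DummyMetadata:
    pass


class BasePage(ItemPage[MyItem]):
    pass


class Page(BasePage, HasMetadata[DummyMetadata]):
    pass


class ReversedPage(HasMetadata[DummyMetadata], BasePage):
    pass


def find_item_cls(cls):
    """Hypothetical recursive lookup of the item class."""
    # Prefer the class's own __orig_bases__; fall back to plain __bases__
    # so an inherited __orig_bases__ is never mistaken for the class's own.
    bases = cls.__dict__.get("__orig_bases__", cls.__bases__)
    for base in bases:
        origin = get_origin(base)
        if origin is None:
            # Plain class: the type argument may be declared further up.
            found = find_item_cls(base)
            if found is not None:
                return found
        elif issubclass(origin, Returns):
            args = get_args(base)
            # Skip unresolved parameters such as Returns[ItemT].
            if args and not isinstance(args[0], TypeVar):
                return args[0]
        # Any other generic (e.g. HasMetadata[...]) is ignored.
    return None


print(find_item_cls(Page))
print(find_item_cls(ReversedPage))
```

With these stand-ins both Page and ReversedPage resolve to MyItem, regardless of where the unrelated generic mixin appears in the base list.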