scrapinghub / scrapy-poet

Page Object pattern for Scrapy
BSD 3-Clause "New" or "Revised" License

scrapy shell doesn't work with InjectionMiddleware #140

Closed by wRAR 1 year ago

wRAR commented 1 year ago

Currently using scrapy shell to fetch a URL causes the following exception:

  File ".env/lib/python3.10/site-packages/scrapy/shell.py", line 119, in fetch
    response, spider = threads.blockingCallFromThread(
  File ".env/lib/python3.10/site-packages/twisted/internet/threads.py", line 120, in blockingCallFromThread
    result.raiseException()
  File ".env/lib/python3.10/site-packages/twisted/python/failure.py", line 475, in raiseException
    raise self.value.with_traceback(self.tb)
TypeError: issubclass() arg 1 must be a class

This hides the real traceback, which points to https://github.com/scrapinghub/scrapy-poet/blob/ee1357fe84465840c7d6ec1b6a0add8e622611a7/scrapy_poet/injection.py#L317. That line calls issubclass(first_parameter.annotation, DummyResponse), where first_parameter is the first argument of the callback. However, scrapy shell uses Deferred.callback as the request callback (https://github.com/scrapy/scrapy/blob/98571eb946e24edfe5b520c0478e72b695d09a9d/scrapy/shell.py#L207), and its first argument is annotated as Union[_DeferredResultT, Failure]. That annotation is a typing construct rather than a class, so issubclass() raises the TypeError above. There are probably multiple ways to fix this.
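One possible direction, shown only as a minimal sketch and not as the fix that was actually merged: check that the annotation is a class before passing it to issubclass(). The Response, DummyResponse and Failure classes below are local stand-ins for the real Scrapy, scrapy-poet and Twisted ones, and is_dummy_annotation is a hypothetical helper name.

```python
import inspect
from typing import Union


# Stand-ins for scrapy.http.Response, scrapy_poet.DummyResponse and
# twisted.python.failure.Failure (assumptions, for illustration only).
class Response: ...
class DummyResponse(Response): ...
class Failure: ...


def callback_typed(response: DummyResponse):
    """A spider-style callback whose first argument is a DummyResponse."""


def callback_union(result: Union[int, Failure]):
    """Mimics Deferred.callback, whose first argument is a Union."""


def is_dummy_annotation(callback) -> bool:
    """Return True only if the first parameter is annotated with a
    DummyResponse subclass. Hypothetical helper, not scrapy-poet API."""
    params = list(inspect.signature(callback).parameters.values())
    if not params:
        return False
    annotation = params[0].annotation
    # Union[...] (and a missing annotation) is not a class, so guard
    # before calling issubclass() to avoid the TypeError.
    return inspect.isclass(annotation) and issubclass(annotation, DummyResponse)


typed_result = is_dummy_annotation(callback_typed)
union_result = is_dummy_annotation(callback_union)
```

With this guard, a callback annotated like Deferred.callback is simply treated as not requesting a DummyResponse instead of raising. Whether that is the right behaviour for scrapy shell is a design decision for the maintainers.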

wRAR commented 1 year ago

As this is not the first time scrapy shell has been broken by scrapy-poet, it makes sense to add a test for it, probably similar to https://github.com/scrapy/scrapy/blob/master/tests/test_command_shell.py but simplified to one or two test cases.