lopuhin / scrapy-pyppeteer

Use pyppeteer from a Scrapy spider

type object 'Deferred' has no attribute 'fromFuture' #3

Closed. denny64 closed this issue 5 years ago

denny64 commented 5 years ago

This works fine locally, but I get this error when running the spider on Scrapinghub. Any ideas?

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/twisted/internet/defer.py", line 1299, in _inlineCallbacks
    result = g.send(result)
  File "/app/python/lib/python3.6/site-packages/scrapy/core/downloader/middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "/app/python/lib/python3.6/site-packages/scrapy_pyppeteer/middleware.py", line 31, in process_request
    return _aio_as_deferred(self.process_browser_request(request))
  File "/app/python/lib/python3.6/site-packages/scrapy_pyppeteer/middleware.py", line 62, in _aio_as_deferred
    return Deferred.fromFuture(asyncio.ensure_future(f))
AttributeError: type object 'Deferred' has no attribute 'fromFuture'

lopuhin commented 5 years ago

@denny64 thanks for the bug report. I haven't tried running this on Scrapy Cloud. Are you using the custom Scrapy fork, as required? I also wonder whether the Twisted version matters here; I used Twisted==18.9.0.

denny64 commented 5 years ago

@lopuhin Maybe I am not using the custom Scrapy fork correctly. How should it be imported?

lopuhin commented 5 years ago

@denny64 the fork must be installed; imports don't need to change. You can install it with pip install git+https://github.com/lopuhin/scrapy.git@async-def-parse. This can probably be done without a custom Docker image, although a custom image might be required for Chrome. This particular error is probably caused by a Twisted version mismatch, so it may be worth pinning Twisted as well.
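
One way to confirm a version mismatch before deploying is to check whether the installed Twisted exposes Deferred.fromFuture at all. The following is only a minimal sketch: the helper name has_from_future is hypothetical and not part of scrapy-pyppeteer, and it returns None when the module cannot be imported, so it can run in an environment without Twisted.

```python
import importlib


def has_from_future(module_name="twisted.internet.defer", class_name="Deferred"):
    """Report whether the named class has a fromFuture attribute.

    Returns True or False when the module imports, and None when it does not
    (for example, when Twisted is not installed in this environment).
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    cls = getattr(module, class_name, None)
    return cls is not None and hasattr(cls, "fromFuture")


if __name__ == "__main__":
    # Hypothetical pre-deploy check: False here would match the AttributeError above.
    print("Deferred.fromFuture available:", has_from_future())
```

If this prints False in the deployment environment but True locally, the two environments are most likely running different Twisted versions, and pinning the version that works locally (Twisted==18.9.0 in this thread) is the obvious thing to try.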

denny64 commented 5 years ago

Thanks @lopuhin, specifying the Twisted version fixed it. I ran into another issue, though, which only happens on Scrapy Cloud:

Traceback (most recent call last):
  File "/app/python/lib/python3.6/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/app/python/lib/python3.6/site-packages/twisted/python/failure.py", line 491, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/app/python/lib/python3.6/site-packages/scrapy/core/downloader/middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "/app/python/lib/python3.6/site-packages/twisted/internet/defer.py", line 824, in adapt
    extracted = result.result()
  File "/app/python/lib/python3.6/site-packages/scrapy_pyppeteer/middleware.py", line 37, in process_browser_request
    self._browser = await pyppeteer.launch(**self._launch_options)
  File "/app/python/lib/python3.6/site-packages/pyppeteer/launcher.py", line 330, in launch
    return await Launcher(options, **kwargs).launch()
  File "/app/python/lib/python3.6/site-packages/pyppeteer/launcher.py", line 174, in launch
    self.browserWSEndpoint = self._get_ws_endpoint()
  File "/app/python/lib/python3.6/site-packages/pyppeteer/launcher.py", line 219, in _get_ws_endpoint
    self.proc.stdout.read().decode()
pyppeteer.errors.BrowserError: Browser closed unexpectedly:
/scrapinghub/.local/share/pyppeteer/local-chromium/588429/chrome-linux/chrome: error while loading shared libraries: libX11-xcb.so.1: cannot open shared object file: No such file or directory

Have you seen this before?

lopuhin commented 5 years ago

Nice 👍

This error is likely due to missing system libraries required to run Chrome. I'm afraid the only way around it is to use a custom base Docker image.
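
To see which shared libraries are missing before building an image, one option is to run ldd against the downloaded Chromium binary and collect the entries it reports as "not found". This is a rough sketch assuming a Linux environment where ldd is available; the function names parse_missing_libraries and missing_libraries are hypothetical helpers, not part of pyppeteer or scrapy-pyppeteer.

```python
import subprocess


def parse_missing_libraries(ldd_output):
    """Return the library names that ldd output marks as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "libX11-xcb.so.1 => not found"
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found":
            missing.append(name)
    return missing


def missing_libraries(binary_path):
    """Run ldd on binary_path (Linux only) and list unresolved libraries."""
    result = subprocess.run(
        ["ldd", binary_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_missing_libraries(result.stdout)
```

Calling missing_libraries on the Chromium path from the traceback (/scrapinghub/.local/share/pyppeteer/local-chromium/588429/chrome-linux/chrome) should list libX11-xcb.so.1 along with any other unresolved libraries, which gives a starting point for the packages the custom image needs to install.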

denny64 commented 5 years ago

@lopuhin Would you happen to have a demo Scrapy starter project that works on Scrapy Cloud?

lopuhin commented 5 years ago

@denny64 sorry, I don't have one. That would be a good next step for the project; thanks for bringing this up.