Works fine locally, but getting this error when running the spider on scrapinghub. Any ideas?
@denny64 thanks for the bug report. I haven't tried running this on Scrapy Cloud. Are you using the custom Scrapy fork, as required? Also, I wonder whether the Twisted version matters here; I used Twisted==18.9.0.
@lopuhin Maybe I am not using the custom Scrapy fork correctly. How should it be imported?
@denny64 it just needs to be installed; imports don't need to be changed. It can be installed with pip install git+https://github.com/lopuhin/scrapy.git@async-def-parse. On Scrapy Cloud this can probably be done without a custom Docker image, although one might be required for Chrome. This particular error probably comes from a Twisted version mismatch, so it may be worth pinning Twisted as well (see the sketch below).
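To illustrate, a requirements.txt along these lines should pin both the fork and Twisted for the Scrapy Cloud deploy (this is only a sketch based on the versions mentioned above; adjust it to your project):

```
# requirements.txt (sketch): pin the custom Scrapy fork and Twisted together
git+https://github.com/lopuhin/scrapy.git@async-def-parse
Twisted==18.9.0
# ...plus scrapy-pyppeteer itself and the rest of your dependencies
```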
Thanks @lopuhin, specifying the Twisted version fixed it. I ran into another issue, though (it only happens on Scrapy Cloud):
```
Traceback (most recent call last):
  File "/app/python/lib/python3.6/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/app/python/lib/python3.6/site-packages/twisted/python/failure.py", line 491, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/app/python/lib/python3.6/site-packages/scrapy/core/downloader/middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "/app/python/lib/python3.6/site-packages/twisted/internet/defer.py", line 824, in adapt
    extracted = result.result()
  File "/app/python/lib/python3.6/site-packages/scrapy_pyppeteer/middleware.py", line 37, in process_browser_request
    self._browser = await pyppeteer.launch(**self._launch_options)
  File "/app/python/lib/python3.6/site-packages/pyppeteer/launcher.py", line 330, in launch
    return await Launcher(options, **kwargs).launch()
  File "/app/python/lib/python3.6/site-packages/pyppeteer/launcher.py", line 174, in launch
    self.browserWSEndpoint = self._get_ws_endpoint()
  File "/app/python/lib/python3.6/site-packages/pyppeteer/launcher.py", line 219, in _get_ws_endpoint
    self.proc.stdout.read().decode()
pyppeteer.errors.BrowserError: Browser closed unexpectedly:
/scrapinghub/.local/share/pyppeteer/local-chromium/588429/chrome-linux/chrome: error while loading shared libraries: libX11-xcb.so.1: cannot open shared object file: No such file or directory
```
Have you seen this before?
Nice 👍
This error is likely due to missing system libraries required to run Chrome; I'm afraid the only way around it is to use a custom base Docker image (a rough sketch follows).
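As an illustration only (untested; the base image tag and the package list are assumptions, the packages being the usual headless Chromium runtime dependencies on Debian-based images):

```dockerfile
# Dockerfile (sketch, untested): extend a Debian-based Scrapy Cloud stack image
# and install the shared libraries that headless Chromium needs.
FROM scrapinghub/scrapinghub-stack-scrapy:1.6-py3

# libx11-xcb1 provides the libX11-xcb.so.1 mentioned in the traceback above;
# the rest are common Chromium runtime dependencies on Debian/Ubuntu.
RUN apt-get update && apt-get install -y --no-install-recommends \
        libx11-xcb1 libxcomposite1 libxcursor1 libxdamage1 libxext6 \
        libxi6 libxtst6 libnss3 libcups2 libxss1 libxrandr2 \
        libasound2 libatk1.0-0 libgtk-3-0 fonts-liberation \
    && rm -rf /var/lib/apt/lists/*

# Install the project requirements (including the pinned fork and Twisted).
COPY requirements.txt /app/requirements.txt
RUN pip install --no-cache-dir -r /app/requirements.txt
```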
@lopuhin Would you happen to have any Scrapy demo starter kit that works on Scrapy Cloud?
@denny64 sorry, I don't have one; that would be a good next step for the project. Thanks for bringing this up.
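For anyone trying this, with shub's custom-image workflow the deploy config would presumably look roughly like the following (the project id is a placeholder, and the exact keys may differ depending on your shub version):

```yaml
# scrapinghub.yml (sketch): deploy using the custom Dockerfile above via shub
projects:
  default: 12345   # placeholder project id
image: true        # build and push a custom image instead of the stock stack
```

Then something like shub image upload should build, push and deploy it.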