michelts opened this issue 4 years ago (Open)
Just to note, the exception I get from Scrapy is:
Traceback (most recent call last):
File ".../lib/python3.6/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
result = g.send(result)
File ".../lib/python3.6/site-packages/scrapy/core/downloader/middleware.py", line 38, in process_request
response = yield method(request=request, spider=spider)
File ".../lib/python3.6/site-packages/scrapy_selenium/middlewares.py", line 115, in process_request
request.wait_until
File ".../lib/python3.6/site-packages/selenium/webdriver/support/wait.py", line 80, in until
raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
I am also wondering how to correctly handle the TimeoutException, so I can still parse the page with Scrapy even if the content doesn't load.
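One way to do that, as a minimal sketch rather than anything the package ships: subclass the Selenium middleware and wrap the wait in a try/except so a partially rendered page is still turned into an HtmlResponse. The class name is made up, and it assumes the stock middleware keeps the webdriver on self.driver and that SeleniumRequest carries wait_time/wait_until, as the traceback above suggests.

```python
from scrapy.http import HtmlResponse
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait

from scrapy_selenium import SeleniumRequest
from scrapy_selenium.middlewares import SeleniumMiddleware


class TolerantSeleniumMiddleware(SeleniumMiddleware):
    """Swallow the wait timeout and return whatever the driver rendered."""

    def process_request(self, request, spider):
        if not isinstance(request, SeleniumRequest):
            return None  # let the regular downloader handle plain requests

        self.driver.get(request.url)

        # Cookie / screenshot / script handling from the stock middleware is
        # omitted here for brevity; copy it over if you rely on it.
        if request.wait_until:
            try:
                WebDriverWait(self.driver, request.wait_time).until(request.wait_until)
            except TimeoutException:
                spider.logger.warning(
                    'wait_until timed out for %s, returning the partial page',
                    request.url,
                )

        # Build the response from whatever is in the browser right now,
        # mirroring what the stock middleware does on success.
        body = self.driver.page_source.encode('utf-8')
        request.meta['driver'] = self.driver
        return HtmlResponse(
            self.driver.current_url,
            body=body,
            encoding='utf-8',
            request=request,
        )
```

You would then register this class in DOWNLOADER_MIDDLEWARES in place of the stock scrapy_selenium.SeleniumMiddleware entry (the module path and priority are up to your project).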
I have the same issue. In my case I want to retry the request that hit a selenium.common.exceptions.TimeoutException, but that also doesn't seem to work: Scrapy doesn't know there was a timeout, so it can't pass a response object to the RetryMiddleware.
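A possible workaround, sketched below with made-up names (the class name, meta key, and retry count are all placeholders): a small downloader middleware whose process_exception catches Selenium's TimeoutException and returns a copy of the request, which Scrapy then reschedules, without touching the built-in RetryMiddleware.

```python
from selenium.common.exceptions import TimeoutException


class RetryOnSeleniumTimeoutMiddleware:
    """Reschedule requests whose Selenium wait timed out."""

    MAX_RETRIES = 2  # assumption: tune to taste

    def process_exception(self, request, exception, spider):
        if not isinstance(exception, TimeoutException):
            return None  # not ours, let other middlewares deal with it

        retries = request.meta.get('selenium_timeout_retries', 0)
        if retries >= self.MAX_RETRIES:
            return None  # give up; the failure propagates to the errback

        spider.logger.info(
            'Retrying %s after Selenium timeout (%d/%d)',
            request.url, retries + 1, self.MAX_RETRIES,
        )
        retry_request = request.copy()
        retry_request.dont_filter = True  # keep the dupefilter from dropping it
        retry_request.meta['selenium_timeout_retries'] = retries + 1
        return retry_request  # Scrapy reschedules the returned request
```

It has to be enabled in DOWNLOADER_MIDDLEWARES alongside the scrapy_selenium middleware; the exact priority number is a guess you would adapt to your project.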
Hi @clemfromspace
I'm using wait_time and wait_until to wait for a page to be rendered, but sometimes the page renders in a way I'm not expecting. If I don't use wait_time, I will see the rendered content (if it rendered fast enough), but with wait_time, Selenium will trigger a timeout exception and Scrapy won't parse the result at all. Maybe the current behaviour is useful in some cases, but I think the approach should be the opposite: we should handle the exception and still return the content that was found to Scrapy, so I can at least see the snapshot or the HTML content.
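For reference, a minimal sketch of the kind of spider I mean (the URL, selector, and timeout are placeholders, not my real project):

```python
from scrapy import Spider
from scrapy_selenium import SeleniumRequest
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC


class ExampleSpider(Spider):
    name = 'example'

    def start_requests(self):
        yield SeleniumRequest(
            url='https://example.com',
            callback=self.parse,
            wait_time=10,  # seconds passed to WebDriverWait
            wait_until=EC.presence_of_element_located((By.CSS_SELECTOR, '#content')),
        )

    def parse(self, response):
        # When the element never appears, the wait above raises
        # TimeoutException and this callback is never reached.
        yield {'title': response.css('title::text').get()}
```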