scrapy-plugins / scrapy-playwright

🎭 Playwright integration for Scrapy

Retry request on every error #237

Closed AnaKuzina closed 5 months ago

AnaKuzina commented 8 months ago

Hello everyone.

I'm trying to write a crawler that uses scrapy-playwright. In a previous project I used plain Scrapy with RETRY_TIMES = 3: even when the target resource was unreachable, the spider would retry the request 3 times and only then close.

Here I've tried the same thing, but it doesn't seem to work: the spider closes on the first error it gets. Can somebody help me, please? What should I do to make the spider retry a URL as many times as I need?

Here is the relevant part of my settings.py:

import random

RETRY_ENABLED = True
RETRY_TIMES = 3
DOWNLOAD_TIMEOUT = 60
DOWNLOAD_DELAY = random.uniform(0, 1)  # evaluated once, when settings.py is loaded

DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
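
For reference, a simplified sketch of the kind of spider involved (the spider name, URL and callbacks are illustrative only):

import scrapy


class ExampleSpider(scrapy.Spider):
    # Illustrative placeholder, not the actual project code.
    name = "example"

    def start_requests(self):
        yield scrapy.Request(
            "https://example.com",
            meta={"playwright": True},  # hand the request to Playwright
            callback=self.parse,
            errback=self.errback,
        )

    def parse(self, response):
        self.logger.info("Got response from %s", response.url)

    def errback(self, failure):
        # Reached only after the retry middleware gives up (RETRY_TIMES exhausted).
        self.logger.error("Request failed: %r", failure)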

Thanks in advance!

elacuesta commented 8 months ago

This package does not interfere with the regular retrying mechanism, so requests should be retried as usual. Please share the full logs for further debugging.
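
A straightforward way to capture a complete log is to add these standard Scrapy settings (the file name is arbitrary):

# settings.py additions to capture a full log of the run
LOG_LEVEL = "DEBUG"
LOG_FILE = "crawl.log"

Retry attempts from the built-in RetryMiddleware are logged at DEBUG level, so the resulting file will show whether retries are being attempted at all.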