Closed vdusek closed 1 month ago
According to the template log, it scrapes normally at first, as it should:
...
2024-09-02T21:34:06.9322150Z [title_spider] [INFO] TitleSpider is parsing <200 https://apify.com/run-scrapy-in-cloud>... ({"spider": "<TitleSpider 'title_spider' at 0x7f70c7ab07f0>"})
2024-09-02T21:34:06.9455173Z [title_spider] [INFO] TitleSpider is parsing <200 https://docs.apify.com/academy/web-scraping-for-beginners>... ({"spider": "<TitleSpider 'title_spider' at 0x7f70c7ab07f0>"})
2024-09-02T21:34:06.9530763Z [title_spider] [INFO] TitleSpider is parsing <200 https://apify.com/success-stories>... ({"spider": "<TitleSpider 'title_spider' at 0x7f70c7ab07f0>"})
2024-09-02T21:34:07.0125374Z [title_spider] [INFO] TitleSpider is parsing <200 https://apify.com/templates/ts-crawlee-playwright-chrome>... ({"spider": "<TitleSpider 'title_spider' at 0x7f70c7ab07f0>"})
...
But when processing https://console.apify.com/robots.txt, it throws an exception in the proxy middleware, which is caught and logged:
2024-09-02T21:34:07.0410494Z [apify] [WARN] ApifyHttpProxyMiddleware: TunnelError occurred for request="<GET https://console.apify.com/robots.txt>", reason="Could not open CONNECT tunnel with proxy proxy.apify.com:8000 [{'status': 403, 'reason': b'Forbidden'}]", skipping...
but then it incorrectly returns the request object here:
if isinstance(exception, TunnelError):
    Actor.log.warning(
        f'ApifyHttpProxyMiddleware: TunnelError occurred for request="{request}", '
        'reason="{exception}", skipping...'
    )
    return request
This causes the request to be rescheduled, so we end up stuck in a loop.
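To illustrate the loop, here is a minimal standalone sketch. It does not use Scrapy or the Apify SDK: the `TunnelError` class, both handler functions, and the `run` toy engine are hypothetical stand-ins that only mimic the documented Scrapy rule that a `Request` returned from `process_exception` gets rescheduled, while `None` lets normal exception handling continue. Returning `None` is shown as one possible fix, not necessarily the one applied in this branch.

```python
from collections import deque


class TunnelError(Exception):
    """Stand-in for Scrapy's TunnelError (hypothetical, for illustration only)."""


def process_exception_buggy(request, exception):
    # Mirrors the snippet above: returning the request reschedules it.
    if isinstance(exception, TunnelError):
        return request
    return None


def process_exception_fixed(request, exception):
    # Assumed fix: return None so the failed request is not rescheduled.
    return None


def run(handler, start_request, max_attempts=5):
    """Toy engine loop in which every download fails with TunnelError.

    Returns the number of download attempts. max_attempts is a safety cap
    so the buggy handler cannot loop forever in this sketch.
    """
    queue = deque([start_request])
    attempts = 0
    while queue and attempts < max_attempts:
        request = queue.popleft()
        attempts += 1
        result = handler(request, TunnelError("403 Forbidden"))
        if result is not None:
            # A returned request goes back into the queue.
            queue.append(result)
    return attempts


url = "https://console.apify.com/robots.txt"
buggy_attempts = run(process_exception_buggy, url)
fixed_attempts = run(process_exception_fixed, url)
print(buggy_attempts, fixed_attempts)
```

With the buggy handler the loop only stops because of the `max_attempts` cap; with the assumed fix the request is attempted once and then dropped.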
Also check https://github.com/apify/actor-templates/pull/288, where the tests are executed with an alpha release from this branch.