Closed by tommylge 2 weeks ago
Cannot reproduce. Printing the first 100 bytes of the response body with print(response.body[:100]) gives:
2024-06-19 11:32:50 [scrapy.core.engine] INFO: Spider opened
2024-06-19 11:32:50 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2024-06-19 11:32:50 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2024-06-19 11:32:50 [scrapy-playwright] INFO: Starting download handler
2024-06-19 11:32:50 [scrapy-playwright] INFO: Starting download handler
2024-06-19 11:32:55 [scrapy-playwright] INFO: Launching browser chromium
2024-06-19 11:32:56 [scrapy-playwright] INFO: Browser chromium launched
2024-06-19 11:32:56 [scrapy-playwright] DEBUG: Browser context started: 'default' (persistent=False, remote=False)
2024-06-19 11:32:56 [scrapy-playwright] DEBUG: [Context=default] New page created, page count is 1 (1 for all contexts)
2024-06-19 11:32:56 [scrapy-playwright] DEBUG: [Context=default] Request: <GET https://....pdf> (resource type: document)
2024-06-19 11:32:56 [scrapy-playwright] DEBUG: [Context=default] Response: <200 https://....pdf>
2024-06-19 11:32:56 [scrapy-playwright] WARNING: Navigating to <GET https://....pdf> returned None, the response will have empty headers and status 200
2024-06-19 11:32:56 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://....pdf> (referer: None) ['playwright']
b"%PDF-1.3\n%\xe2\xe3\xcf\xd3\n9 0 obj\n<< /Type /Page /Parent 1 0 R /LastModified (D:20200619180943+02'00') /Resourc"
2024-06-19 11:32:56 [scrapy.core.engine] INFO: Closing spider (finished)
2024-06-19 11:32:56 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
$ python -c "import scrapy_playwright; print(scrapy_playwright.__version__)"
0.0.35
$ scrapy version -v
Scrapy : 2.11.2
lxml : 4.9.3.0
libxml2 : 2.10.3
cssselect : 1.2.0
parsel : 1.8.1
w3lib : 2.1.2
Twisted : 24.3.0
Python : 3.10.10 (main, Feb 16 2023, 02:58:25) [Clang 14.0.0 (clang-1400.0.29.202)]
pyOpenSSL : 23.2.0 (OpenSSL 3.1.2 1 Aug 2023)
cryptography : 41.0.3
Platform : macOS-14.4.1-x86_64-i386-64bit
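For anyone wanting to run the same check on their own response, here is a minimal sketch that tests whether a body starts with the PDF header, as the one in the log above does. The helper name looks_like_pdf is made up for illustration and is not part of Scrapy or scrapy-playwright; the sample bytes are copied from the log output.

```python
# Hypothetical helper (not a Scrapy or scrapy-playwright API): checks that a
# downloaded payload begins with the PDF header "%PDF-".
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(body: bytes) -> bool:
    """Return True if the payload starts with the PDF magic bytes."""
    return body.startswith(PDF_MAGIC)


# First 100 bytes of the response body, copied from the log above.
SAMPLE = (
    b"%PDF-1.3\n%\xe2\xe3\xcf\xd3\n9 0 obj\n<< /Type /Page /Parent 1 0 R "
    b"/LastModified (D:20200619180943+02'00') /Resourc"
)
```

Inside a spider callback this could be called as looks_like_pdf(response.body); a False result would suggest the browser returned something other than the raw PDF (for example an HTML viewer page).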
Found this issue on the playwright-python repo: https://github.com/microsoft/playwright-python/issues/2408
It appears to come from them. A fix has been released; it seems I don't have the right Playwright version. Sorry about that, and thanks for your answer :)
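Since the problem came down to the installed Playwright version, a quick way to see what is actually installed is sketched below. It only uses the standard library; the function name installed_versions is an assumption for illustration, and the thread does not say which Playwright release contains the fix, so the sketch reports versions without comparing them to anything.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("playwright", "scrapy-playwright", "scrapy")):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

The printed Playwright version can then be compared by hand against the release mentioned in the linked playwright-python issue.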
I got this error with webkit, firefox and chromium, all with headless: True.
Code to reproduce: