Closed ttilberg closed 2 years ago
Can you provide a minimal reproducible example?
For example, the following minimal spider code should work:
from scrapy import Request, Spider


class MinimalSpider(Spider):
    name = "minimal"
    custom_settings = {
        "DOWNLOAD_HANDLERS": {
            "http": "scrapy_zyte_api.handler.ScrapyZyteAPIDownloadHandler",
            "https": "scrapy_zyte_api.handler.ScrapyZyteAPIDownloadHandler",
        },
        "ZYTE_API_KEY": "YOUR_API_KEY",
        "TWISTED_REACTOR": "twisted.internet.asyncioreactor.AsyncioSelectorReactor",
    }

    def start_requests(self):
        yield Request(
            "https://toscrape.com",
            meta={"zyte_api": {"httpResponseBody": True}},
        )

    def parse(self, response):
        pass
You should be able to store this code in a file, set your API key, and run that file with scrapy runspider.
Assuming it works as expected, what do you need to change to make it fail the way your actual code is failing?
Thanks for taking the time. Someone noticed that we had allow_prereleases = true in our Pipfile, and removing that has cleared whatever dependency issue may have been causing this. The Pipfile.lock file showed a handful of packages that rolled back, including a major version of aiohttp. multidict actually bumped forward 3 major versions? Unfortunately I can't say exactly which dependency did the trick.
If anyone else comes across this issue, there are likely transitive dependencies in conflict. For us, removing allow_prereleases = true and running pipenv update did the trick.
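For anyone comparing environments before and after such a change, a small script along these lines can list what is actually installed. It is only a sketch: the package list is an assumption based on the names that come up in this thread, and looks_like_prerelease is a rough, hypothetical heuristic for PEP 440 pre-release markers, not a full version parser.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Assumption: distributions worth checking, taken from names mentioned in this thread.
PACKAGES = ("aiohttp", "multidict", "scrapy", "scrapy-zyte-api", "twisted")

# Rough heuristic for PEP 440 pre-release markers (a, b, rc, .dev).
_PRERELEASE = re.compile(r"\d(?:a|b|rc)\d+|\.dev\d+")


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def looks_like_prerelease(ver):
    """Return True if the version string carries a pre-release marker."""
    return bool(_PRERELEASE.search(ver))


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        if ver is None:
            print(f"{name}: not installed")
        else:
            flag = " (pre-release?)" if looks_like_prerelease(ver) else ""
            print(f"{name}: {ver}{flag}")
```

Running it in the failing environment and again after pipenv update should make it easier to see which packages changed.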
Following the notes for the settings file, we are experiencing an issue where the http and https handlers are not loading as expected. Generically, we are receiving the exception: The object should be created from async function. There are log lines mentioning asyncio, and aiohttp paths referenced, so it seems like we are successfully loading AsyncIO. Do you have any thoughts on what this could be?
Relevant log lines:
When you trace the code, you find that the http and https keys are dropped from the downloaders dict after the first exception, and the second exception is raised because the dict no longer has those keys.
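The two-stage failure described above can be illustrated with a toy model. This is a hypothetical sketch, not Scrapy's actual code: HandlerRegistry, broken_factory, and the LookupError are all invented for illustration. It only shows how a handler that fails to build can surface later as a misleading "unsupported scheme" style error.

```python
def broken_factory():
    # Stand-in for a download handler whose constructor fails.
    raise RuntimeError("The object should be created from async function")


class HandlerRegistry:
    """Toy registry: a scheme is dropped from the table once a build is attempted."""

    def __init__(self, factories):
        self._factories = dict(factories)  # scheme -> zero-argument factory
        self._handlers = {}                # scheme -> successfully built handler
        self.errors = {}                   # scheme -> message from the failed build

    def get(self, scheme):
        if scheme not in self._handlers and scheme in self._factories:
            # The key is removed here, whether or not the build succeeds.
            factory = self._factories.pop(scheme)
            try:
                self._handlers[scheme] = factory()
            except Exception as exc:
                # First exception: recorded (or logged), then swallowed.
                self.errors[scheme] = str(exc)
        if scheme not in self._handlers:
            # Second exception: raised because the key is no longer present.
            reason = self.errors.get(scheme, "no handler available")
            raise LookupError(f"Unsupported URL scheme {scheme!r}: {reason}")
        return self._handlers[scheme]


if __name__ == "__main__":
    registry = HandlerRegistry({"http": broken_factory, "https": broken_factory})
    for scheme in ("http", "https"):
        try:
            registry.get(scheme)
        except LookupError as exc:
            print(exc)
```

In a setup like this, the second error is only a symptom. The first recorded exception is the one that points at the real cause.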