Closed riordan closed 3 months ago
I'd suggest you to request the ability to use contracts with async callbacks to upstream Scrapy. In the past I've used scrapy-autounit, although I haven't tested it with async callbacks so it might also not work for your case.
Finally, one (maybe hacky) way could be to emulate the scrapy-playwright tests themselves. See for instance make_handler
, which creates a ScrapyPlaywrightDownloadHandler
object that can be used to directly retrieve a response to be processed with your callback to check if the result is what you expect (similar to this test). This is not ideal because you'd be using a private handler method, I don't have any plans to change this but just be advised.
I think this could be optimized by recording responses in HAR the first time (see the record_har_path
& record_har_content
arguments for Browser.new_context
, which you can handle via the PLAYWRIGHT_CONTEXTS
setting), and then routing either the context or the page to use the HAR to serve the responses (should be possible with a page init callback).
For the record, I haven't tried any of this myself, I'm just thinking out loud. Perhaps this is something that could be more easily integrated, I'm open to suggestions.
Thank you, @elacuesta! This is excellent context (and fantastic documentation).
Based on a cursory readthrough of scrapy-autounit, the way it implements callbacks is promising :crossed_fingers:.
Sadly, I wound up abandoning contracts for the above scrapy project; it turns out that when you're using attrs
or dataclasses
based items, the contract behaves as if all fields are being returned :thinking:. Autounit might wind up working out better overall.
Closing for now, but will update once I've tried autounit on the playwright spiders.
I'm working on a project with a large number of spiders and a fairly sizable community of open contributors. To ensure that things are working correctly over the long-term, I like the idea of using the Contract conventions on our spiders.
That said, with scrapy-playwright, most calls to the browser benefit from being async, and contracts and so I'm encountering quite a few
TypeError: Contracts don't support async callbacks
.Any thoughts on good testing practices for playwright spiders?