Aims to address #37
`CachedTestURL` is the workhorse here. It replaces `URL` (and can be hot-swapped in for it); instead of always making a request, it favors a locally cached copy. These copies can be generated manually, but there is also an environment variable that tells spatula to fetch them if they are missing.
Right now it takes the same properties as `URL`, but it could be extended to take response text directly, as suggested in #37.
(It also still needs to use all properties of the request/response in caching.)
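For illustration, a minimal sketch of what the hot swap could look like on a page class; `ExamplePage`, its URL, and the import path for `CachedTestURL` are placeholders, not part of this branch:

```python
from spatula import HtmlPage
from spatula import CachedTestURL  # exact import path may differ

class ExamplePage(HtmlPage):
    # CachedTestURL takes the same properties as URL, but serves a locally
    # cached copy instead of making a live request on every run
    source = CachedTestURL("https://example.com/")

    def process_page(self):
        # self.root is the parsed lxml tree, same as with a plain URL source
        yield {"title": self.root.findtext(".//title")}
```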
This branch also adds two helper methods so people don't have to work with `CachedTestURL` directly if they don't want to (a usage sketch follows the list):
- `cached_page_response(page: Page) -> Page`: returns a page where a request has already been made using `CachedTestURL`, allowing you to call methods on your page object as if you're inside of `do_scrape`
- `cached_page_items(page: Page) -> list[item]`: returns the result of `do_scrape` collected into a list
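A rough sketch of the helpers in a test, reusing the hypothetical `ExamplePage` above; the import path is an assumption:

```python
from spatula import cached_page_response, cached_page_items  # import path assumed

def test_example_page_response():
    # the (cached) request has already been made, so page methods work
    # as if we were inside do_scrape
    page = cached_page_response(ExamplePage())
    assert page.root is not None

def test_example_page_items():
    # the result of do_scrape, collected into a list
    items = cached_page_items(ExamplePage())
    assert len(items) == 1
```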