Closed gadomski closed 1 month ago
A quick check indicates that there are three high-side outliers (>= 1M in size):
$ du -h tests/cassettes/**/*.yaml | sort -rh | head
1.7M tests/cassettes/test_client/TestSigning.test_signing.yaml
1.0M tests/cassettes/test_item_search/TestItemSearch.test_deprecations[get_items-items-True-True].yaml
1.0M tests/cassettes/test_item_search/TestItemSearch.test_datetime_results.yaml
336K tests/cassettes/test_cli/TestCLISearch.test_filter[inprocess].yaml
300K tests/cassettes/test_client/TestAPI.test_links.yaml
172K tests/cassettes/test_cli/TestCLISearch.test_intersects[inprocess-netherlands_aoi.json].yaml
160K tests/cassettes/test_cli/TestCLISearch.test_intersects[inprocess-sample-item.json].yaml
124K tests/cassettes/test_cli/TestCLISearch.test_intersects_despite_warning[inprocess].yaml
100K tests/cassettes/test_cli/TestCLICollections.test_save[inprocess].yaml
100K tests/cassettes/test_cli/TestCLICollections.test_collections[inprocess].yaml
I'll clean those three up and then stop there, unless someone thinks I should continue further down the list.
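For anyone whose shell does not expand `**` recursively (bash needs `shopt -s globstar` for that), roughly the same ranking can be produced from Python. This is only a minimal sketch: the helper name `largest_cassettes` is made up for illustration and is not part of the repository, and it reports plain file sizes rather than the disk usage that `du` prints, so the numbers may differ slightly.

```python
from pathlib import Path


def largest_cassettes(root, limit=10):
    """Return (size_in_bytes, path) pairs for the largest .yaml files under root."""
    sizes = [
        (path.stat().st_size, path)
        for path in Path(root).rglob("*.yaml")
        if path.is_file()
    ]
    # Largest first, mirroring `sort -rh | head`.
    return sorted(sizes, key=lambda pair: pair[0], reverse=True)[:limit]


if __name__ == "__main__":
    # "tests/cassettes" is the directory used in the du command above.
    for size, path in largest_cassettes("tests/cassettes"):
        flag = "  (>= 1M)" if size >= 1024 * 1024 else ""
        print(f"{size / 1024:8.0f}K {path}{flag}")
```

Sorting on the size alone keeps the ordering independent of how paths compare, and `limit` plays the role of `head`.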
@jsignell after a bit of digging, I'm a little less concerned about these VCR sizes. test_signing, for example, is big because it's really doing ~11 separate tests in a single function. I could break that function up into 11 separate tests, but what are we gaining?
To be sure we weren't doing anything too weird, I checked the size of each of the 13 different responses recorded for that one test -- none of them were over 144K:
import yaml
from yaml import Loader

with open("tests/cassettes/test_client/TestSigning.test_signing.yaml") as f:
    data = yaml.load(f, Loader)

# Print the length of each recorded response body next to its request URI.
for interaction in data["interactions"]:
    uri = interaction["request"]["uri"]
    body = interaction["response"]["body"]["string"]
    print(f"{uri}: {len(body)}")
Yields:
https://planetarycomputer.microsoft.com/api/stac/v1: 3247
https://planetarycomputer.microsoft.com/api/stac/v1/: 3247
https://planetarycomputer.microsoft.com/api/stac/v1/collections/cil-gdpcir-cc0: 10864
https://planetarycomputer.microsoft.com/api/stac/v1/collections/cil-gdpcir-cc0/items/cil-gdpcir-INM-INM-CM5-0-ssp585-r1i1p1f1-day: 1931
https://planetarycomputer.microsoft.com/api/stac/v1/collections: 195240
https://planetarycomputer.microsoft.com/api/stac/v1/search: 51136
https://planetarycomputer.microsoft.com/api/stac/v1/search: 51136
https://planetarycomputer.microsoft.com/api/stac/v1/search: 144749
https://planetarycomputer.microsoft.com/api/stac/v1/search: 144749
https://planetarycomputer.microsoft.com/api/stac/v1/search: 144749
https://planetarycomputer.microsoft.com/api/stac/v1/search: 144749
https://planetarycomputer.microsoft.com/api/stac/v1/search: 144749
https://planetarycomputer.microsoft.com/api/stac/v1/search: 144749
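Judging by the list above, the same /search response is recorded several times, so a per-URI tally may be easier to read than the raw list. Here is a rough sketch of one way to do that. The helper name `summarize_interactions` is hypothetical, and it assumes the same cassette layout the snippet above reads (`request.uri` and `response.body.string` on each interaction); the sample data in the demo is made up.

```python
from collections import defaultdict


def summarize_interactions(interactions):
    """Map each request URI to (number_of_responses, total_body_length)."""
    totals = defaultdict(lambda: [0, 0])
    for interaction in interactions:
        uri = interaction["request"]["uri"]
        body = interaction["response"]["body"]["string"]
        totals[uri][0] += 1
        totals[uri][1] += len(body)
    return {uri: tuple(pair) for uri, pair in totals.items()}


if __name__ == "__main__":
    # Made-up stand-in data; with a real cassette, pass data["interactions"].
    sample = [
        {"request": {"uri": "https://example.com/search"},
         "response": {"body": {"string": "a" * 50}}},
        {"request": {"uri": "https://example.com/search"},
         "response": {"body": {"string": "a" * 50}}},
        {"request": {"uri": "https://example.com/collections"},
         "response": {"body": {"string": "b" * 20}}},
    ]
    summary = summarize_interactions(sample)
    # Show the URIs with the largest combined bodies first.
    for uri, (count, total) in sorted(summary.items(), key=lambda kv: -kv[1][1]):
        print(f"{uri}: {count} response(s), {total} in total")
```

Grouping by URI makes it clearer whether a large cassette comes from one oversized response or from many ordinary ones recorded in a single test.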
Thanks for looking into it, Pete! I buy the argument for just closing this up then.
Originally posted by @jsignell in https://github.com/stac-utils/pystac-client/pull/719#pullrequestreview-2241281240