stac-utils / pystac-client

Python client for searching STAC APIs
https://pystac-client.readthedocs.io

Reduce VCR sizes of some tests #729

Closed gadomski closed 1 month ago

gadomski commented 2 months ago
          I'm a little concerned about the size of this diff. Do you think we need to change some of the tests? In particular, I am looking at test_client/TestSigning.test_signing.yaml: +135k seems like too many lines for a test.

Originally posted by @jsignell in https://github.com/stac-utils/pystac-client/pull/719#pullrequestreview-2241281240

gadomski commented 2 months ago

A quick check indicates that there are three high-side outliers (>= 1M in size):

$ du -h tests/cassettes/**/*.yaml | sort -rh | head
1.7M    tests/cassettes/test_client/TestSigning.test_signing.yaml
1.0M    tests/cassettes/test_item_search/TestItemSearch.test_deprecations[get_items-items-True-True].yaml
1.0M    tests/cassettes/test_item_search/TestItemSearch.test_datetime_results.yaml
336K    tests/cassettes/test_cli/TestCLISearch.test_filter[inprocess].yaml
300K    tests/cassettes/test_client/TestAPI.test_links.yaml
172K    tests/cassettes/test_cli/TestCLISearch.test_intersects[inprocess-netherlands_aoi.json].yaml
160K    tests/cassettes/test_cli/TestCLISearch.test_intersects[inprocess-sample-item.json].yaml
124K    tests/cassettes/test_cli/TestCLISearch.test_intersects_despite_warning[inprocess].yaml
100K    tests/cassettes/test_cli/TestCLICollections.test_save[inprocess].yaml
100K    tests/cassettes/test_cli/TestCLICollections.test_collections[inprocess].yaml

I'll clean those three up and then stop there, unless someone thinks I should continue further down the list.
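
For anyone without a shell that expands `**`, a rough Python equivalent of the check above might look like the following. It is only a sketch: `largest_files` is a hypothetical helper, not part of pystac-client, and it reports apparent file size (`st_size`) rather than the disk usage that `du` shows, so the numbers can differ slightly.

```python
from pathlib import Path


def largest_files(root, pattern="**/*.yaml", top=10):
    """Return (size_in_bytes, path) pairs for the largest matching files."""
    sizes = [
        (path.stat().st_size, path)
        for path in Path(root).glob(pattern)
        if path.is_file()
    ]
    # Largest first, then keep only the first `top` entries.
    return sorted(sizes, key=lambda pair: pair[0], reverse=True)[:top]


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for size, path in largest_files("tests/cassettes"):
        print(f"{size / 1024:8.0f}K  {path}")
```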

gadomski commented 2 months ago

@jsignell after a bit of digging, I'm a little less concerned about these VCR sizes. test_signing, for example, is big because it's really doing ~11 separate tests in a single function. I could break that function up into 11 separate tests, but what are we gaining?

To be sure we weren't doing anything too weird, I checked the size of each of the 13 different responses from that one test. None of the search responses were over about 145K characters, and the largest overall (the /collections response) was about 195K:

import yaml
from yaml import Loader

# Load the recorded VCR cassette for the signing test.
with open("tests/cassettes/test_client/TestSigning.test_signing.yaml") as f:
    data = yaml.load(f, Loader=Loader)

# Print each recorded request URI with the length of its response body.
for interaction in data["interactions"]:
    uri = interaction["request"]["uri"]
    body = interaction["response"]["body"]["string"]
    print(f"{uri}: {len(body)}")

Yields:

https://planetarycomputer.microsoft.com/api/stac/v1: 3247
https://planetarycomputer.microsoft.com/api/stac/v1/: 3247
https://planetarycomputer.microsoft.com/api/stac/v1/collections/cil-gdpcir-cc0: 10864
https://planetarycomputer.microsoft.com/api/stac/v1/collections/cil-gdpcir-cc0/items/cil-gdpcir-INM-INM-CM5-0-ssp585-r1i1p1f1-day: 1931
https://planetarycomputer.microsoft.com/api/stac/v1/collections: 195240
https://planetarycomputer.microsoft.com/api/stac/v1/search: 51136
https://planetarycomputer.microsoft.com/api/stac/v1/search: 51136
https://planetarycomputer.microsoft.com/api/stac/v1/search: 144749
https://planetarycomputer.microsoft.com/api/stac/v1/search: 144749
https://planetarycomputer.microsoft.com/api/stac/v1/search: 144749
https://planetarycomputer.microsoft.com/api/stac/v1/search: 144749
https://planetarycomputer.microsoft.com/api/stac/v1/search: 144749
https://planetarycomputer.microsoft.com/api/stac/v1/search: 144749
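
To see how those individual responses add up to the cassette's overall size, the per-response numbers can be totalled per URI. This is a minimal sketch, assuming the same cassette layout as the snippet above; `summarize_interactions` is a hypothetical helper, and `len` counts characters rather than bytes on disk.

```python
from collections import defaultdict


def summarize_interactions(interactions):
    """Return (total_chars, per_uri), where per_uri maps uri -> (count, chars)."""
    per_uri = defaultdict(lambda: [0, 0])
    total = 0
    for interaction in interactions:
        uri = interaction["request"]["uri"]
        size = len(interaction["response"]["body"]["string"])
        per_uri[uri][0] += 1      # number of recorded responses for this URI
        per_uri[uri][1] += size   # combined body length for this URI
        total += size
    return total, {uri: tuple(stats) for uri, stats in per_uri.items()}


if __name__ == "__main__":
    import yaml  # PyYAML; imported here so the helper above has no dependencies

    path = "tests/cassettes/test_client/TestSigning.test_signing.yaml"
    with open(path) as f:
        data = yaml.load(f, Loader=yaml.Loader)
    total, per_uri = summarize_interactions(data["interactions"])
    for uri, (count, chars) in per_uri.items():
        print(f"{uri}: {count} responses, {chars} chars")
    print(f"total: {total}")
```

Adding up the sizes listed above gives 1,185,295 characters of response body, of which 970,766 come from the eight /search responses. The rest of the 1.7M file is presumably request and response headers plus YAML overhead.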

jsignell commented 1 month ago

Thanks for looking into it Pete! I buy the argument for just closing this up then.