gee-community / pytest-gee

The pytest plugin for your gee package 🌍
https://pytest-gee.readthedocs.io/
MIT License

Check query before fetching data from server #12

Open fitoprincipe opened 10 months ago

fitoprincipe commented 10 months ago

Problem: Commercial users of the GEE API are charged for its usage, so every time a test is triggered, it fetches data from the server and creates a cost for the company. Fetching data also takes time, so running a complete set of tests can take a long time to finish.

Solution: Save the query string to a file (write/read functionality can be found in geetools). When the test is triggered, first check whether the query string is the same as the saved one: if True, skip the fetch; if False, fetch.
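A minimal sketch of that check, assuming the query is already available as a plain string. Plain file I/O stands in for the geetools write/read helpers here, and the name `query_changed` is hypothetical, not part of pytest-gee or geetools.

```python
from pathlib import Path


def query_changed(query: str, cache_file: Path) -> bool:
    """Return True when the server has to be queried again.

    Hypothetical helper: compares the query string with the copy saved
    by a previous run and refreshes the saved copy when they differ.
    """
    if cache_file.exists() and cache_file.read_text(encoding="utf-8") == query:
        # Same query as last time: the fetch can be skipped.
        return False
    # First run, or the query changed: remember the new query string.
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(query, encoding="utf-8")
    return True
```

A test would call `query_changed(...)` first and only contact the server when it returns True. Note that this sketch saves the query before the fetch happens, so a real implementation would probably want to write the file only after a successful fetch.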

Context: The version of the GEE Python API could be something to consider, since the structure of the query could change from version to version; a version check could therefore be implemented. The version should be stored somewhere alongside the query string. To avoid having to modify the string (for example, by creating a new string/dict with the version code), my proposal is to write it as part of the filename, something like: test_some_method_0-1-384.gee
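The filename scheme could be sketched as follows. The helper name `cache_filename` is hypothetical; in practice the version string would presumably be read from the installed client (for instance `ee.__version__`), which is an assumption rather than something settled in this issue.

```python
def cache_filename(test_name: str, api_version: str) -> str:
    """Build a name such as 'test_some_method_0-1-384.gee'.

    Hypothetical helper: dots in the version are replaced with dashes
    so that the only dot left in the filename is the extension.
    """
    return f"{test_name}_{api_version.replace('.', '-')}.gee"


# Example with the version used in the proposal above.
print(cache_filename("test_some_method", "0.1.384"))
```

Because the version is part of the name, upgrading the client produces a different filename, so the saved query from the old version is never compared against the new one.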

Probably @tylere could have an opinion on this.

tylere commented 9 months ago

Assuming that the purpose of pytest-gee is to test packages that use the Earth Engine API (either via the Earth Engine Python client library or the REST API), you may want to mock the responses of the Earth Engine API so that the tests run without accessing Earth Engine. https://pytest-mock.readthedocs.io/en/latest/ https://changhsinlee.com/pytest-mock/
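As a rough illustration of mocking a response, the sketch below uses the standard library's `unittest.mock` (which the `mocker` fixture from pytest-mock wraps) so that it runs without extra dependencies. `EarthEngineClient`, `compute`, and `mean_elevation` are hypothetical stand-ins, not real Earth Engine or pytest-gee names.

```python
from unittest import mock


class EarthEngineClient:
    """Hypothetical stand-in for code that talks to Earth Engine."""

    def compute(self, query: str) -> dict:
        raise RuntimeError("this would contact the server")


def mean_elevation(client: EarthEngineClient, region: str) -> float:
    """Hypothetical package function that depends on a server response."""
    result = client.compute(f"mean_elevation({region})")
    return result["mean"]


def test_mean_elevation_uses_mocked_response():
    client = EarthEngineClient()
    # Replace the server call with a canned response for this test only.
    with mock.patch.object(client, "compute", return_value={"mean": 1234.5}) as fake:
        assert mean_elevation(client, "alps") == 1234.5
        fake.assert_called_once_with("mean_elevation(alps)")
```

The test exercises the package logic around the call but never reaches the server, so it costs nothing to run; it also cannot detect a change in the server's behaviour.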

12rambau commented 8 months ago

Sorry for not answering earlier, but the objective is actually to check the Python API response. The problem we've faced in the past is small changes in GEE behavior made without notifying developers. The purpose of the lib here is really to assert the result of the server-side computation.

To provide some context, this plugin is a byproduct of the geetools package, which creates and manipulates server-side objects. I really need the response to be evaluated to make sure I'm doing the right thing. I actually got an idea from a colleague on a private repo; I'll try to implement it when I have the time (you'll see, it's fancy :smile:)

tylere commented 8 months ago

In this case I would suggest setting up two sets of tests: 1) A set of fast, inexpensive tests that are run whenever new changes are made to pytest-gee. These tests should use mock data. 2) A set of slow and/or expensive tests that are run infrequently (for example: daily, weekly, or whenever a new release of pytest-gee is made). These tests would access Earth Engine to detect whether the GEE server responses have changed.
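One way to keep the two sets apart is a custom pytest marker. The sketch below is only an illustration: the marker name `server` and both test functions are hypothetical, and the server test is left as a placeholder rather than guessing at a real Earth Engine call.

```python
import pytest


def scale_values(values, factor):
    """Hypothetical pure-Python helper that needs no server access."""
    return [v * factor for v in values]


def test_scale_values_fast():
    # Fast, inexpensive test: runs on every change.
    assert scale_values([1, 2], 10) == [10, 20]


@pytest.mark.server  # hypothetical marker for slow/expensive tests
def test_server_response_unchanged():
    # Placeholder: a real test would query Earth Engine here and compare
    # the response with a previously stored result.
    pytest.skip("replace with a real Earth Engine call")
```

With this layout, `pytest -m "not server"` runs only the fast set, and `pytest -m server` runs the expensive set on a schedule or before a release. The marker would need to be registered under the `markers` option of the pytest configuration to avoid an unknown-marker warning.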

12rambau commented 8 months ago

Ah sure, this will be done in downstream packages (geetools, ipygee, pygaul... maybe more?)