planetlabs / planet-client-python

Python client for Planet APIs
https://planet-sdk-for-python-v2.readthedocs.io/en/latest/
Apache License 2.0
271 stars 91 forks source link

Planet sync client #1049

Open stephenhillier opened 2 months ago

stephenhillier commented 2 months ago

Proposed Changes:

Adds a Planet sync client that has feature parity with the existing async clients.

Diff of User Interface

New behavior:

The Planet client has members data, subscriptions, orders. Users can interact with these APIs without setting up individual clients or a Session.

Example:

from planet import Planet

client = Planet()

for item in client.subscriptions.list_subscriptions():
    print(item)

results = client.data.search(["PSScene"], limit=10)
for item in results:
    print(item.get("id"))

Minor changes from the async clients to the sync client:

At the risk of scope creep, one additional nice-to-have for an initial MVP sync client would be some simple dataclass/attrs based return types instead of dicts. That way it's easier for users to understand what they're getting back in e.g. a search or order.

ischneider commented 2 months ago

Nice work!

I'm definitely a big :+1: on the single Planet class.

I did a small test listing my searches and then concurrently executing each search using both sync and async (threadpool and queue/task approach) and found the performance to be essentially equivalent w/ both...


I'm less keen on duplication of all the documentation and boiler plate in testing, though I understand this is due to function "color".

One approach I just tested was using a metaclass to dynamically generate sync wrappers. The advantage of this is documentation is mirrored and there's much less code. One disadvantage is type hints aren't handled properly - the actual return types are correct but IDE doc shows up incorrectly... Would be happy to share this but I'm not satisfied with the lack of type hint propagation (and maybe this can be addressed?) but otherwise, beyond some trickery, it's simple and it works.


Another approach we might consider is top-level "request objects" [1] - this way the interface to the request (the name of the operation and the arguments) can always be sync and the results are where dispatch can vary. This way the docs for the operation parameters, exceptions, narrative etc are all in one spot and not repeated and the only code variation is essentially the call_sync (as it is now).

So list_searches might be implemented like

search = data.ListSearches(session, limit=500)
# sync
searches = search.get_data()
# async
searches = await search.aget_data()

Advantage here is parity w/ SH API.

Another approach might be to use a request object but require a specific client, e.g.

client = AsyncClient(session)
# list returns an async generator
searches = await client.list(data.ListSearches(limit=500))

The downside is we still have the old client classes, docs, etc. to maintain for some period of time...

[1] see SentinelHubRequest https://sentinelhub-py.readthedocs.io/en/latest/examples/process_request.html

ischneider commented 2 months ago

Another approach - by extending DataClient and overriding every function (as sync and with altered type hints), we get the best of both (can inherit doc strings and get correct type hints) with less boilerplate...

vscode at least makes the overriding part simpler but then there's still maintenance (adding an optional parameter, for example) and the boilerplate of testing though at least w/ some creativity, I'd guess we could reduce this some...

stephenhillier commented 2 months ago

I do like the idea of inheriting the docstrings where possible. If we can give ourselves a headstart on a simple MVP wrapper, with minimal duplication of what's already there, then that sounds good to me.

The reason I chose to start with the explicit approach is because this is a standalone client and it shouldn't necessarily be a violation of DRY merely to reuse the descriptions (IMO). They will probably need to be modified independently of each other as changes are made.

That being said I will see what I can come up with for sharing the docstrings, I should have some time to round off the week to try a couple approaches :+1:

stephenhillier commented 2 months ago

It occurs to me while going down the path of creating shared docstrings, that we might be needlessly coupling the sync and async implementations together in order to save space on the docstrings. Yes each method calls out to the async method, but the implementation of any method can be changed (if needed), optional args added, and async or sync-specific examples can be added to the docstrings. Examples are sorely needed and those can't really be shared.

The one-time port is simple and all the docs and type hints are correct (and the grunt work has been done). Thoughts?