astropy / pyvo

An Astropy affiliated package providing access to remote data and services of the Virtual Observatory (VO) using Python.
https://pyvo.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License
75 stars 52 forks source link

Second thoughts about list_services #521

Closed msdemlei closed 6 months ago

msdemlei commented 7 months ago

This isn't really a bug report or a wishlist item but rather a request to think about better ways. And it's still the aftermath of #505.

So, I think I don't like list_services() API, on grounds that it constructs potentially quite a few services, while it's quite clear that in the end likely only one of them will be used. Service construction, however, may be expensive, potentially retrieving capability documents or perhaps one day even VOSI tables.

Additionally, we may need to pass in sessions (allowing sessions in get_service will be part of the the upcoming global discovery PR), for instance for auth or timeouts. So, list_services would have to replicate that argument, too.

All that doesn't feel good. Hence, before everyone gets used to list_services, can we perhaps reconsider it? Perhaps we can discuss the various capabilities in .describe() and tell people to look there in tne lax=False error message of get_service?

@ManonMarchand, do you have any thoughts?

ManonMarchand commented 7 months ago

I see the issue, but I don't like the idea of giving the information in describe(), that's a text and thus not machine readable.

What about returning the list of services as urls and descriptions without constructing the Service objetcs? I'd find that useful, and we avoid the costly service creation.

bsipocz commented 7 months ago

I milestone this as 1.6 as this should ideally be resolved before #505 ends up in a release

edit: in fact I think it's useful to add a new label category for release blockers: either critical bugs or required changes/prefered clarifications for new API where requiring a resolution before release would have the potential of saving the trouble of deprecation and changing API after release.

msdemlei commented 7 months ago

On Thu, Feb 08, 2024 at 08:06:31AM -0800, Manon Marchand wrote:

I see the issue, but I don't like the idea of giving the information in describe(), that's a text and thus not machine readable.

That's a valid point, but I think the primary adressee of this information for now is a human anyway -- so I have to say I like the idea of listing the capabilities in describe() independently of the list_services issue, and I think in the lax=False message I'd rather point to describe, too.

What about returning the list of services as urls and descriptions without constructing the Service objetcs? I'd find that useful, and we avoid the costly service creation.

Yee...es... We should probably have a close look at Interface before declaring it a first-class API (if you will), but other than that that's probably a reasonable thing to do. People can then pick the interface they want and call its to_service method, with all the session objects they need in there (if necessary, I can cherry-pick the session-passing part from the add-global-discovery branch, but I think that can wait).

ManonMarchand commented 7 months ago

Our use case (not in production yet) was to get all services of a given type for a resource automatically, so not really for humans even if they might want to peek at the result of the method. I can still do that by calling to_service on a pretty list, but not by parsing the output of describe()

Why would we want to look at Interface? I think I missed something here. My idea was that the returned list should not contain Interface instances but a small inner dataclass that could be fed to to_service. Something along:

from dataclasses import dataclass

@dataclass 
class ListServiceItem:
    url: str
    description: str
    servicetype: str

And then either define a to_service for this dataclass or edit the existing to_service to accept a ListServiceItem too, depending on what it less ugly when writing it.

msdemlei commented 7 months ago

On Fri, Feb 09, 2024 at 01:07:10AM -0800, Manon Marchand wrote:

Why would we want to look at Interface? I think I missed something

Aw, it's been a fairly internal class so far, and if it now becomes "officially" exposed to users (i.e., a list_interfaces() method), we should make sure it's properly documented.

That would be, from a quick look: The class docstring is still missing the capability_description attribute, and I think the ("public") attributes should be described a bit better now. And to_service ought to have a docstring.

here. My idea was that the returned list should not contain Interface instances but a small dataclass that could be fed to to_service.

Hm... my gut reaction would be "we already have Interface, which looks like it would work for intra-record discovery; are we sure it's not good enough?" But if you find out it doesn't work well, I'd not stand in the way of some intermediate description.

ManonMarchand commented 7 months ago

I see. I can explore and see what works, but not before next wednesday. Would that work for you?

msdemlei commented 7 months ago

On Fri, Feb 09, 2024 at 05:49:00AM -0800, Manon Marchand wrote:

I see. I can explore and see what works, but not before next wednesday. Would that work for you?

Sure, absolutely. No particular hurry from my side; we just shouldn't teach people to rely on list_services, because that may limit what we can do in XService constructors later. A few days back or forth won't hurt at all.

Thanks!

ManonMarchand commented 6 months ago

I think this one can be closed.