Open itcarroll opened 1 month ago
I agree that we should consider alternative function names. My preference would be to use the prefix `find_` instead of `search_`, but not fussed about it.
I have no strong preference on search v find, but I am curious what reasoning underlies your preference @chuckwondo?
> I have no strong preference on search v find, but I am curious what reasoning underlies your preference @chuckwondo?
Aside from it being 2 letters shorter, in previous contexts, I've often seen DB client APIs using `find` (or `find_by_X`) as a naming convention, so it is anecdotally arguably more consistent with other things. However, `search` may be equally widely used, so again, I don't have any overtly strong preference. It's just a mild preference, perhaps more personal than logical.
Thanks for expounding!
I like the idea of aligning with STAC. A while ago Scott suggested that, and I think it'll be valuable for reducing cognitive load on users. The one thing I'm afraid of is deprecating existing names. I think we should try not to break the API, to the extent possible, while encouraging people to use the new conventions. See: https://github.com/nsidc/earthaccess/discussions/221
I know this has caused confusion in the past, even as others at NSIDC have come up to speed on the library, so I fully support updated names here! I like what @itcarroll proposed to align with STAC. We may also want to consider the language most commonly used within the NASA Earthdata ecosystem. You can't have Earthdata Search without "search", for example, so I'd be more keen on using this vs "find". As an aside, this seems like a great use case for #761 too.
Just to clarify, is this the current proposal?
* `search_datasets` to `search_collections`
* `search_data` to `search_items`
If so, +1 from me.
> As an aside, this seems like a great use case for https://github.com/nsidc/earthaccess/issues/761 too.
:rocket: :100:
> Just to clarify, is this the current proposal?
>
> * rename `search_datasets` to `search_collections`
> * rename `search_data` to `search_items`
>
> If so, +1 from me.
+1
> The one thing I'm afraid of is deprecating existing names. I think we should try not to break the API, to the extent possible, while encouraging people to use the new conventions.
I do worry that having multiple aliases for common features could lead to confusion, as people might think they do different things. I really like having "one correct way". I do believe a long deprecation period would be in order for top-level API things.
We probably need to have deeper discussions about how to communicate around time-until-deprecation. Should we always include a minimum date in our deprecation messages, e.g. `DeprecationWarning(" ... Obsoletion will occur no sooner than YYYY-MM-DD.")`?
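For concreteness, here is a minimal sketch of what a deprecated alias carrying a minimum obsoletion date could look like. It is only an illustration of the idea above, not how `earthaccess` is implemented: the helper `deprecated_alias` is hypothetical, the body of `search_collections` is a stand-in stub, and the date is a placeholder.

```python
import functools
import warnings
from datetime import date


def deprecated_alias(new_func, old_name, not_before):
    """Hypothetical helper: build an alias that warns, then forwards to new_func."""

    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name}() is deprecated; use {new_func.__name__}() instead. "
            f"Obsoletion will occur no sooner than {not_before.isoformat()}.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller, not this wrapper
        )
        return new_func(*args, **kwargs)

    # functools.wraps copied the new name; restore the old one for the alias.
    wrapper.__name__ = old_name
    wrapper.__qualname__ = old_name
    return wrapper


def search_collections(**query):
    # Stand-in stub for the proposed new name; the real function would query CMR.
    return [{"kind": "collection", "query": query}]


# Old name kept as a warning alias. The date is a placeholder, not a real plan.
search_datasets = deprecated_alias(
    search_collections, "search_datasets", not_before=date(2099, 1, 1)
)
```

With something like this, the old name keeps working through the deprecation period, and the message itself tells users the earliest date it could go away.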
Related #766
I like the alignment of `earthaccess` terminology with STAC. collections already aligns in STAC and NASA-speak. However, as a newbie to STAC lingo, I find the usage of items unclear.
> I have no strong preference on search v find, but I am curious what reasoning underlies your preference @chuckwondo?
>
> Aside from it being 2 letters shorter, in previous contexts, I've often seen DB client APIs using `find` (or `find_by_X`) as a naming convention, so it is anecdotally arguably more consistent with other things. However, `search` may be equally widely used, so again, I don't have any overtly strong preference. It's just a mild preference, perhaps more personal than logical.
@mfisher87, after looking at the proposed new names again -- `search_collections` and `search_items` -- I now have an arguably better reason for preferring `find_collections` and `find_items`: The term `search_collections` is arguably ambiguous in terms of the types of "things" it will find. Does it search the available collections to find things within collections, or does it search for collections?
This analogy might be a bit of a stretch, but consider the case of security procedures at a place/event, where people may be subject to a "bag search." In that context, nobody is searching for bags, they are searching within bags (for banned "items"). Thus, the security folks running a `search_bags` function don't expect the result to be a "list of bags," but rather a list of "banned items" within given bags.
Thus, I would argue that a "collection search" implemented by a function named `search_collections` could easily be misinterpreted to mean a search for items within collections, not a search for collections, or to simply cause someone to wonder which interpretation is correct, if they recognize the ambiguity. Thus, the name `find_collections` arguably eliminates such ambiguity by explicitly stating what we expect to find: collections. (Similarly for `find_items`.)
> I like the alignment of `earthaccess` terminology with STAC. collections already aligns in STAC and NASA-speak. However, as a newbie to STAC lingo, I find the usage of items unclear.
@andypbarrett, I agree that "items" is perhaps too generic for many folks. Although "collections" is perhaps no less generic a term, anecdotally, it may feel more specific to most folks dealing with this information. I don't have any particular preference or suggestion for a better term than "items," but if you have any suggestions, please share so we can "vote" on it here.
> Thus, I would argue that a "collection search" implemented by a function named `search_collections` could easily be misinterpreted to mean a search for items within collections, not a search for collections, or to simply cause someone to wonder which interpretation is correct, if they recognize the ambiguity. Thus, the name `find_collections` arguably eliminates such ambiguity by explicitly stating what we expect to find: collections. (Similarly for `find_items`.)
:100: This is an excellent point. I'm on team find now :)
~ @mfisher87 in #769
Since "granules" is not very generic either, an option borrowed from the STAC spec could be "search_collections" vs "search_items".
Seems like a Milestone 1.0 change though ...