stac-utils / stac-fastapi

STAC API implementation with FastAPI.
https://stac-utils.github.io/stac-fastapi/
MIT License

How to tell if search is for single datetime? (Should `str_to_interval` return a single value for that?) #688

Closed avbentem closed 1 month ago

avbentem commented 2 months ago

I'd guess this is a bug, but maybe not, so a question first:

The STAC API - Item Search specs define:

datetime

Single date+time, or a range ('/' separator), formatted to RFC 3339, section 5.6. Use double dots .. for open date ranges.

If only a single date+time is provided, I'd guess this implies one searches for an exact match?

Indeed, DateTimeType allows for a single value too:

DateTimeType = Union[
    datetime,
    Tuple[datetime, datetime],
    Tuple[datetime, None],
    Tuple[None, datetime],
]

All good so far. However, today's str_to_interval always returns a tuple. And that function is used as the converter for the search request:

class BaseSearchGetRequest(APIRequest):
    ...
    datetime: Optional[DateTimeType] = attr.ib(default=None, converter=str_to_interval)
    ...

So when implementing get_search how would one know if a single date+time was requested, or a range with an open end?

The tests only check intervals, so they don't give me a definitive answer. But I'm guessing this is an oversight?

(Of course I can provide a PR if this is indeed a bug.)

jonhealy1 commented 2 months ago

According to this, searching with a single datetime should be allowed.

[Image: screenshot "Screen Shot 2024-05-08 at 4 34 58 PM"]

avbentem commented 2 months ago

Yes, a single value is allowed, and then I assume that implies one wants to search for an exact match, right? Then I will create a PR to make str_to_interval return a single datetime, not a tuple, for that use case.

On a related note, I noticed that a POST to /search gets me the original string value in datetime, plus additional values for start_date and end_date. When supplying a single value, start_date ends up being None while the single value goes into end_date. I very much assume that is wrong too, and I'm tempted to make it work the same as for GET.

The POST payload I get from STAC Browser is {"datetime": "<single date or range>"}. In the search_request argument in post_search today this gets me:

datetime                   start_date          end_date
2012-07-02T00:00:00Z       None                2012-07-02 00:00
2012-07-02T00:00:00Z/..    2012-07-02 00:00    None
../2012-07-03T23:59:59Z    None                2012-07-03 00:00

So, what if a PR would include:

avbentem commented 2 months ago

Oh, I now see that the datetime, start_date and end_date of the POST search originates from stac-pydantic (the latter two being calculated properties).

Guess that makes changing POST a bit tedious. Even more: the behavior for a single value (start_date being None, and end_date getting the single value) is quite explicit there:

@property
def start_date(self) -> Optional[dt]:
    values = (self.datetime or "").split("/")
    if len(values) == 1:
        return None
    if values[0] == ".." or values[0] == "":
        return None
    return parse_rfc3339(values[0])

@property
def end_date(self) -> Optional[dt]:
    values = (self.datetime or "").split("/")
    if len(values) == 1:
        return parse_rfc3339(values[0])
    if values[1] == ".." or values[1] == "":
        return None
    return parse_rfc3339(values[1])
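
To make the effect of those two properties easier to see, here is a minimal standalone sketch that mirrors their logic for a single string. The helper name `start_and_end` is made up for illustration, and the `parse_rfc3339` below is only a stand-in that handles the "Z"-suffixed values used in this thread, not the real stac-pydantic parser.

```python
from datetime import datetime
from typing import Optional, Tuple


def parse_rfc3339(value: str) -> datetime:
    # Stand-in parser (assumption): only covers "...Z" / "+00:00" style values.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def start_and_end(raw: str) -> Tuple[Optional[datetime], Optional[datetime]]:
    """Hypothetical helper mirroring the start_date/end_date logic quoted above."""
    values = raw.split("/")
    if len(values) == 1:
        # A single value is treated as the END of the range.
        return None, parse_rfc3339(values[0])
    start = None if values[0] in ("..", "") else parse_rfc3339(values[0])
    end = None if values[1] in ("..", "") else parse_rfc3339(values[1])
    return start, end


for raw in (
    "2012-07-02T00:00:00Z",
    "2012-07-02T00:00:00Z/..",
    "../2012-07-03T23:59:59Z",
):
    print(raw, start_and_end(raw))
```

With this logic, a single value and a range of the form "../<value>" produce the same (None, end) pair, so code that only looks at the two properties cannot tell them apart.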

jonhealy1 commented 2 months ago

It can be hard to get an exact match on a datetime I guess.

jonhealy1 commented 2 months ago

So basically querying a single datetime gives you everything before that datetime? I wonder if this is documented in the spec anywhere?

avbentem commented 2 months ago

So basically querying a single datetime gives you everything before that datetime?

For POST: yes, if the user's implementation of the abstract post_search relies on those two properties.

For GET, the user's implementation of the abstract get_search is likely the other way around: it gets the tuple, sees that the first item has a value and the second is None, and treats that as "everything from that datetime onwards". For GET, there is currently no way to tell a single value "2012-07-02T01:30:00Z" apart from an open-ended range "2012-07-02T01:30:00Z/..".

For future readers, I'm doing this now in my own code for POST:

def post_search(
    self, search_request: BaseSearchPostRequest, **kwargs
) -> ItemCollection:
    # A `{"datetime": "2012-07-02T01:30:00Z/2012-07-02T13:30:00Z"}`
    # will make stac-pydantic leave `datetime` at its original
    # string value, and add a `start_date` and `end_date` to
    # `search_request`. For `{"datetime": "2012-07-02T01:30:00Z"}`
    # the value goes into `end_date`, leaving `start_date` at None.
    # So, we need to parse the string value ourselves here.
    # See https://github.com/stac-utils/stac-fastapi/issues/688
    if search_request.datetime:
        search_request.datetime = str_to_interval(search_request.datetime)

    return self.get_search(**search_request.model_dump())

That will work for me, if I fix str_to_interval to return a single value if it's given a single date.
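
Roughly, the behavior I have in mind looks like the sketch below. This is not the current str_to_interval and not necessarily what a PR would end up with; `parse_datetime_param` and `_parse_end` are hypothetical names, parsing relies on the standard library's `datetime.fromisoformat` (so only common RFC 3339 forms are covered), and rejecting a range that is open on both ends is a choice of this sketch.

```python
from datetime import datetime
from typing import Optional, Tuple, Union

DateTimeType = Union[
    datetime,
    Tuple[datetime, datetime],
    Tuple[datetime, None],
    Tuple[None, datetime],
]


def _parse_end(value: str) -> Optional[datetime]:
    # An open end of a range ("..", or an empty string) becomes None.
    if value in ("..", ""):
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_datetime_param(raw: Optional[str]) -> Optional[DateTimeType]:
    """Hypothetical converter: single value -> datetime, range -> tuple."""
    if not raw:
        return None
    parts = raw.split("/")
    if len(parts) == 1:
        # Single date+time: return it as-is, NOT wrapped in a tuple.
        return datetime.fromisoformat(parts[0].replace("Z", "+00:00"))
    if len(parts) != 2:
        raise ValueError(f"Too many '/' separators in datetime: {raw!r}")
    start, end = _parse_end(parts[0]), _parse_end(parts[1])
    if start is None and end is None:
        # Assumption of this sketch: a range needs at least one bound.
        raise ValueError(f"Range is open on both ends: {raw!r}")
    return start, end
```

A get_search implementation could then check `isinstance(value, datetime)` to detect the exact-match case, and unpack the tuple otherwise.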

I'll create another issue in stac-pydantic to discuss its handling of datetime.

vincentsarago commented 1 month ago

Looking at https://docs.ogc.org/is/17-069r4/17-069r4.html#_parameter_datetime it seems that if datetime is only one value, it should only match the items with the identical value.
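
As a rough illustration of that reading, a backend's filter could distinguish the two cases as in the sketch below. The function name `matches` and the `Interval` alias are made up here, and treating range bounds as inclusive is an assumption of the sketch rather than something settled in this thread.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple, Union

Interval = Tuple[Optional[datetime], Optional[datetime]]


def matches(item_dt: datetime, query: Union[datetime, Interval]) -> bool:
    """Hypothetical filter: exact match for a single value, bounds check for a range."""
    if isinstance(query, datetime):
        # Single date+time: only an identical value matches.
        return item_dt == query
    start, end = query
    if start is not None and item_dt < start:
        return False
    if end is not None and item_dt > end:
        return False
    return True


t = datetime(2012, 7, 2, 1, 30, tzinfo=timezone.utc)
later = datetime(2012, 7, 3, 1, 30, tzinfo=timezone.utc)

print(matches(later, t))          # single value: exact match only
print(matches(later, (t, None)))  # open-ended range starting at t
```

Here the item dated one day after `t` is rejected by the single-value query but accepted by the open-ended range, which is exactly the distinction that gets lost when both are represented as the same tuple.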