openactive-archive / opportunity-api

Repository for the Opportunity API specification
0 stars 0 forks source link

Filtering #1

Open nickevansuk opened 6 years ago

nickevansuk commented 6 years ago

Requirements

To allow for collections of resources to be filtered within the API, a standard approach to filtering/querying collections should be used.

The approach must be compatible with the PropertyValueSpecification for query string properties.

The approach should cover:

1) The following for numeric and date types:

2) The following for enum types and controlled vocabularies:

3) The following regarding query structure:

4) Boolean values as primitive types, including null values (for undefined properties):

5) Geo search

For example, for both of the endpoints below:

References

Discounted options

A good discussion of available options is available here: https://www.moesif.com/blog/technical/api-design/REST-API-Design-Filtering-Sorting-and-Pagination/

Proposal with Examples

Of all the options investigated, the OpenStack approach appears to be the most user-friendly, as it strikes the best balance between extensibility and familiarity.

The following prefixed operators are allowed: in, nin, neq, gt, gte, lt, and lte e.g. ?size=gt:8. A comma separated list as an operand is synonymous with "in".

lukehesluke commented 6 years ago

@nickevansuk this looks interesting and very powerful

In a much simpler query parameter specification without a syntax like this, every single value would require a new separate query parameter e.g. startDate__gt=2018-... and startDate__lt=2018-... rather than startDate=lt:2018-... and startDate=gt:2018-...

Cons to the much simpler approach:

Pros:


My main concern with the proposal is that it would be difficult to spec comprehensively with widely available tools like Swagger ("type": "string" doesn't really capture that the string can be "2", "gt:2", "in:2,3", etc). It would also be more effort to implement than something simpler and I suspect would leave developers or project managers spending time on deciding if "remainingAttendeeCapacity" needs the "in" operator or not 😝


To me, the qualities of a "simple" spec (simple to spec, simple to implement, simple to test) are:

nickevansuk commented 6 years ago

Totally see where you're coming from @lukehesluke - in the interests of following at least one of the existing conventions, could that be expressed with square brackets instead (e.g startDate[gt]=2018-07-17T13:00:00Z) as per https://www.moesif.com/blog/technical/api-design/REST-API-Design-Filtering-Sorting-and-Pagination/ ?

lukehesluke commented 6 years ago

@nickevansuk sure I'm in favour of that

Some examples to further flesh this out:

lukehesluke commented 6 years ago

Actually should startDate be subEvent.startDate or subEvent[0].startDate?

lukehesluke commented 6 years ago

The square brackets does get in the way of indexing arrays by number

lukehesluke commented 6 years ago

subEvent.0.startDate

subEvent.-1.startDate

nickevansuk commented 6 years ago

This all looks excellent! And agree on #2, for simplicity of having one param (easier to reason about in terms of dev UX too)

One final thing we probably should check, how do different frameworks/languages interpret the square brackets? E.g. what does PHP do with it (i seem to recall it contructs foo[]=1&foo[]=2 into an array).

Assuming that there's no likely implementation issues, then lgtm

nickevansuk commented 6 years ago

sorry just saw your other posts: good point, i guess we're assuming subEvent.startDate is just returning those subEvents that match the startDate (and those Events that contain such subEvents)?

Not sure if we'd need to subEvent[0].startDate, as the ordering of items in an array in the spec is currently not meaningful/important (except for offers, but that's only de-facto not in the spec)?

Could subEvent[].startDate, or just assume it's never going to be a requirement and instead subEvent.startDate

Even with this logic it will be hard to represent "sort by junior price", as that's a property of one of the orders.prices? That's probably niche enough in both implementation and requirement to have its own operator though?

lukehesluke commented 6 years ago

@nickevansuk another remaining question is how we specify items in an array like for subEvent.startDate

lukehesluke commented 6 years ago

Ordering of items in subEvent is not important? I suppose I can see that for where subEvents can't strictly be sorted in time order (e.g. overlaps, multiple subEvents at the same time but with other differences)

lukehesluke commented 6 years ago

Maybe a special operator for this use case like:

As we're trying to reduce the amount of ambiguity / guesswork

See next comment for a better idea

lukehesluke commented 6 years ago

Considering settling upon:

Where the explicit assumption is that, for arrays, the filter passes if any of the items in the array pass the filter

So in the first above case, this filter passes if ANY of the subEvent items have startDate greater than or equal to 2018-... (some datetime)

lukehesluke commented 6 years ago

@nickevansuk a conundrum:

startDate[time-gte]=23:30Z

During British Summer Time, 23:30Z is equivalent to 00:30 (BST). If the consumer of the opportunity API is a British app, making an API call during BST, they are likely filtering for sessions which, on any given day, start at or later than half past midnight local time (i.e. 00:30:00+01:00 -> 23:59:59+01:00). However, it is not clear to the API whether they mean 23:30:00Z -> 23:59:59Z or 23:30:00Z -> 22:59:59Z without knowing the consumer's understanding of local time

Given that this query parameter is going to be used to determine things like "get me all sessions in the morning" or "get me all sessions that start after I leave work", it would seem to me that this query parameter serves no value being timezone-aware and should instead expect a local time

This could be made a little more explicit by renaming the operator to:

startDate[local-time-gte]=00:30
lukehesluke commented 6 years ago

One option is having the consumer always specify the end time

This can be enforced by putting the from and to in the same operator e.g.

startDate[time-range]=23:30Z,23:00Z

The range will now have to be cyclical in order to support non-UTC timezones. The above range would give 23.5 hours

lukehesluke commented 6 years ago

You know what - this still doesn't work 😨

This problem is avoided only if time preferences are specified in local time.

This could either look like:

1.

    subEvent.startDate[local-time-gte]=06:00
    &subEvent.startDate[local-time-lte]=09:00

2.

    subEvent.startDate[time-gte]=06:00,Europe/London
    &subEvent.startDate[time-lte]=09:00,Europe/London

My personal preference is with option 1

civsiv commented 6 years ago

Number 1 for me too, timezone names are ambiguous and so best to just remove that ambiguity

nickevansuk commented 6 years ago

How does one know what local time is from an API call?

lukehesluke commented 6 years ago

@nickevansuk when you say "one", I'm going to assume you're referring to the API consumer.

Local time is a property of a place. Any sessions located in Great Britain will have local time Europe/London

nickevansuk commented 6 years ago

Sorry what I mean is subEvent.startDate[local-time-gte]=06:00 - what is 06:00? What is "local-time"? If feeds are presenting data in UTC (the current recommendation), how are we determining what 06:00 is in UTC for the query?

nickevansuk commented 6 years ago

Sorry I now understand, subEvent.startDate[local-time-gte]=06:00 is always considering the startDate in the local time relative to that event. Hence if you searched for an event in the Japan it would return events that are 6pm in Japan, as if you book a holiday to Japan and you are searching for a Yoga class from London, you would still expect results to appear in Japan local time for when you land there.

TLDR;LGTM :)