openactive / modelling-opportunity-data

OpenActive Modelling Opportunity Data specification
https://www.openactive.io/modelling-opportunity-data/

Discussion: Impact of changing updates to multi-use facilities on RPDE feeds #73

Open ldodds opened 6 years ago

ldodds commented 6 years ago

Whilst discussing the proposal for #62, the community group identified a potential issue with our proposed approach for describing use of Facilities.

Discussion of this issue can be seen in the comments on #62 and in this video of a community group meeting.

The issue can be summarised as follows:

We decided that the data model being proposed covers the simple cases, e.g. single-use (or primarily single-use) facilities.

We also agreed to proceed with this approach and use implementation feedback to guide whether we might want to change the model or recommend an alternative approach.

Alternative approaches might include:

This issue is intended to capture feedback from both publishers and consumers so we can discuss this further when we have real implementation experience.

projectmaestro commented 5 years ago

We (Makesweat) have implemented Facility Booking for Badminton England at their National Badminton Centre. Makesweat provides the CRM, booking flow & payment for these courts and we have found that facility management is a step increase in complexity above any of our 'event' booking flows. We can see why most provider CRMs & booking systems don't go anywhere near facility booking!

The concerns about data explosion are valid, especially as the 2.0 standard treats each booking as discrete. For example, Court 1 might be free all day; to adequately provide options to the customer, you'd have to publish Court 1 9am - 10am, then every possible end time up to 5pm, then Court 1 10am - 11am through to 10am - 5pm, and so on. Longer durations will also straddle pricing zones such as PEAK and OFF-PEAK, with different prices for different classes of customer (members vs non-members, etc.). That is a massive amount of data for a single court if you used the 2.0 spec. It will be highly data intensive and inefficient for consumers to crawl our feed almost continuously on the off-chance that someone has just made a booking and changed the facility outlook. Our servers would very quickly throttle or ban a consumer requesting this kind of volume.
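To put a rough number on the Court 1 example, here is a minimal sketch that enumerates every contiguous booking window. It assumes whole-hour boundaries, a 9am to 5pm day and two customer classes; these values and the `booking_windows` helper are illustrative assumptions, not Makesweat's actual configuration.

```python
from itertools import combinations


def booking_windows(open_hour, close_hour):
    """Return every contiguous (start, end) window on whole-hour boundaries."""
    boundaries = range(open_hour, close_hour + 1)
    # Each pair of distinct boundaries (start < end) is one bookable window.
    return list(combinations(boundaries, 2))


# Assumed opening hours: 9am to 5pm (17:00).
windows = booking_windows(9, 17)
print(len(windows))  # 36 windows for one court on one day

# Assumed customer classes, each potentially priced differently.
customer_classes = ["member", "non-member"]
print(len(windows) * len(customer_classes))  # 72 priced options
```

With n one-hour steps in the day there are n(n+1)/2 windows, so the count grows quadratically as the day lengthens or the booking granularity gets finer.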

Based on the above, we would be more comfortable supporting live updates of a facility than publishing a large set of booking scenarios for each facility, although live updates have their own challenges as they are inherently hard to scale.

We're happy to work with OA and Badminton England (and others) to expand the number of facilities using Makesweat for management. Once we have a critical mass of facilities we'll be pleased to contribute real-life data to OA as well as propose significant changes to the facility spec.

nickevansuk commented 5 years ago

Thanks for your feedback! We have certainly grappled with this concern for a while. In practice, however, the feed-based approach has turned out to be more efficient than any query-based approach (and, as you point out, the alternatives are inherently hard to scale anyway).

The main advantage of the feed-based approach, which is worth emphasising, is that it can be backed by a CDN, so the volume of requests inbound to a server is tied to the number of slots and the frequency of those slots being updated rather than to the number of data consumers.

By way of example: Fusion Lifestyle's 101 sites currently generate 67444 Slot records over a 14-day look-ahead window. Of these, 9483 were updated yesterday. With a page size of 500, that's 19 pages of updated data in a day (though of course there will be more due to more frequent updates).
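The arithmetic behind the 19-page figure can be checked with a short calculation. The three input numbers come from the example above; the variable names are illustrative.

```python
import math

total_slots = 67444        # Slot records in the 14-day look-ahead window
updated_yesterday = 9483   # records modified in one day
page_size = 500            # items per feed page

# A partially filled final page still counts as a page, so round up.
pages = math.ceil(updated_yesterday / page_size)
print(pages)  # 19

# Share of all Slot records that changed in that day.
share_updated = updated_yesterday / total_slots
print(f"{share_updated:.1%}")  # 14.1%
```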

Given that these pages can be served over a CDN, the volume of requests served by the booking system itself is very small compared with direct API availability queries. More importantly, the volume of requests can be controlled by the booking system irrespective of the number of data consumers.
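The point about request volume being independent of the number of consumers can be illustrated with a deliberately simplified model: a single shared cache with a fixed time-to-live in front of one feed page URL. The `origin_requests` function, the 60-second TTL and the 10-second polling interval are all assumptions for illustration; real CDNs have multiple edge nodes and richer cache rules.

```python
def origin_requests(consumers, polls_per_consumer, poll_interval, ttl):
    """Count requests that reach the origin when every consumer polls the
    same page URL through one shared cache with a fixed TTL (in seconds)."""
    cached_until = -1  # nothing cached yet
    origin_hits = 0
    for poll in range(polls_per_consumer):
        now = poll * poll_interval
        for _ in range(consumers):
            if now >= cached_until:
                # Cache miss: the origin serves the page and the cache
                # keeps it for the next `ttl` seconds.
                origin_hits += 1
                cached_until = now + ttl
            # Otherwise the cache answers and the origin sees nothing.
    return origin_hits


# One hour of polling every 10 seconds, with an assumed 60-second TTL.
one_consumer = origin_requests(1, 360, 10, 60)
hundred_consumers = origin_requests(100, 360, 10, 60)
print(one_consumer, hundred_consumers)  # 60 60
print(100 * 360)  # 36000 requests if every poll reached the origin
```

In this model the origin load is set by the TTL the publisher chooses, not by how many consumers are polling.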

Responding to two of your comments specifically:

"It will be highly data intensive and inefficient for consumers to crawl our feed almost continuously on the off-chance that someone has just made a booking and changed the facility outlook."

The feeds are ordered by modified date, so consumers only receive data that has changed since their last request. Although they would crawl continuously (via the CDN), every record they receive will be a useful update.
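As a rough illustration of why a consumer only receives changed records, here is a minimal in-memory sketch of a feed ordered by modified date and then id. The `get_page` and `crawl` helpers and the record shape are hypothetical; a real RPDE feed is served over HTTP and returns a `next` URL, which the `(modified, id)` cursor tuple stands in for here.

```python
def get_page(records, after_timestamp=0, after_id="", limit=500):
    """Return up to `limit` records strictly after the (modified, id) cursor."""
    ordered = sorted(records, key=lambda r: (r["modified"], r["id"]))
    items = [r for r in ordered
             if (r["modified"], r["id"]) > (after_timestamp, after_id)][:limit]
    if items:
        cursor = (items[-1]["modified"], items[-1]["id"])
    else:
        cursor = (after_timestamp, after_id)  # end of feed: cursor unchanged
    return {"items": items, "next": cursor}


def crawl(records, cursor=(0, ""), limit=500):
    """Follow `next` cursors until an empty page; return items and cursor."""
    received = []
    while True:
        page = get_page(records, cursor[0], cursor[1], limit)
        if not page["items"]:
            return received, page["next"]
        received.extend(page["items"])
        cursor = page["next"]


slots = [
    {"id": "court1-0900", "modified": 1},
    {"id": "court1-1000", "modified": 2},
    {"id": "court1-1100", "modified": 3},
]
first_pass, cursor = crawl(slots, limit=2)
print(len(first_pass))  # 3: the initial crawl sees everything

slots[1]["modified"] = 4  # someone books the 10am slot
second_pass, cursor = crawl(slots, cursor, limit=2)
print([s["id"] for s in second_pass])  # ['court1-1000']
```

Once the consumer has caught up, each further poll returns either nothing or only the records modified since its stored cursor.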

"Our servers would very quickly throttle or ban a consumer requesting this kind of volume."

See details on scaling feeds for how to ensure you can control your inbound request volume to prevent this from happening.

Hope this helps!