openmobilityfoundation / curb-data-specification

A data specification to help cities manage their curb zone programs and surrounding areas, and measure the utilization and impact.
https://www.openmobilityfoundation.org/about-cds/

Publishing events via a "push" vs "pull" mechanism #99

Open michael-danko-passport opened 2 years ago

michael-danko-passport commented 2 years ago

Is your feature request related to a problem? Please describe.

Currently, the CDS Events API defines a queryable API for searching events that have occurred. Because this API is a GET request, consumers must parse the returned data, compare each event's event_id against those already ingested, and discard any events that have already been received.

With a high volume of events occurring at the curb within a city, iterating through the array of events to identify which ones are new and need to be ingested could become computationally intensive.
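To make the cost concrete, here is a minimal sketch of the deduplication a pull consumer has to run on every poll. Only event_id comes from the spec; the function name, the in-memory set of seen IDs, and the sample events are assumptions for illustration, and a real consumer would likely persist the seen IDs somewhere durable.

```python
from typing import Iterable


def new_events(polled: Iterable[dict], seen_ids: set) -> list:
    """Return only the events whose event_id has not been ingested yet.

    `seen_ids` is a hypothetical in-memory record of ingested IDs and is
    updated in place, so it carries over from one poll to the next.
    """
    fresh = []
    for event in polled:
        event_id = event["event_id"]
        if event_id in seen_ids:
            continue  # already ingested on an earlier poll (or earlier in this one)
        seen_ids.add(event_id)
        fresh.append(event)
    return fresh


# Two overlapping polls: the second one repeats event "b".
first_poll = [{"event_id": "a"}, {"event_id": "b"}]
second_poll = [{"event_id": "b"}, {"event_id": "c"}]

seen = set()
print([e["event_id"] for e in new_events(first_poll, seen)])   # ['a', 'b']
print([e["event_id"] for e in new_events(second_poll, seen)])  # ['c']
```

Every polled event is compared against the full set of known IDs, so the work grows with both the polling frequency and the size of the overlap between polls.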

Describe the solution you'd like

A method to receive the data defined in our events as a "push", where the source system that creates the event initiates the integration and sends the data to a receiving API endpoint. Events could still be provided in batches to support any preprocessing of data.
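As a rough sketch of the source-system side of this idea, the snippet below groups events into batches and hands each batch to a send function. Nothing here is defined by CDS: the `{"events": [...]}` envelope, the batch size, and the injected `send` callable (which would wrap an HTTP POST to the receiving endpoint in practice) are all assumptions for illustration.

```python
import json
from typing import Callable, Iterator


def batches(events: list, size: int) -> Iterator[list]:
    """Yield consecutive slices of `events` with at most `size` items each."""
    for start in range(0, len(events), size):
        yield events[start:start + size]


def push_events(events: list, send: Callable[[bytes], None], batch_size: int = 100) -> int:
    """Serialize events in batches and pass each payload to `send`.

    `send` is a hypothetical transport hook; the envelope shape is an
    assumption, not part of the specification.
    Returns the number of batches sent.
    """
    sent = 0
    for batch in batches(events, batch_size):
        send(json.dumps({"events": batch}).encode("utf-8"))
        sent += 1
    return sent


# Stand-in for the network: collect payloads in a list instead of POSTing them.
outbox = []
count = push_events([{"event_id": str(i)} for i in range(5)], outbox.append, batch_size=2)
print(count)  # 3
```

Injecting `send` keeps the batching logic independent of the transport, so the same code could sit in front of a webhook call, a queue, or a test double.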

Is this a breaking change

This would not be a breaking change since it is a net-new addition to the specification.

Impacted Spec

For which spec is this feature being requested?

Describe alternatives you've considered

I believe the only alternative is the existing events API endpoint, which gives a consuming system a way to query the data. In my experience, this can be difficult to scale: consuming applications will consistently want the latest data, and decreasing the time between queries increases the load placed on the source system.

michael-danko-passport commented 2 years ago

@schnuerle Here is the issue that I opened following our discussion in today's WGSC. Feel free to add any additional notes with context you can share.

schnuerle commented 2 years ago

MDS has always had completely different APIs for push and pull: endpoints in Provider for pull and in Agency for push. They've always been separate, with different endpoints and fields, but over time they have converged and aligned more. In MDS 2.0 there is a push to either align them totally (same endpoints and fields, with the only difference being push/pull calls) or make them nearly identical (same fields and data, but keeping them in Provider and Agency).

See details of this work here. The two specs could learn from each other on how to implement this. https://github.com/openmobilityfoundation/mobility-data-specification/issues/759

michael-danko-passport commented 2 years ago

This was discussed in today's CDS WG meeting (7/26). It was highlighted that there could be advantages to supporting both a push-based and a pull-based API within the spec. A push mechanism could better support near-real-time data transfers, while a pull mechanism could help with less frequent reconciliation of data and address cases where a push-based system intermittently fails to deliver events.

In both formats, the schemas for what is sent should be aligned to help with the interoperability of both formats.
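A minimal sketch of how the two mechanisms might fit together, assuming both carry the same event shape: pushed events are stored as they arrive, and an occasional pull is used to recover anything the push path dropped. The `parse_event` and `reconcile` helpers and the in-memory store are hypothetical; only event_id is taken from the spec.

```python
def parse_event(raw: dict) -> dict:
    """Shared validation for both paths, possible because the schemas match."""
    if "event_id" not in raw:
        raise ValueError("event is missing event_id")
    return raw


def reconcile(store: dict, pulled: list) -> list:
    """Add pulled events that never arrived via push; return their IDs."""
    recovered = []
    for raw in pulled:
        event = parse_event(raw)
        if event["event_id"] not in store:
            store[event["event_id"]] = event
            recovered.append(event["event_id"])
    return recovered


# Push path: events e1 and e3 are delivered, e2 is lost in transit.
store = {}
for raw in [{"event_id": "e1"}, {"event_id": "e3"}]:
    event = parse_event(raw)
    store[event["event_id"]] = event

# Pull path: a later query returns the full set for the same period.
missing = reconcile(store, [{"event_id": "e1"}, {"event_id": "e2"}, {"event_id": "e3"}])
print(missing)  # ['e2']
```

Because both paths go through the same parser, the consumer needs only one event model, which is the practical benefit of keeping the schemas aligned.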

There were also discussions that referenced the Curb Architectural Decisions document on this subject. https://github.com/openmobilityfoundation/curb-data-specification/wiki/Curb-Architectural-Decisions#push-vs-pull

schnuerle commented 2 years ago

I would like to explore this and see how we could keep the same data structures, but have a push and a pull option. We are diving into this in MDS this week, so maybe we can take some learnings and decisions from there to use in CDS.