Closed whereissean closed 3 years ago
Charles Noling, please spell out here the use cases you had in mind for the API during the city-services working group call.
@whereissean , it would be great if you could be there during the next city-services working group session on May 14th since we plan to discuss the metrics API in detail.
@Retzoh absolutely will be there. Apologies that I couldn't join today.
I wanted to add some examples of what cities are asking for in their reporting requirements from providers. Some of it could be derived from MDS (if we can agree on how to calculate things) and some is outside of MDS (and should be).
Here is Louisville, KY's example from their dockless policy (page 16):
The operator shall provide a monthly report by the end of the first full week of the following month that is in a format acceptable to Metro that includes, but is not be limited to, the following:
- [* trips] Total number of rides for the previous month and total miles ridden.
- [* status_changes] Total number of vehicles in service for the previous month.
- [* trips] Number of rides per vehicle per day.
- [* status_changes/trips] Location and performance of all preferred and designated parking areas.
- [* status_changes] Number of vehicles removed from service
- Operator staffing levels
- Customer Service Cases, including complaints registered
- Vandalism Incidents
- Crash reports (to include injury/fatalities)
- If available to the Operator, an aggregated breakdown of customers by gender and age monthly. Gender must be reported as male, female, and non‐binary. Age must be reported using these eight age groups: under 5, 5‐17, 18‐24, 25‐34, 35‐44, 45‐54, 55‐64, 65 and over.
Items with a [* api] (I added) can be derived from the MDS feed but the methodology is not always agreed upon. Other items cannot be derived from MDS.
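To make the "[* trips]" items concrete, here is a minimal sketch of one way the ride totals could be derived from trip records. It assumes each trip is a dict with `device_id`, `start_time` (milliseconds since epoch) and `trip_distance` (meters), loosely following the MDS trips fields. The function name `monthly_trip_summary`, the choice to attribute a trip to the UTC day it started on, and the mile conversion are assumptions for illustration, not an agreed methodology.

```python
from collections import defaultdict
from datetime import datetime, timezone

METERS_PER_MILE = 1609.344


def monthly_trip_summary(trips):
    """Summarize a month of trip records (hypothetical helper).

    Assumes each trip has `device_id`, `start_time` (ms since epoch)
    and `trip_distance` (meters). A trip is attributed to the UTC day
    on which it started, which is only one of several possible choices.
    """
    total_meters = sum(t["trip_distance"] for t in trips)
    rides = defaultdict(int)
    for t in trips:
        day = datetime.fromtimestamp(t["start_time"] / 1000, tz=timezone.utc).date()
        rides[(t["device_id"], day)] += 1
    return {
        "total_rides": len(trips),
        "total_miles": total_meters / METERS_PER_MILE,
        "rides_per_vehicle_per_day": dict(rides),
    }
```

A city and a provider would only get matching numbers if they agree on the same choices here (time zone, how to treat trips that span midnight, which distance field to trust), which is the gap a shared methodology is meant to close.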
Interesting ones here that could be part of Metrics and are not in MDS are:
It would be good to collect other examples of this from current city policy documents.
Here are Santa Monica's examples from the Shared Mobility Device Pilot Program Administrative Regulations pages 14-15 (last updated April 2019):
3.16.2 Reporting
Operators must provide accurate weekly summaries to the City describing customer and staff incidents, injuries, system operation, system use, reported complaints, customer service responses, system maintenance, and education and outreach efforts. Reports will be provided to the City in the format defined by the City.
A monthly dynamic cap report must be submitted to the City on the second business day of each month following the program launch to allow the City to assess and potentially adjust fleet deployment quantities.
...
3.16.3 System Reports
Anonymized data reports to the City are required weekly for the following municipal-level data:
(a) Total users in system by month
(b) Trip number by day, week and month
(c) Detailed, aggregate trip origin/destination information
(d) Trip length and time
(e) Hourly fleet utilization with trip origin or destination in Santa Monica and within the Downtown area
(f) Hourly device quantities within Santa Monica and within the Downtown area
The Mobility Data Collaborative recently published their Data Sharing Glossary and Metrics document (referenced above), which OMF reviewed and contributed to; it should be utilized here.
DC requires 7 additional monthly reports within 10 days of the end of the month. I've included the overarching concepts below but for specific fields please see the document attached. 2020.2.24 Attatchment C 2020 Dockless Permit Reporting V2 .pdf
These include:
- Aggregated user data
- Aggregated vehicle data
- Summary report
- Customer Service report (interactions with customers)
- Customer summary report (low-income customer plan ridership)
- Staging areas
- Unmet needs identifying the first location that a user opened the application when searching for a vehicle and did not unlock a vehicle by census block.
Subjects discussed during 2020-05-14 city-services working group call:
Presentation by @whereissean: https://docs.google.com/presentation/d/1bg36oyQhZlBCQb07JCUFyVe97WsCAeRCDeXjaFMXYM8/edit?usp=sharing
Presentations of reports by @dirkdk (Spin): https://docs.google.com/document/d/1qZvmJzoWrnOVZeaubqxOLNVYKzueWQEaU3C7H3kQ1dw/edit?pli=1#
Short-term actions:
@whereissean: add a section about data sensitivity, authentication and reservations against open data.
@whereissean: spell out the compatibility with MDC metrics (@jfh01 as official contact person for MDC).
Use cases:
Technical issues:
For discussion around how to request the time period.
Should it be interval counts w/ a start date and no end date, or start and end date with interval length only? The first is more machine readable and the latter is more human readable and something a data analyst would use.
I think start and end dates are more consistent with MDS and other kinds of APIs where you request data over a time range. And you can specify the interval (minute, hour, day, week, month) over that time range and those values would be returned in an array.
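As a rough illustration of the "start and end date plus interval" option, the sketch below buckets event timestamps into fixed intervals over a requested range and returns the counts as an array. The names `INTERVALS`, `interval_starts` and `count_per_interval` are hypothetical and not part of any proposed spec; the range is treated as half-open (start inclusive, end exclusive), and `month` is left out because months are not a fixed length.

```python
from datetime import datetime, timedelta, timezone

# Fixed-length intervals only; "month" would need calendar-aware logic.
INTERVALS = {
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
}


def interval_starts(start, end, interval):
    """Return the start time of every interval in [start, end)."""
    step = INTERVALS[interval]
    starts = []
    t = start
    while t < end:
        starts.append(t)
        t += step
    return starts


def count_per_interval(event_times, start, end, interval):
    """Count events per interval; one array entry per interval start."""
    step = INTERVALS[interval]
    counts = [0] * len(interval_starts(start, end, interval))
    for ts in event_times:
        if start <= ts < end:
            # Floor division of two timedeltas yields the bucket index.
            counts[(ts - start) // step] += 1
    return counts
```

With this shape, a request for hourly values over a three-hour range returns a three-element array, and the "interval count with a start date" form can be recovered from it as the array length.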
@whereissean would you mind making your slide deck publicly viewable?
Some notes from our WG call yesterday:
See also new issue #569 from folks at Spin to cross reference use cases.
@whereissean would you mind making your slide deck publicly viewable? @johnclary
Sorry, appears that permissions were changed on the document. Until I can resolve, here is a new public version: https://docs.google.com/presentation/d/1rVwGSYb4d8myGSN9VJrDl1AOGtmdbqvAbXL8-a5VA-o/edit?usp=sharing
thanks @whereissean. @schnuerle would you mind adding the Privacy label to this?
r.e. this bit from the doc:
For the fields that involve special_users, we propose an x number of subcategories like low_income, student or unbanked
😵 this is the first time I'm seeing mention of this in MDS. is this information that is held by providers?
update: will move this discussion to #569
It looks like this spec might support operational use cases in a way that would avoid the need for agencies and providers to exchange telemetry data. I.e., it might be a drop-in replacement for /status_changes or /trips.
For example, as an agency, I'd like to query for the number of vehicles in service in x geography during the last hour.
Are there limitations that would prohibit such a use case as the spec is currently proposed? The use case above requires fairly high temporal and spatial resolution, and minimal latency.
I've reviewed the MDC Glossary and the good news is that methodology looks consistent with the proposed MDS metrics. There were a couple of metrics that were not proposed and a number that are not in MDC Glossary. I've added the ones (maximum/minimum average) that were not in the MDS dockless metrics. I also renamed a number of the proposed metrics to try to align.
I also attached a proposed metrics methodology document that discusses how to compute the metrics and compatibility with MDC Glossary definitions. Thanks @joanathan for putting this document together.
@joshuaandrewjohnson1 @jfh01 @schnuerle Please have a look in #487.
We reviewed this issue as part of the second OMF Working Group Steering Committee release Checkpoint. Both WGSCs had some feedback and I'm documenting it here for discussion.
1) Is this statement true to what you are proposing? The entire proposed Metrics API is meant to be published by cities to providers, after cities have ingested MDS data from providers. So the city is doing the data processing.
If so, how much value is this to cities, and will they be able to justify the heavy lift implementing an API for this? Why not just pull CSV reports from a city database and share those with providers like they do now? Does an API provide enough benefit?
If not, can you clarify how a city can use it and how a provider can use it, both in the issue description and the PR details?
2) One use case mentioned in the original description is that a city could make this endpoint public. It does not seem that making this endpoint public is a good idea, and instead data derived from the API could be published by the city and made public.
3) Maybe just creating a defined methodology that cities (and providers) can use to calculate reports from MDS is enough, vs creating an endpoint?
These questions could be explored with a city survey to gauge interest if needed.
Ah, I completely missed that this was being proposed as a city endpoint. As such, it cannot serve as an alternative to consuming raw trip data, and in fact this proposal necessitates adding more attributes to trip records.
That answers my own question.
@johnclary @schnuerle @jfh01 I think there's a misunderstanding here.
The Metrics API is not just for Agencies; it could be implemented by Providers. And the consumers of an Agency implementation of metrics are not necessarily (only) Providers; in fact, the main use cases are for city-internal consumption by analytics and visualization tools.
Yes, this could be an alternative to consuming raw trip data, although in the absence of such data, it makes the metrics essentially impossible to verify.
That is how I saw the Metrics API as well. It is a standard that can be implemented by Agency or Provider, or even 3rd party Data aggregator. Either with input data from other MDS endpoints, or different sources (like Special Groups data that would only be available to the Provider)
Note that for 1.1.0 we have merged with #582 the new Geography API to the 'dev' branch. Please update this pull request with the latest code, resolve any conflicts, and make references to the Geography API where appropriate, e.g. with UUIDs.
We will be discussing Metrics at this week's Working Group meeting, so if available please come prepared to talk about your latest updates and ideas.
The content of the 2 Metrics pull requests #486 and #487 has been merged to the new [feature-metrics](https://github.com/openmobilityfoundation/mobility-data-specification/tree/feature-metrics/metrics) feature branch for everyone to review in context with MDS and the new Geography API, and make PRs against.
We will leave this issue open until that branch is ready to be merged to dev, so please continue to leave feedback/ideas here, or on the new feature branch PR #587.
Is your feature request related to a problem? Please describe.
There is currently no standard way to retrieve metrics calculated from MDS data (provider or agency) or to define a standard set of useful MDS-based data aggregations. We have heard from the OMF community that this leads to a number of problems:
Describe the solution you'd like
The proposed Metrics API is intended to help users of MDS (cities, mobility service providers, and third-party ecosystem services) to have a standard way to consistently describe available metrics, and create an extensible interface for querying core MDS metrics and future metrics still to be defined. It should be a framework to describe how different API users and hosts can:
The goal is to be able to define “Metric X” and then ensure that when “X” is calculated by the city, authorized parties, or transportation providers, the result will be identical. For example, while n different methods may exist to calculate the utilization of a vehicle or a fleet for a given time range, the Metrics API is intended to ensure that for given method k, the same result will be produced regardless of who conducts the calculation, and there is a standard interface for authorized users to receive this data without requiring access to underlying raw data.
The Metrics API is intended to be useful for future MDS use cases, best practices and requirements. Particularly notable is that it provides the foundation to implement data anonymization best practices, such as k-anonymity. It also represents an important component needed to enable new MDS policy types and compliance evaluation as well as operations management use cases that can only be achieved by linking MDS metrics and MDS policy.
This proposed specification is not intended to represent a complete data pipeline or analytics service. It is also not meant to define the complete set of MDS metrics, only a useful starting point.
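The proposal mentions anonymization practices such as k-anonymity without describing a mechanism. As a loose illustration only, the sketch below applies small-count suppression, a simpler related technique and not full k-anonymity: any aggregated count below a threshold `k` is withheld before the metric is returned. The function name `suppress_small_counts`, the use of `None` as the suppressed marker, and the example keys are assumptions for illustration.

```python
def suppress_small_counts(counts, k):
    """Withhold aggregated counts smaller than k (hypothetical helper).

    `counts` maps a grouping key (for example a geography id) to a count.
    Values below `k` are replaced with None so that small groups are not
    exposed; this is small-count suppression, not full k-anonymity.
    """
    return {key: (value if value >= k else None) for key, value in counts.items()}


# Example: trip counts per zone, with a threshold of 5.
trips_by_zone = {"zone_a": 12, "zone_b": 3}
published = suppress_small_counts(trips_by_zone, k=5)
```

Applying a rule like this at the metrics layer is one way a consumer could receive aggregates without access to the underlying raw records.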
Is this a breaking change
Impacted Spec
agency
provider
Describe alternatives you've considered
It is hoped that this work can be complementary to other projects working to define, develop, and implement metrics services or metrics processing pipelines for MDS data. Much of this proposal was inspired by excellent work done by OMF member cities and SharedStreets with their SharedStreets Mobility Metrics.
This proposal represents work done without full visibility into the efforts of the Mobility Data Collaborative (MDC). We hope to bring the metrics defined in the Metrics Definitions PR to alignment with those MDC describes, once they become public.
Additional context
This specification received initial input from a variety of OMF contributors, representing city transportation departments (LADOT), ecosystem services stakeholders (Blue Systems, Lacuna, Ellis & Associates), and mobility service providers (Bird). We hope it encourages discussion and creation in the OMF on this important subject. A reference implementation of this API is not included at this time, but hopefully will be developed and contributed following additional community feedback.
Specific thanks to @bhandzo and @HenriJ.
Proposal consists of the following PRs: Metrics API PR #486 and Metrics Definitions PR #487