ODOT-PTS / GTFS-ride

GTFS-ride is an open standard for storing and sharing fixed-route transit ridership data.
https://gtfsride.org
Apache License 2.0

Add avg and stdev fields to ridership.txt #11

Open sdrewc opened 6 years ago

sdrewc commented 6 years ago

Recommend including avg and stdev, because they are useful for describing user experience and understanding crowding, capacity, and reliability issues.

e-lo commented 6 years ago

This would be especially useful for when there is not 100% coverage of APC units on a route.

e-lo commented 6 years ago

Suggest that avg. load would also be valid here?

sylvan-sh commented 6 years ago

-1 pending post to GTFS-ride Changes per CHANGES.md.

@sdrewc Thanks for the contribution! I'm excited to see our first potential change go thru the process.

carletop commented 6 years ago

-1 for clarification. We are currently in the process of creating software tools that will be able to produce reports containing averages and standard deviations from the raw ridership data. Would keeping this functionality in tools outside of the ridership standard specification still provide the information you would like to have, or were you envisioning agencies as being able to report only averages instead of actual counts?

e-lo commented 6 years ago

> Would keeping this functionality in tools outside of the ridership standard specification still provide the information you would like to have, or were you envisioning agencies as being able to report only averages instead of actual counts?

Many agencies only have a portion of vehicles outfitted with APC devices, so the total number of riders is not an observable "known", but the sample mean is.
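To illustrate the point, here is a minimal sketch of the summary statistics that remain computable when only some runs carry APC units. The boarding counts and variable names are hypothetical and are not part of the GTFS-ride spec.

```python
import statistics

# Hypothetical boardings recorded on the APC-equipped runs of one scheduled trip.
# Runs without APC units simply contribute no observation.
sampled_boardings = [31, 27, 35, 29, 33]

n_samples = len(sampled_boardings)
avg_boardings = statistics.mean(sampled_boardings)
# statistics.stdev is the sample (n - 1) standard deviation.
stdev_boardings = statistics.stdev(sampled_boardings)

print(n_samples, avg_boardings, round(stdev_boardings, 2))
```

The total ridership across all runs is still unknown here; only the sample size, mean, and spread are observed.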

jkkeck commented 6 years ago

With larger samples, I prefer percentiles over standard deviation as a way to measure variability. Percentiles represent actual recorded loads. The 80th percentile load in particular is a useful metric for outreach, as it represents the single most crowded inbound (IB) or outbound (OB) trip a passenger may encounter during a five-day commute week. The 90th and 10th percentiles can reveal service, operations, or line management issues. When trips are aggregated for analysis purposes, it is also critically important to know how many samples have been collected for each trip. Because not every bus in our fleet has APC equipment, it sometimes takes a month after a new schedule has been implemented to capture three trip-samples for each scheduled weekday trip.
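A rough sketch of that kind of summary, assuming a nearest-rank percentile so that every reported value is a load that was actually recorded. The helper name, the load values, and the output keys are hypothetical.

```python
import math

def nearest_rank_percentile(values, pct):
    """Return the smallest observed value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("need at least one sample")
    ordered = sorted(values)
    # Nearest-rank method: always picks one of the recorded values.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical maximum loads for one scheduled trip across ten sampled weekdays.
loads = [22, 25, 19, 31, 28, 24, 35, 27, 21, 30]

summary = {
    "n_samples": len(loads),  # report the sample count alongside the percentiles
    "p10": nearest_rank_percentile(loads, 10),
    "p80": nearest_rank_percentile(loads, 80),
    "p90": nearest_rank_percentile(loads, 90),
}
print(summary)
```

Other percentile definitions interpolate between observations, so an agency would need to state which method it uses.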

e-lo commented 6 years ago

@jkkeck - a standard deviation for the sample would give you the ability to estimate any percentile for the population as long as you can make an assumption about the distribution. Every agency likely has a different standard that they use to evaluate so it would be difficult to select one.
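As a sketch of what that looks like in practice, assuming loads are roughly normally distributed (an assumption that may not hold for small or skewed samples), with hypothetical values for the reported mean and standard deviation:

```python
from statistics import NormalDist

# Hypothetical summary values as they might be reported for one trip.
avg_load = 31.0
stdev_load = 6.0

# Assumption: loads follow a normal distribution with these parameters.
load_dist = NormalDist(mu=avg_load, sigma=stdev_load)

# Estimated percentiles under that assumption.
p10 = load_dist.inv_cdf(0.10)
p80 = load_dist.inv_cdf(0.80)
p90 = load_dist.inv_cdf(0.90)

print(round(p10, 1), round(p80, 1), round(p90, 1))
```

Unlike recorded percentiles, these are model estimates, so they need not match any load that was actually observed.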

carletop commented 6 years ago

@e-lo @jkkeck Thank you both for the information and perspective. If you would like to make your support for the proposed change official, please submit a "+1" vote in a new post to this thread, following the official change process in CHANGES.md.

@sdrewc It sounds like the change you have suggested has some support and could possibly meet the three +1 vote threshold needed for adoption. However, the two outstanding -1 votes in the thread above would prevent the adoption of the change. I invite you to address the comments of the -1 votes above to possibly remove these holdups.

Also, if anyone would like a more in-depth discussion about this issue (or any other GTFS-ride change ideas), I recommend moving the conversation over to the GTFS-ride Changes Google Group. @sdrewc A new thread for this issue at that site would enable @hooversy to change his vote and support the change.

We in the project team appreciate your involvement and feedback supporting the standard.

e-lo commented 6 years ago

One issue we've had is that we can't compute avg. loads without knowing both the # of trips that were actually run and the ridership. board_alight.txt provides detail on how many runs were actually made, but there doesn't seem to be a good way to do this at the summary level.
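The arithmetic behind this is simple, but it shows why the trip count has to travel with the summary. A minimal sketch with a hypothetical function name and made-up numbers:

```python
def average_load(total_load, trips_run):
    """Average load per trip; undefined unless the number of trips actually run is known."""
    if trips_run <= 0:
        raise ValueError("trips_run must be a positive count of trips actually operated")
    return total_load / trips_run

# Hypothetical summary-level total: sum of loads over every run in the period.
total_load = 930

# The same total gives different averages depending on how many runs were made.
print(average_load(total_load, 30))  # all 30 scheduled runs operated
print(average_load(total_load, 25))  # only 25 runs operated
```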

carletop commented 6 years ago

@e-lo If I am understanding your comment right, you are saying that, in addition to adding avg. and std. dev. for boardings and alightings as @sdrewc has suggested in this still-active pull request, there would also be benefit from adding avg. and std. dev. of loads, with each of these items accompanied by an additional field for the sample size, all within ridership.txt. Am I understanding this correctly?

If so, that sounds like support for @sdrewc's suggested change, but also a new issue or change request. I would suggest either opening a new issue on this GTFS-ride repo, for further discussion and to prompt the project team to take the first stab at developing a solution, or submitting a new pull request with the suggested changes implemented. This is great feedback. Please keep it coming.

carletop commented 5 years ago

The name and location of the official Google group where pull requests are required to be announced (per CHANGES.md) has changed. The new group is GTFS-ride. @sdrewc, please post an announcement to this group if this change is still desired.