cal-itp / data-infra

Cal-ITP data infrastructure
https://docs.calitp.org/data-infra
GNU Affero General Public License v3.0

Ability to segment and analyze GTFS data by "Services" #976

Open evansiroky opened 2 years ago

evansiroky commented 2 years ago

Is your feature request related to a problem? Please describe.

In the transit database, we track "Services": public transport services that one or more organizations provide. A single GTFS dataset can describe multiple services. This is obvious for regional feeds, but it also happens when one transit agency runs multiple services (for example, SacRT rail and SacRT bus are considered two services that are described in one GTFS dataset). It would be great if our data pipeline could analyze these subsets, so that the GTFS data associated with a single service can be analyzed just as a whole GTFS dataset can be today.
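To make the idea concrete, here is a minimal sketch of splitting one dataset's `routes.txt` rows into services. It assumes a hypothetical selector keyed on `route_type` (in the GTFS Schedule reference, 0 is tram/light rail and 3 is bus). The route IDs, the selector mapping, and the function name are all made up for illustration and are not part of the pipeline.

```python
from collections import defaultdict

# Hypothetical selector: maps a GTFS route_type code to a service name.
# route_type 0 = tram/light rail, 3 = bus (per the GTFS Schedule reference).
SERVICE_BY_ROUTE_TYPE = {0: "SacRT rail", 3: "SacRT bus"}


def segment_routes_by_service(routes, selector):
    """Group route_ids from routes.txt rows by the service the selector assigns."""
    segments = defaultdict(list)
    for route in routes:
        # Routes whose route_type has no selector entry are kept visible
        # under "unassigned" rather than silently dropped.
        service = selector.get(int(route["route_type"]), "unassigned")
        segments[service].append(route["route_id"])
    return dict(segments)


# Made-up rows standing in for one dataset's routes.txt.
routes = [
    {"route_id": "R1", "route_type": "0"},
    {"route_id": "R2", "route_type": "0"},
    {"route_id": "B1", "route_type": "3"},
]

segments = segment_routes_by_service(routes, SERVICE_BY_ROUTE_TYPE)
```

A real selector would more likely need to work on route, agency, or network identifiers, since `route_type` alone cannot separate services in a regional feed.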

Describe the solution you'd like

Stuff that needs to happen to get this working:

Describe alternatives you've considered

Not doing it at all and just using GTFS datasets like we are today, but we probably should do this.

Additional context

This will help us identify any issues occurring with specific parts of transit agencies. Also, the assessment team would then have an easier time assessing service-based questions.

lauriemerrell commented 1 year ago

@evansiroky I believe you have indicated that you don't intend to maintain the route/agency/network selector fields in GTFS service data, which I believe means we would not be able to complete this ticket. However, support for matching datasets to services is now available in the warehouse via dim_provider_gtfs_data and gtfs_dataset_key on various GTFS data tables.
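As a rough illustration of that dataset-level matching, the sketch below filters rows from a GTFS data table down to the datasets associated with one service. It uses in-memory rows rather than the warehouse. The `service_name` column, the assumption that dim_provider_gtfs_data exposes a column literally named `gtfs_dataset_key`, and all sample values are hypothetical; check the actual table schemas before relying on any of them.

```python
def rows_for_service(provider_rows, data_rows, service_name):
    """Return the data rows whose gtfs_dataset_key belongs to the named service."""
    # Collect the dataset keys the (hypothetical) provider dimension links to the service.
    dataset_keys = {
        row["gtfs_dataset_key"]
        for row in provider_rows
        if row["service_name"] == service_name
    }
    # Keep only data rows that came from one of those datasets.
    return [row for row in data_rows if row["gtfs_dataset_key"] in dataset_keys]


# Made-up stand-ins for dim_provider_gtfs_data and a GTFS data table.
provider_rows = [
    {"service_name": "Service A", "gtfs_dataset_key": "ds1"},
    {"service_name": "Service B", "gtfs_dataset_key": "ds1"},
    {"service_name": "Service C", "gtfs_dataset_key": "ds2"},
]
data_rows = [
    {"gtfs_dataset_key": "ds1", "route_id": "R1"},
    {"gtfs_dataset_key": "ds2", "route_id": "R9"},
]

matched = rows_for_service(provider_rows, data_rows, "Service A")
```

Note that "Service A" and "Service B" share dataset `ds1` here, so both would get the same rows back. This matches whole datasets to services and does not split a dataset into per-service subsets.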