cal-itp / data-infra

Cal-ITP data infrastructure
https://docs.calitp.org/data-infra
GNU Affero General Public License v3.0

Come up with a way to mark URLs that are no longer functional #899

Closed evansiroky closed 1 month ago

evansiroky commented 2 years ago

What is the expected behavior? Thinking specifically...

Currently, I would say our data views reflect "latest known feed data", so a feed's last downloaded data and its URL are what come up when you query the latest data. We could support mechanisms for (1) marking an entire feed as deleted, e.g. this GTFS schedule no longer exists or is no longer relevant in any way, and (2) marking URLs that are no longer supported, e.g. this data still exists and this was the URL it came from, but there is no current URL.
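
A minimal sketch of how the two markers could be kept separate, for illustration only. The `FeedRecord` class, its `feed_deleted` and `url_active` fields, and both helper functions are hypothetical names, not part of agencies.yml or any existing schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeedRecord:
    """Hypothetical record for one feed; every field name is an assumption."""
    name: str
    last_known_url: str
    feed_deleted: bool = False  # case (1): the feed itself no longer exists / is irrelevant
    url_active: bool = True     # case (2): the data is still valid, but the URL no longer works


def current_url(record: FeedRecord) -> Optional[str]:
    """Return a URL worth downloading from, or None if there is nothing to fetch."""
    if record.feed_deleted or not record.url_active:
        return None
    return record.last_known_url


def show_in_latest_views(record: FeedRecord) -> bool:
    """Latest known data stays visible unless the whole feed was marked deleted."""
    return not record.feed_deleted
```

Keeping the two flags independent means a dead URL stops downloads without hiding the last known feed data, while a deleted feed drops out of the "latest" views entirely.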

Originally posted by @machow in https://github.com/cal-itp/data-infra/issues/494#issuecomment-946144413

machow commented 2 years ago

See also these discussions about the structure of agencies.yml:

lauriemerrell commented 1 year ago

I believe this need is met by the deprecated_date and related fields in GTFS datasets (available in the warehouse). Recommend closing this ticket.
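
As a rough sketch of how a deprecation date could be used to separate current from retired datasets: only the `deprecated_date` name comes from the comment above. The record shape, the other field names, and the rule that a dataset counts as deprecated on and after that date are all assumptions, not the warehouse's actual schema or semantics.

```python
from datetime import date
from typing import List, Optional, TypedDict


class GtfsDataset(TypedDict):
    """Hypothetical record shape; only deprecated_date is named in the thread."""
    name: str
    url: str
    deprecated_date: Optional[date]


def is_active(dataset: GtfsDataset, as_of: date) -> bool:
    """Assumed rule: active if never deprecated, or deprecated after as_of."""
    deprecated = dataset["deprecated_date"]
    return deprecated is None or deprecated > as_of


def active_datasets(datasets: List[GtfsDataset], as_of: date) -> List[GtfsDataset]:
    """Keep only the datasets that are still active on the given date."""
    return [d for d in datasets if is_active(d, as_of)]
```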