MobilityData / gtfs-validator

Canonical GTFS Validator project for schedule (static) files.
https://gtfs-validator.mobilitydata.org/
Apache License 2.0

Implement station usage verification (GTFS rule) #153

Open maximearmstrong opened 4 years ago

maximearmstrong commented 4 years ago

Is your feature request related to a problem? Please describe.

A stop and a station should each be used at least once. This is a GTFS rule implemented in the Google Python validator and featured in Google Type Error as TYPE_STATION_UNUSED.

Describe the solution you'd like

Current Google GTFS validator behaviour: it verifies a stop from the database, checking whether the stop is used, whether it has an associated station, and whether the stop is too far from its parent station.

Describe alternatives you've considered

Additional context

Lines 87 and 109 in Error support priorities: https://docs.google.com/spreadsheets/d/1vqe6wq7ctqk1EhYkgtZ0_TbcQ91vccfs2daSjn20BLE/edit#gid=0

isabelle-dr commented 1 year ago

This is similar to the unused_shape and unused_trip validation rules. Although this is not explicitly mentioned in the spec or best practices, this could be added to this validator if the community sees value in having this check.

bradyhunsaker commented 1 year ago

This request was made in 2020. In 2021 a change was made to warn if any stops do not appear in stop_times: https://github.com/MobilityData/gtfs-validator/pull/960

That seems to be the most important part.

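There seem to be two other parts to this request:

  • Verifying that each station is used by at least one stop/platform.
  • Verifying that stops/platforms are not too far from their stations.

I don't see either of those in the set of rules. Are those two desired?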

derhuerst commented 1 year ago

  • Verifying that each station is used by at least one stop/platform.
  • Verifying that stops/platforms are not too far from their stations.

[…] Are those two desired?

Yes! 🙋

isabelle-dr commented 1 year ago

I recommend having INFO for this notice since having stations used isn't an official requirement or recommendation from the spec or Best Practices.

Opened issue #1348 to follow up about the severity of three other issues that might also need to be downgraded (or to have the spec amended).

bradyhunsaker commented 1 year ago

I made PR #1355 to check that a station is used by some stop.

The last part of this issue is checking distance between a stop and its parent_station. I don't see any guidelines in the best practices document.

There are some large train stations that have platforms pretty far from one another.

Here are some possible approaches:

Anyone have opinions?

isabelle-dr commented 1 year ago

Hello @bradyhunsaker, thank you for working on this and sorry for the late answer on this issue.

The last part of this issue is checking the distance between a stop and its parent_station.

I believe this is part of issue #154, and there is a proposition for what thresholds to use. Your work on #1355 would close this issue. 🚀

emmambd commented 1 year ago

@isabelle-dr Did https://gtfs-validator.mobilitydata.org/rules.html#platform_without_parent_station close out this issue?

isabelle-dr commented 9 months ago

No, the two checks mentioned in this comment are still relevant. They are part of the Python validator but not this one, and they are both out of spec. I would recommend adding them as INFO.

Here is the logic for these new rules:

Verifying that each station is used by at least one stop/platform.

For each stop_id that has location_type set to 1 (a station):
    If the stop_id is not referenced in any of the stops.parent_station:
        Generate an unused_station notice.

This is along the lines of the unused_shape and unused_trip rules.
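
A minimal sketch of this first check in Python, assuming the rows of stops.txt are available as dictionaries keyed by column name. The function name `find_unused_stations` and the sample data are hypothetical and only illustrate the logic; this is not the validator's implementation.

```python
import csv
import io


def find_unused_stations(stops):
    """Return the stop_ids of stations that no other stop names as parent_station."""
    # Collect every non-empty parent_station value in the feed.
    referenced = {(row.get("parent_station") or "") for row in stops}
    referenced.discard("")
    # A station is a row with location_type == 1; it is unused if nothing references it.
    return [
        row["stop_id"]
        for row in stops
        if (row.get("location_type") or "") == "1" and row["stop_id"] not in referenced
    ]


# Hypothetical sample data: S1 has one platform, S2 has none.
SAMPLE_STOPS = """stop_id,location_type,parent_station
S1,1,
S2,1,
P1,0,S1
"""

stops = list(csv.DictReader(io.StringIO(SAMPLE_STOPS)))
unused = find_unused_stations(stops)
for stop_id in unused:
    print(f"unused_station: {stop_id}")
```

Each returned stop_id would correspond to one unused_station notice.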

Verifying that stops/platforms are not too far from their stations.

For each stop_id that has location_type set to 1 (a station):
    For each stop_id that has this particular station defined in parent_station:
        Calculate the distance between this stop_id and the station.
        If the distance is bigger than [threshold], generate a stop_far_from_station notice.
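
A possible Python sketch of this second check, under a few assumptions: distance is computed with the haversine formula on a spherical Earth, the threshold is passed in by the caller (no value is fixed in this thread, so the 500 m used below is purely illustrative), and the names `haversine_m` and `find_stops_far_from_station` are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def find_stops_far_from_station(stops, threshold_m):
    """Return (stop_id, station_id, distance_m) for stops farther than threshold_m from their station."""
    # Index stations (location_type == 1) by stop_id.
    stations = {s["stop_id"]: s for s in stops if (s.get("location_type") or "") == "1"}
    notices = []
    for stop in stops:
        station = stations.get(stop.get("parent_station") or "")
        if station is None:
            continue  # no parent, or the parent is not a station
        distance = haversine_m(
            float(stop["stop_lat"]), float(stop["stop_lon"]),
            float(station["stop_lat"]), float(station["stop_lon"]),
        )
        if distance > threshold_m:
            notices.append((stop["stop_id"], station["stop_id"], distance))
    return notices


# Hypothetical sample data: P1 is about 11 m from S1, P2 is about 1.1 km away.
sample = [
    {"stop_id": "S1", "location_type": "1", "parent_station": "", "stop_lat": "45.5", "stop_lon": "-73.5"},
    {"stop_id": "P1", "location_type": "0", "parent_station": "S1", "stop_lat": "45.5001", "stop_lon": "-73.5"},
    {"stop_id": "P2", "location_type": "0", "parent_station": "S1", "stop_lat": "45.51", "stop_lon": "-73.5"},
]
far = find_stops_far_from_station(sample, threshold_m=500.0)  # 500 m is an assumed value
```

Each tuple in the result would map to one stop_far_from_station notice. Taking the threshold as a parameter keeps the open question of which value to use separate from the distance logic.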

isabelle-dr commented 9 months ago

I just noticed we now have unused_parent_station, so this first one is solved.