Bondify / gtfs_functions

Package with useful functions to create geo-spatial visualizations from a GTFS.
MIT License

Cut_gtfs issue - extra straight line segments created on all routes #13

Closed csdiehl closed 9 months ago

csdiehl commented 2 years ago

I'm having an issue with the cut_gtfs function when using the GTFS feed for Metrolinx GO Bus. cut_gtfs appears to produce random straight line segments connecting stops on different routes. See the map below.

[Image: MicrosoftTeams-image (1)]

I validated the feed before running, and it did not contain any major issues that were not present in other feeds that gtfs_functions processed successfully. I tried filtering out rail routes, and routes that contained duplicate stop geometries, but this did not change the behaviour of cut_gtfs. The agency's shape_pt_sequence appears to be in the correct order. The issue seems to appear on all routes in the feed.
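One way to double-check the shape_pt_sequence ordering mentioned above is sketched below. It assumes the raw shapes.txt has been loaded into a pandas DataFrame with the standard GTFS columns shape_id and shape_pt_sequence; the helper name shapes_in_sequence_order is hypothetical and is not part of gtfs_functions.

```python
import pandas as pd


def shapes_in_sequence_order(shapes: pd.DataFrame) -> pd.Series:
    """Per shape_id, report whether the rows appear in strictly
    increasing shape_pt_sequence order (hypothetical helper)."""
    return shapes.groupby("shape_id")["shape_pt_sequence"].apply(
        # Strictly increasing = monotonic and free of duplicate values
        lambda seq: seq.is_monotonic_increasing and seq.is_unique
    )


# Toy data only: shape "a" is in order, shape "b" is not
toy_shapes = pd.DataFrame(
    {
        "shape_id": ["a", "a", "a", "b", "b", "b"],
        "shape_pt_sequence": [1, 2, 3, 1, 3, 2],
    }
)
print(shapes_in_sequence_order(toy_shapes))
```

Shapes flagged False here would only show that the file's row order differs from the sequence order, which a loader may or may not correct by sorting.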

I attempted to step through the source code one line at a time, with a single route, to debug the issue. It appears that with Route 18, cut_gtfs classifies every shape as a loop because almost all routes cross over themselves at some point. Then, on branching or looping areas, the traversal lines to cut the shapes are being drawn parallel to the main route (as in the photo below), so the segment is not cut. I tried extending the offset for the cut lines but this did not produce significantly different results.
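To illustrate the difference between a shape that crosses over itself and one that is a true closed loop, here is a minimal sketch using Shapely. It is not how cut_gtfs classifies loops (the source does not show that logic); the helper name describe_shape and the toy coordinates are assumptions for illustration only.

```python
from shapely.geometry import LineString


def describe_shape(line: LineString) -> dict:
    """Separate 'closed loop' from 'crosses itself' (hypothetical helper)."""
    return {
        # First and last coordinates coincide
        "closed": line.is_closed,
        # is_simple is False when the line crosses or overlaps itself
        "self_crossing": not line.is_simple,
    }


# Toy geometries, not taken from the GO Transit feed
crossing = LineString([(0, 0), (2, 2), (2, 0), (0, 2)])       # crosses at (1, 1)
loop = LineString([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)])   # closed square
straight = LineString([(0, 0), (1, 0), (2, 0)])

for name, geom in [("crossing", crossing), ("loop", loop), ("straight", straight)]:
    print(name, describe_shape(geom))
```

A shape can be self-crossing without being closed, so a check that relies on self-intersection alone would treat both cases the same way.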

[Image: mx issue 2]

On Route 18, there are 5 different route variants, each with their own shape, and extra lines are drawn between the different variants. However, it is common for agencies to have different route patterns with different shapes grouped under the same route_id, and this is compliant with the GTFS static spec, so I'm not sure if that's the problem.
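The number of shape variants per route can be confirmed with a short pandas check. This sketch assumes a trips table with the standard GTFS columns route_id and shape_id; the helper name shapes_per_route and the toy rows are hypothetical.

```python
import pandas as pd


def shapes_per_route(trips: pd.DataFrame) -> pd.Series:
    """Count distinct shape_id values per route_id, largest first
    (hypothetical helper)."""
    return (
        trips.groupby("route_id")["shape_id"]
        .nunique()
        .sort_values(ascending=False)
    )


# Toy data only: route "18" has two shape variants, route "21" has one
toy_trips = pd.DataFrame(
    {
        "route_id": ["18", "18", "18", "21"],
        "shape_id": ["s1", "s2", "s2", "s3"],
    }
)
print(shapes_per_route(toy_trips))
```

Routes with a count above 1 are the ones where segments from different variants could end up mixed if shapes are grouped by route rather than by shape_id.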

Here is the code to reproduce the issue with Route 18 as an example using the feed in the link above.

# The gtfs alias is assumed to refer to the gtfs_functions package
import gtfs_functions as gtfs

# Load the feed, keeping only service on the busiest date
routes, stops, stop_times, trips, shapes = gtfs.import_gtfs("./GOtransit.zip", busiest_date=True)

# Restrict every table to Route 18
routeSet = routes[routes.route_short_name.isin(['18'])]
tripSet = trips[trips.route_id.isin(routeSet.route_id.unique())]
stop_timesSet = stop_times[stop_times.trip_id.isin(tripSet.trip_id.unique())]
stopsSet = stops[stops.stop_id.isin(stop_timesSet.stop_id.unique())]
shapeSet = shapes[shapes.shape_id.isin(tripSet.shape_id.unique())]

# Cut the shapes into stop-to-stop segments and plot the result
segments_gdf = gtfs.cut_gtfs(stop_timesSet, stopsSet, shapeSet)
segments_gdf.plot(figsize=(20, 20))

Any help would be much appreciated. Thank you!

Bondify commented 1 year ago

@csdiehl, please check whether the problem persists with the latest version of the package.