wsp-sag / network_wrangler

A Python Library for Managing Travel Model Network Scenarios
https://wsp-sag.github.io/network_wrangler/
Apache License 2.0

Missing stop references #72

Closed josiekre closed 5 years ago

josiekre commented 5 years ago

I think the simplified GTFS format is missing a link to stop_id in stops.txt. At example/stpaul/... currently, I cannot see how to link stops from stops.txt to trips.txt since stop_times.txt is not present.

I can't find this in the GTFS docs, and I could swear there used to be language on it. When frequencies.txt is used, do records in stop_times.txt still exist to lay out the sequence of stop_ids (shapes.txt only has the roads traversed, not necessarily the stops)?

(cc: @e-lo @i-am-sijia)

Note that this is not blocking. It will be important when we start doing shape/stop related project cards.

e-lo commented 5 years ago

Per the GTFS best practices document, it looks like we need a stop_times.txt, @i-am-sijia
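For reference, a minimal sketch of how stop_times.txt supplies the missing link: each row pairs a trip_id with a stop_id and a stop_sequence, so the ordered stops of a trip can be recovered with a simple lookup. The sample rows and the helper names `read_rows` and `stops_for_trip` are made up for illustration and are not part of network_wrangler; only the column names follow the GTFS spec.

```python
import csv
import io

# Tiny in-memory stand-ins for GTFS files (illustrative data, not from the repo).
STOP_TIMES_TXT = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
t1,00:00:00,00:00:00,A,1
t1,00:09:00,00:09:00,C,3
t1,00:04:00,00:04:00,B,2
t2,00:00:00,00:00:00,C,1
"""

STOPS_TXT = """stop_id,stop_name
A,Alpha St
B,Bravo Ave
C,Charlie Blvd
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def stops_for_trip(trip_id, stop_times, stops):
    """Return the stop names of one trip in stop_sequence order (hypothetical helper)."""
    names = {row["stop_id"]: row["stop_name"] for row in stops}
    rows = [r for r in stop_times if r["trip_id"] == trip_id]
    # stop_sequence is read as text, so compare it numerically.
    rows.sort(key=lambda r: int(r["stop_sequence"]))
    return [names[r["stop_id"]] for r in rows]


stop_times = read_rows(STOP_TIMES_TXT)
stops = read_rows(STOPS_TXT)
print(stops_for_trip("t1", stop_times, stops))
```

Without stop_times.txt there is no column shared between trips.txt and stops.txt, which is why the join above has nothing to work with in the simplified feed.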

i-am-sijia commented 5 years ago

Got it. The current PRD excludes stop_times.txt. I will change it to write out stop_times.txt for representative trips only.

i-am-sijia commented 5 years ago

Please check out commit 0830e60

josiekre commented 5 years ago

@i-am-sijia According to the best practices doc that @e-lo referenced, the first arrival_time value for each trip_id should have a value of 00:00:00. Can you update it?
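A rough sketch of the requested change, assuming the stop_times rows are already loaded as dicts: shift every time in a trip by that trip's first arrival so the first stop reads 00:00:00. The function names (`to_seconds`, `to_hms`, `rebase_trip_times`) and the sample rows are hypothetical and are not taken from the commits in this thread.

```python
def to_seconds(hms):
    """Convert an HH:MM:SS string to seconds; GTFS hours may exceed 24."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def to_hms(seconds):
    """Convert seconds back to a zero-padded HH:MM:SS string."""
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"


def rebase_trip_times(stop_times):
    """Return copies of the rows with each trip shifted to start at 00:00:00."""
    # For each trip, remember the arrival time at its lowest stop_sequence.
    first_arrival = {}
    for row in stop_times:
        seq = int(row["stop_sequence"])
        best = first_arrival.get(row["trip_id"])
        if best is None or seq < best[0]:
            first_arrival[row["trip_id"]] = (seq, to_seconds(row["arrival_time"]))

    rebased = []
    for row in stop_times:
        offset = first_arrival[row["trip_id"]][1]
        new_row = dict(row)  # leave the input rows untouched
        for col in ("arrival_time", "departure_time"):
            new_row[col] = to_hms(to_seconds(row[col]) - offset)
        rebased.append(new_row)
    return rebased


# Illustrative rows only.
rows = [
    {"trip_id": "t1", "stop_sequence": "1", "arrival_time": "06:15:00", "departure_time": "06:15:30"},
    {"trip_id": "t1", "stop_sequence": "2", "arrival_time": "06:22:00", "departure_time": "06:22:00"},
    {"trip_id": "t2", "stop_sequence": "1", "arrival_time": "25:10:00", "departure_time": "25:10:00"},
]
rebased = rebase_trip_times(rows)
for row in rebased:
    print(row["trip_id"], row["arrival_time"], row["departure_time"])
```

Only the offsets between stops are kept, which fits a frequency-based trip whose actual start times come from frequencies.txt.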

In frequencies.txt, the new columns look good. The newly committed version is missing some rows compared to what's here in this repo. I will pull in the entire new set of GTFS files so that they are consistent, but first I want to make sure you were expecting them to be different.

Are you pulling the most frequent shape for a route in a given time bin (and therefore only producing one headway with one trip_id per time_bin/route/direction combination)? I ask because I only see one frequency record per trip_id, which is not normal in full-on GTFS.

@e-lo Is the above design piece part of the PRD? It affects the search methods.

e-lo commented 5 years ago

In this:

@e-lo Is the above design piece part of the PRD? It affects the search methods.

Are you referencing this?

only producing one headway with one trip_id per time_bin/route/direction combination

If so, not necessarily part of any PRD that I am aware of. But not sure why you would have more than one headway per time_bin/route/direction combo? What else would it be diversified by? We should be expecting a simplification of a single rate for the whole time period.

josiekre commented 5 years ago

@e-lo Correct, that was what I was asking about. Two cases I can think of where this would happen:

i-am-sijia commented 5 years ago

@josiekre

The new committed version is missing some rows when compared to what's here in this repo.

I think this repo was using the file from commit c5a12b8, which was before I re-clipped St. Paul city to avoid LFS (commit 838bef7). The newly committed version has the same number of rows as commit 838bef7, so I think we are good.

I will pull in the entire new set of GTFS files so that they are consistent, but first I want to make sure you were expecting them to be different.

Yes, please pull in the entire new set from here for st paul city, and here for entire MetC.

Are you pulling the most frequent shape for a route in a given time bin (and therefore only producing one headway with one trip_id per time_bin/route/direction combination)?

Yes.
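To make the confirmed behavior concrete, here is a minimal sketch of picking one representative trip per route/direction/time bin by taking the most frequent shape. This is an assumption about the approach, not the actual code behind the commits; the `time_bin` field, the function name `representative_trips`, and the sample trips are all hypothetical.

```python
from collections import Counter, defaultdict


def representative_trips(trips):
    """Map (route_id, direction_id, time_bin) to one representative trip_id."""
    groups = defaultdict(list)
    for trip in trips:
        key = (trip["route_id"], trip["direction_id"], trip["time_bin"])
        groups[key].append(trip)

    chosen = {}
    for key, members in groups.items():
        # Count how many trips in the group follow each shape.
        shape_counts = Counter(t["shape_id"] for t in members)
        top_shape, _ = shape_counts.most_common(1)[0]
        # Keep the first trip (in input order) that uses the most frequent shape.
        chosen[key] = next(t["trip_id"] for t in members if t["shape_id"] == top_shape)
    return chosen


# Illustrative trips; time_bin is an assumed, precomputed label.
trips = [
    {"trip_id": "t1", "route_id": "r1", "direction_id": "0", "time_bin": "AM", "shape_id": "s1"},
    {"trip_id": "t2", "route_id": "r1", "direction_id": "0", "time_bin": "AM", "shape_id": "s2"},
    {"trip_id": "t3", "route_id": "r1", "direction_id": "0", "time_bin": "AM", "shape_id": "s2"},
    {"trip_id": "t4", "route_id": "r1", "direction_id": "0", "time_bin": "PM", "shape_id": "s1"},
    {"trip_id": "t5", "route_id": "r1", "direction_id": "1", "time_bin": "AM", "shape_id": "s3"},
]
chosen = representative_trips(trips)
for key, trip_id in chosen.items():
    print(key, trip_id)
```

With one trip_id per group, each trip_id ends up with a single frequencies.txt record, which matches the observation earlier in the thread.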

i-am-sijia commented 5 years ago

@josiekre,

the first arrival_time value for each trip_id should have a value of 00:00:00. Can you update it?

Please check out commit 97d1f2c for the updates.

josiekre commented 5 years ago

@i-am-sijia Looks great!

This issue should close automatically when merged into master.