umts / bojangles

Ruby script for monitoring the PVTA realtime bus departures feed
MIT License

"Service Exception" import assumes that all service types are in `calendar.txt` #67

Closed: werebus closed this issue 5 years ago

werebus commented 5 years ago

I'm pretty sure this isn't a problem with the particular GTFS export we're parsing. But technically speaking, a service type doesn't have to be present in calendar.txt in order for it to be present in calendar_dates.txt. This line:

https://github.com/umts/bojangles/blob/755ad521cd7f95656042c82580635455e94ec1b6/lib/models/service_exception.rb#L11

assumes that. In fact, a feed that specifies every day of service in calendar_dates.txt and omits calendar.txt entirely is totally valid.

Again, I don't think this is causing any errors for the PVTA GTFS feed. Just wanted to clarify.
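To make the concern concrete, here is a minimal sketch of a check for service IDs that appear in calendar_dates.txt but have no row in calendar.txt. It does not reproduce the linked line in service_exception.rb; the helper name `orphan_service_ids` and the sample IDs are hypothetical, and the rows are assumed to be already parsed into hashes.

```ruby
require 'set'

# Hypothetical helper, not part of bojangles: given the service_id values
# from calendar.txt and the parsed rows of calendar_dates.txt, return the
# IDs that appear only in calendar_dates.txt.
def orphan_service_ids(calendar_ids, calendar_date_rows)
  known = calendar_ids.to_set
  calendar_date_rows.map { |row| row[:service_id] }
                    .uniq
                    .reject { |id| known.include?(id) }
end

# Made-up sample data: HOLIDAY has no row in calendar.txt.
calendar_ids = %w[WEEKDAY SATURDAY]
calendar_date_rows = [
  { service_id: 'WEEKDAY', date: '20240704', exception_type: '2' },
  { service_id: 'HOLIDAY', date: '20240704', exception_type: '1' }
]

orphans = orphan_service_ids(calendar_ids, calendar_date_rows)
puts orphans.inspect # prints ["HOLIDAY"]
```

An import that looks up a calendar.txt record for every exception row would find nothing for `HOLIDAY` in this sample, which is the case the issue describes.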

werebus commented 5 years ago

Well...actually

dfaulken commented 5 years ago

Indeed - this project was written by examining the PVTA feed and deducing from it how GTFS 'works'. This approach, as you point out, comes with the risk that if PVTA changes how they make use of the broader GTFS specification (excuse the redundant acronym), the codebase will no longer work.

My recollection is that at one point we discussed the notion of refactoring bojangles to work with any generic GTFS feed. The consensus we came to at the time, though, was that in our experience, almost nobody implements GTFS truly consistently - every authority has some special sauce that they have to sprinkle on the data that comes from whatever software produces their GTFS feed (likely not quite to specification) in order to import it into whatever software consumes their GTFS feed (itself likely not quite to specification). So if in practice a 'generic' GTFS feed is essentially a unicorn, we may as well keep this tool tailored specifically to how the PVTA implements GTFS.

That being said, certainly if we think that this is an instance where PVTA is particularly likely to change how the feed is constructed, then it would be prudent to generalize the code to account for this particular case.

werebus commented 5 years ago

I'm not sure I agree with the characterization. There is a well-documented standard, and as far as I know, the PVTA feed adheres to it — Google consumes it without difficulty, for example. There is also quite a good collection of software that consumes GTFS feeds without difficulty.

However, I just re-read said specification and I was wrong :sheep::

  • Recommended: Use calendar_dates.txt in conjunction with calendar.txt to define exceptions to the default service patterns defined in calendar.txt. If service is generally regular, with a few changes on explicit dates (for instance, to accommodate special event services, or a school schedule), this is a good approach. In this case calendar_dates.service_id is an ID referencing calendar.service_id.
  • Alternate: Omit calendar.txt, and specify each date of service in calendar_dates.txt. This allows for considerable service variation and accommodates service without normal weekly schedules. In this case service_id is an ID.

(emphasis added)

So, if there's a calendar.txt file at all, the IDs in calendar_dates.txt are supposed to reference only IDs that exist in calendar.txt.
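As a rough illustration of the two approaches quoted from the specification, the sketch below resolves which service IDs run on a given date, and works whether calendar.txt has rows or is omitted. It is not bojangles code: `active_service_ids`, the sample IDs, and the hash-based rows are assumptions made for the example. The column names and the `exception_type` values (1 adds service, 2 removes it) follow the GTFS reference.

```ruby
require 'date'

# Day-of-week columns from calendar.txt, ordered to match Date#wday (0 = Sunday).
DAY_COLUMNS = %i[sunday monday tuesday wednesday thursday friday saturday].freeze

# Hypothetical helper: service IDs active on `date`, given parsed rows from
# calendar.txt (possibly empty) and calendar_dates.txt.
def active_service_ids(date, calendar_rows, calendar_date_rows)
  key = date.strftime('%Y%m%d') # GTFS dates are YYYYMMDD strings
  day = DAY_COLUMNS[date.wday]

  # Regular weekly pattern from calendar.txt.
  regular = calendar_rows.select do |row|
    row[day] == '1' && key.between?(row[:start_date], row[:end_date])
  end.map { |row| row[:service_id] }

  # Exceptions for this date from calendar_dates.txt.
  todays  = calendar_date_rows.select { |row| row[:date] == key }
  added   = todays.select { |row| row[:exception_type] == '1' }.map { |row| row[:service_id] }
  removed = todays.select { |row| row[:exception_type] == '2' }.map { |row| row[:service_id] }

  (regular | added) - removed
end

# Made-up sample data: weekday service all year, replaced by HOLIDAY on July 4.
weekdays = DAY_COLUMNS.to_h { |d| [d, %i[saturday sunday].include?(d) ? '0' : '1'] }
calendar = [weekdays.merge(service_id: 'WEEKDAY', start_date: '20240101', end_date: '20241231')]
calendar_dates = [
  { service_id: 'WEEKDAY', date: '20240704', exception_type: '2' },
  { service_id: 'HOLIDAY', date: '20240704', exception_type: '1' }
]

p active_service_ids(Date.new(2024, 7, 5), calendar, calendar_dates) # ["WEEKDAY"]
p active_service_ids(Date.new(2024, 7, 4), calendar, calendar_dates) # ["HOLIDAY"]
p active_service_ids(Date.new(2024, 7, 4), [], calendar_dates)       # ["HOLIDAY"]
```

The last call passes an empty calendar, which corresponds to the "Alternate" approach: every date of service comes from calendar_dates.txt alone.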