Closed mcplanner-zz closed 3 years ago
@mcplanner This GTFS dataset is a first for me - it's the first time I've seen a mix between the two different calendar models in the same GTFS dataset for the same agency.
I've always viewed the GTFS spec as supporting two calendar models - a normalized Model A that the best practices advocate for (everything in calendar.txt
with exceptions/holidays called out in calendar_dates.txt
, but all trips in trips.txt
referencing calendar.txt
), and a denormalized Model B (where all date instances for service are defined in calendar_dates.txt
and therefore all trips in trips.txt
reference calendar_dates.txt
).
It looks like this dataset has a hybrid of both approaches, with service_id 73343
following the Model A normalized approach (with a few exceptions to 73343
in calendar_dates.txt), and service_id 73344
following the denormalized approach of adding service outside of any pattern defined in calendar.txt
:
service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
73343,1,1,1,1,1,0,0,20200318,20210101
service_id,date,exception_type
73343,20201225,2
73343,20201127,2
73343,20201126,2
73343,20200907,2
73344,20201225,1
73344,20201127,1
73344,20201126,1
73344,20200907,1
Talking to a few others, though, it seems like this practice is generally accepted as valid GTFS?
The spec isn't clear about this - it would be nice if it was clarified to say this was allowed. It seems odd to me, in that it breaks traditional data modeling practices. In the hybrid scenario, for the table calendar_dates.txt
for some records we treat service_id
as the primary key, and for others we treat service_id
as a foreign key (to calendar.txt
). Typically we'd define the role of a field for primary/foreign keys at the table level, not at the record level.
Trillium built the feed. I wonder if they can comment on why it was constructed that way, and how common this construction is?
That is Trillium's general practice. These exceptions occur on weekdays where regular weekday service is replaced with weekend/holiday service:
Agency website: https://www.sonomamarintrain.org/schedules-fares
This data is accepted as valid by the 3rd party apps that receive Trillium data. I think this is a good way of representing exceptions, because it corresponds to how the agency is representing exceptions to customers (service is replaced on a holiday).
Here is another example of this practice: https://transitfeeds.com/p/arcata-mad-river-transit-system/148/latest/file/calendar_dates.txt
An aside: The publisher of the SMART (Sonoma Marin Area Rail Transit) feed is MTC/511.org but we assume the original data source is indeed Trillium. We last pushed data to 511.org on 08/24/2020.
Thanks @antrim for that explanation. Other datasets that I've seen have an entry in calendar.txt
for the weekend service, and then have an entry in calendar_dates.txt
to add this service on the holiday. But the above makes sense to me if you have one-off service on holidays that don't fit a pattern (i.e., isn't the same as normal weekday service).
This was fixed by https://github.com/MobilityData/gtfs-validator/pull/423
Describe the bug The validator was unable to process a feed and gave an error.
To Reproduce Run the validator on the following dataset: http://transitfeeds.com/p/sonoma-marin-area-rail-transit/1050/latest
Expected behavior No notice should be generated in this case.
Witnessed behavior
Environment used
Additional context The calendar.txt file doesn't use the service_name field, but all other required fields appear to be present.