public-transport / gtfs-via-postgres

Process GTFS Static/Schedule by importing it into a PostgreSQL database.
https://github.com/derhuerst/gtfs-via-postgres#gtfs-via-postgres
Other
88 stars 17 forks source link

Dependency on (optional) calendar_dates file #18

Closed brianlove closed 2 years ago

brianlove commented 2 years ago

According to the GTFS spec, calendar_dates.txt is only required if calendar.txt is omitted. However, it appears that gtfs-via-postgres is assuming that this file will always be included.

When I tried to ingest the 6 October Amtrak data feed, the output stated ERROR: relation "service_days" does not exist:

...
CREATE TABLE
transfers
COPY 39729
CREATE INDEX
CREATE INDEX
UPDATE 39729
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX
ERROR:  relation "service_days" does not exist
LINE 52:  JOIN service_days ON trips.service_id = service_days.servic...
               ^
...

From the documentation, service_days is created by combining calendar and calendar_dates. It seems that right now, if calendar_dates.txt is not available the calendar_dates table is not being created, leading to service_days then not existing.

I was able to work around this issue by creating a stub calendar_dates.txt file (containing simply the header "service_id","date","exception_type"), at which point the ingest process proceeded without the error.

This all suggests that the calendar_dates table should still be created even if calendar_dates.txt is not available.

derhuerst commented 2 years ago

Thanks for reporting!

I think I'll have time tomorrow to fix this.

derhuerst commented 2 years ago

I have published a fix as gtfs-via-postgres@3.0.2. 🎉