Closed langbein-daniel closed 1 year ago
My
agency.txt
has just one agency whereagency_timezone
is specified as "Europe/Berlin".
Are there other agencies in agency
(which would require you to remove the NOT NULL
contraint first)? Or is there only one agency?
Can you provide the feed, so that I can try to reproduce this?
There is only one agency. The feed can be downloaded here: https://www.vgn.de/opendata/GTFS.zip
This is a bug. Both arrivals_departures
and connections
assume that routes.agency_id
is not NULL
, because it expects to be able to JOIN
agency
on it.
I'm not sure yet what a good fix will look like; I will investigate the performance trade-offs involved with JOIN
ing the first row if agency_id
is NULL
.
For now, a good workaround is to patch routes.agency_id
before importing, e.g. using e.g. qsv/xsv.
@langbein-daniel Do you want to mentioned as a contributor? If you do, how? (This metadata will probably be included in various public archives for a long time.)
I am glad that I could help you find a bug! Many thanks for having a look and fixing it so quickly.
A mention as a contributor would be nice, but please don't make too much effort about it ;)
{
"name" : "Daniel Langbein",
"email" : "daniel@systemli.org",
}
Even if you have added a test case to detect an invalid agency.txt file (with two agencies but no agency_id), a valid agency.txt with just one entry but no agency_id does currently fail. One sees the new error message The GTFS spec allows routes.agency_id to be empty/null only if there is exactly one agency in the feed.
.
I added console.log('NumAgencies: ' + workingState.nrOfRowsByName.get('agency'))
just before the new check https://github.com/public-transport/gtfs-via-postgres/blob/02d307b66cd6fc1768ffe5f8c8a9176f96a3c0c1/lib/routes.js#L299 and it prints:
NumAgencies: undefined
Here is a screenshot of the IntelliJ debugger:
So I think the problem is that agency.txt has not yet been read. This is probably easy to correct, but I don't know this project well enough to suggest a fix.
Note: I tested this with the current git version of this project, again with the same GTFS feed of the VGN.
[…] a valid agency.txt with just one entry but no agency_id does currently fail. One sees the new error message
The GTFS spec allows routes.agency_id to be empty/null only if there is exactly one agency in the feed.
.
On main
(21aad03), I cannot reproduce this problem, except when running with --routes-without-agency-id
.
The latter should of course work too, lib/routes.js
is missing a !opt.routesWithoutAgencyId
check. Will push another fix for this.
I added
console.log('NumAgencies: ' + workingState.nrOfRowsByName.get('agency'))
just before the new check […] and it printsNumAgencies: undefined
. So I think the problem is that agency.txt has not yet been read. […]
This happens because gtfs-via-postgres
determines the order of files to be processed based on each file's dependencies on others (e.g. stop_times
on trips
). But if --routes-without-agency-id
is passed (and opt.routesWithoutAgencyId
is therefore true
), lib/deps.js
does not specify this dependency, causing agency
to be processed after routes
, causing the workingState.nrOfRowsByName
Map
not to have an agency
entry.
I have added the --routes-without-agency-id
flag in e97a3fde1052dc452562afecaea1ebd3bb455eae, but I don't remember why, given that both files agency
and routes
are required by the GTFS spec.
Maybe as a cheap (and half-broken) workaround around the exact use case you've brought up in this Issue? Does this mean that we can remove --routes-without-agency-id
, because there is now a proper logic for empty routes.agency_id
in place?
Sorry for confronting you with yet another issue/pull request message this weekend ^^
I noticed that in my
arrivals_departures
view, there are not_departure
values. The following statement gives 0 results:As written in the README, this field is calculated based on the
date
anddeparture_time
fields. In the code I saw, that theagency_timezone
is used as well (if no stations or stop timezone is given).My
agency.txt
has just one agency whereagency_timezone
is specified as "Europe/Berlin".And I do have
date
anddeparture_time
values in thearrivals_departures
view:Any ideas what could cause this? I am happy to provide more information. I used the latest Docker image if your program together with the
postgis/postgis
image.