Closed adrianleh closed 2 years ago
The problem is that each row in a GTFS shapes.txt
file does not have a unique ID; Rather, a shape (including its unique ID) is formed by combining all rows with an equal shape_id
.
Therefore, I cannot make shape_id
unique as long as I import shapes.txt
as-is row-by-row. So I'm not sure how to fix this.
When using the
--trips-without-shape-id
it resolves the error, however does not generate the shape table at all [...].
I cannot reproduce this.
wget -r --no-parent --no-directories --compression auto -R .csv.gz -P vbb-2022-03-25 -N 'https://vbb-gtfs.jannisr.de/2022-03-25/'
gtfs-to-sql -d --trips-without-shape-id vbb-2022-03-25/*.csv | sponge | psql -b
psql -c 'SELECT * FROM (SELECT DISTINCT ON (shape_id) shape_id, COUNT(shape_pt_sequence) OVER (PARTITION BY shape_id) AS nr_of_points FROM shapes) t ORDER BY nr_of_points DESC LIMIT 10'
# shape_id | nr_of_points
# ----------+--------------
# 2172 | 3547
# 2155 | 3539
# 2147 | 3451
# 2162 | 3424
# 2157 | 3423
# 2161 | 3423
# 2169 | 3327
# 11748 | 3283
# 11798 | 3283
# 11516 | 3265
# (10 rows)
I am using the following dataset: https://www.transitchicago.com/downloads/sch_data/google_transit.zip
Note that you might need to manually fix agency.txt to include an agency_id
When using the
--trips-without-shape-id
it resolves the error, however does not generate the shape table at all [...].I cannot reproduce this.
I am using the following dataset: https://www.transitchicago.com/downloads/sch_data/google_transit.zip
Still can't reproduce.
Are you directly piping gtfs-to-sql
into psql
? Because there's a weird (but unrelated) bug, with sponge
in between, it works.
The exact command I am using is export PGHOST=localhost; export PGPORT=5432; export PGUSER={USER}; export PGPASSWORD={PASSWORD}; export PGDATABASE={DB}; npm exec -- gtfs-to-sql --require-dependencies --routes-without-agency-id -s -- gtfs/*.txt | psql -b
(with the dataset above)
Can you try if the error is gone with sponge
in between?
Same issue persists
npm exec -- gtfs-to-sql --require-dependencies --routes-without-agency-id -s -- gtfs/*.txt | sponge GTFS.sql
(set up psql variables)
psql -b -f GTFS.sql
Would it help if I uploaded the resulting file? I'll warn you though it's ~750MB
For reference using latest docker image postgis/postgis
for testing which is postgres 14.2 under the hood
Would it help if I uploaded the resulting file? I'll warn you though it's ~750MB
Maybe. Can you upload it gzipped, or at least zipped?
GTFS.sql.7z.002.gz GTFS.sql.7z.001.gz GTFS.sql.7z.006.gz GTFS.sql.7z.005.gz GTFS.sql.7z.004.gz GTFS.sql.7z.003.gz
Here you go. Since you can't upload files larger than 10MB and are then restricted to file types, I 7zipped into multiple files and then gzipped each of them (7z extensions are not allowed).
So to use it you unzip each file and then extract the part files with 7z
You can use the commands below
7z x GTFS.sql.7z.001.gz
7z x GTFS.sql.7z.002.gz
7z x GTFS.sql.7z.003.gz
7z x GTFS.sql.7z.004.gz
7z x GTFS.sql.7z.005.gz
7z x GTFS.sql.7z.006.gz
7z x GTFS.sql.7z.001
Any updates on this?
a1304a9 changes the constraint in trips
to check if at least one shapes
row with that shape_id
exists.
Fix published as gtfs-via-postgres@4.2.0
! 🎉
When importing a dataset where trips have shapes that are referenced the following generated line the in SQL causes an error
since
shape_id
in the table shapes is defined as regular non-unique TEXTwhich results in the following error upon generation
When using the
--trips-without-shape-id
it resolves the error, however does not generate the shape table at all which is not desirable.