Closed XavierPrudent closed 7 years ago
Whoops! Sorry, didn't mean to close this - I'm re-opening.
@XavierPrudent Right now I don't believe this is possible given the existing tool, but I think we'd certainly be interested in getting this added as a feature - pull requests welcome. Is this something your team would be interested in implementing? I'm not sure yet if this is something we'll be directly working on or not, but I'll talk with @jadorno and see. EDIT - this is possible now - see https://github.com/CUTR-at-USF/gtfsrdb/issues/3#issuecomment-289788790, it's now documented in README.
Hello Sean, That would be a good starting point for a project indeed. I will talk to my colleagues this afternoon. I guess one just needs to set up a server and loop.
BTW, in the SELECT statement in the github page https://github.com/CUTR-at-USF/gtfsrdb
are you sure these :: should be points?
WHERE stops.stop_id::text = stop_time_updates.stop_id::text
Also, looking at a SQLite DB I created using gtfsrdb, there was no trips table. The input data used to create the tables were a tripUpdate and a vehiclePosition GTFS-rt feed, as produced by the HART-gtfs-rt generator code.
Regards, Xavier
I will talk to my colleagues this afternoon. I guess one just needs to set up a server and loop.
Yes, I think we'd want to mirror the behavior of gtfsdb. IIRC the current behavior is just to loop and execute an HTTP request every X seconds. To support pulling from archived feeds, I think you'd support an option to pass a file name via the command line instead of a URL. This file would contain a list of filenames (or just a directory locally or online where the PB files are stored), and the tool would then loop through them as fast as possible and process each one and insert into DB.
are you sure these :: should be points?
Hmmm, no, I'm not. It's been a while since I worked on this, and I know the wiki went through several format conversions. I'm guessing that's an error of the format conversion, and shouldn't be there. You're suggesting that we should remove ::text from the query everywhere it appears? Can you confirm the query works if you do this?
and looking at a sqlLite DB I created using gtfsrdb, there was no table trips.
I believe this assumes that you've loaded the GTFS static zip file data into the same database as the real-time data. IIRC there is a caveat here - when you load GTFS data, it wipes the database. So, you may need to load the GTFS data first, and then archive the GTFS-rt data. It would be good to confirm if this is (or was) an issue, and if it is, fix it.
Hi Sean,
I am meeting a professor from Polytech Montréal next week to discuss the possibility of involving students. Such an upgrade of gtfsrdb would be a good starting point. http://www.polymtl.ca/recherche/rc/en/professeurs/details.php?NoProf=190
The query worked fine when replacing the :: with a . I thought that was an unfortunate copy and paste from some Java code.
By GTFS data, you mean static GTFS, right? The options in gtfsrdb https://github.com/CUTR-at-USF/gtfsrdb include only the possibilities to include trip updates, vehicle positions and trip alerts.
You mean creating the DB with gtfsdb, then calling gtfsrdb without the "-c" argument, right?
BTW, the link to gtfsdb could be updated to https://github.com/OpenTransitTools/gtfsdb
regards, Xavier
the query worked fine when replacing the :: by a .
Thanks, just fixed this in https://github.com/CUTR-at-USF/gtfsrdb/commit/198938bd31a21e34e7d5cc1d4b53458715304316. Let me know if this doesn't work.
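For readers who hit the same error: ::text is PostgreSQL's cast shorthand, and SQLite does not accept it. The sketch below uses Python's built-in sqlite3 module to show the standard CAST(... AS TEXT) form of the same join condition. The table definitions and sample rows are simplified assumptions for illustration, not the actual gtfsdb or gtfsrdb schema.

```python
import sqlite3

# Hypothetical, heavily simplified tables; the real schemas have more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stops (stop_id TEXT, stop_name TEXT);
    CREATE TABLE stop_time_updates (stop_id TEXT, arrival_delay INTEGER);
    INSERT INTO stops VALUES ('100', 'Main St'), ('200', 'Oak Ave');
    INSERT INTO stop_time_updates VALUES ('100', 60), ('100', 120);
""")

# CAST(x AS TEXT) is standard SQL and is accepted by both SQLite and
# PostgreSQL, unlike the PostgreSQL-only shorthand x::text.
query = """
    SELECT stops.stop_name, stop_time_updates.arrival_delay
    FROM stops, stop_time_updates
    WHERE CAST(stops.stop_id AS TEXT) = CAST(stop_time_updates.stop_id AS TEXT)
    ORDER BY stop_time_updates.arrival_delay
"""
rows = conn.execute(query).fetchall()
print(rows)
```

When both columns already store text, the cast can be dropped and the condition reduces to stops.stop_id = stop_time_updates.stop_id.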
By GTFS data, you mean static GTFS, right? The options in gtfsrdb https://github.com/CUTR-at-USF/gtfsrdb include only the possibilities to include trip updates, vehicle positions and trip alerts.
Yes, static GTFS data would contain the data for the trips table.
You mean creating the DB with gtfsdb, then calling gtfsrdb without the "-c" argument, right?
Yes, you'd want to exclude -c so gtfsrdb doesn't wipe the database.
IIRC, though, there was another issue that would prevent you from running gtfsrdb continuously to archive GTFS-rt data while also loading new static GTFS data with gtfsdb each time it became available (e.g., four times a year), but I don't recall exactly what the problem was.
BTW, the link to gtfsdb could be updated to https://github.com/OpenTransitTools/gtfsdb
Thanks for the heads up, just fixed that in https://github.com/CUTR-at-USF/gtfsrdb/commit/ef9326355a3a098c315914bf8e7fc6236d3f6d8b.
Hey guys, just a heads up,
In its current state, this tool can already read files. Just run it as follows:
python gtfsrdb.py --once -p file://<path-to-file> -d <db-url>
This can easily be wrapped in a bash script that iterates over all the files and just changes the filename. I can add something about this to the README if you'd like.
Lastly, the -c parameter will not wipe your database. It simply creates tables if they're missing. Nothing more.
Thanks @jadorno! Yes, please go ahead and open PR with update to Readme on file usage. I'll leave this issue open until we update the Readme as a reminder.
Alright, loading data from files using Bash and MySQL is now documented in the README under Example 3 via https://github.com/CUTR-at-USF/gtfsrdb/commit/0980c1668391efbe34b6fa76fc3b1f7e59bed395.
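#!/bin/sh
# Feed each archived GTFS-rt file in the directory to gtfsrdb, one at a time.
for file in /path/to/files/*
do
    # Quote the URL so file names containing spaces are passed as one argument.
    python /path/to/gtfsrdb.py --once -p "file://$file" -d "mysql://<username>:<password>@<public_database_server_name>/<database_name>" -c
done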
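The same loop can also be sketched in Python, which may be convenient if the files should be processed in a fixed order. This is a minimal sketch rather than part of gtfsrdb: the function names build_commands and load_archive are hypothetical, it uses only the --once, -p, -d and -c options shown in this thread, and it assumes POSIX-style paths and that sorting by file name matches the order in which the feeds were archived.

```python
import subprocess
from pathlib import Path


def build_commands(directory, db_url, script="gtfsrdb.py", create_tables=False):
    """Return one gtfsrdb argument list per file in `directory`, sorted by name."""
    commands = []
    for path in sorted(Path(directory).resolve().iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        # Same invocation as in the shell script: one file per run.
        # "python" is assumed to be the interpreter gtfsrdb runs under.
        cmd = ["python", script, "--once", "-p", f"file://{path}", "-d", db_url]
        if create_tables:
            cmd.append("-c")  # per this thread: creates missing tables only
        commands.append(cmd)
    return commands


def load_archive(directory, db_url, **kwargs):
    """Run gtfsrdb once per archived file, stopping at the first failure."""
    for cmd in build_commands(directory, db_url, **kwargs):
        subprocess.run(cmd, check=True)
```

Calling load_archive("/path/to/files", "<db-url>", script="/path/to/gtfsrdb.py") would then process the whole directory. Building the argument lists separately from running them makes the file-to-command mapping easy to inspect, and passing a list to subprocess.run avoids shell quoting problems with unusual file names.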
Thanks @jadorno!
Dear authors, the gtfsdb package can be fed either a URL to the GTFS feed or a list of GTFS files. Is this feature also available in gtfsrdb? If I have a directory with a two-week history of GTFS-rt files, can they be inserted into the DB using gtfsrdb? Thanks in advance, regards, Xavier Prudent