OpenTransitTools / gtfsdb

GTFS ORM using SQLAlchemy
Mozilla Public License 2.0

db support -- sqlite: crashing on a feed with utf-8 route names w/python 2.7. Load also taking a long time (forever?) with trimet.org's gtfs data. #18

Open XavierPrudent opened 7 years ago

XavierPrudent commented 7 years ago

Dear authors,

I have run the given example: bin/gtfsdb-load --database_url sqlite:///gtfs.db http://developer.trimet.org/schedule/gtfs.zip

and get a long list of DEBUG outputs which hangs forever at this point:

16:41:00,630 DEBUG [gtfsdb.model.route] Route.load (0 seconds)
16:41:00,646 DEBUG [gtfsdb.model.route] RouteDirection.load (0 seconds)
16:41:01,324 DEBUG [gtfsdb.model.stop] Stop.load (1 seconds)
**16:41:02,624 DEBUG [gtfsdb.model.stop_feature] StopFeature.load (1 seconds)
16:41:02,759 DEBUG [gtfsdb.model.transfer] Transfer.load (0 seconds)
*****16:42:02,480 DEBUG [gtfsdb.model.shape] Shape.load (60 seconds)
/Library/Python/2.7/site-packages/sqlalchemy/sql/sqltypes.py:596: SAWarning: Dialect sqlite+pysqlite does not support Decimal objects natively, and SQLAlchemy must convert from floating point - rounding errors and other issues may occur. Please consider storing Decimal numbers as strings or integers on this platform for lossless storage.
  'storage.' % (dialect.name, dialect.driver))
16:42:04,079 DEBUG [gtfsdb.model.shape] Pattern.load (2 seconds)
****16:42:07,540 DEBUG [gtfsdb.model.trip] Trip.load (3 seconds)

Is there a way to stop it without corrupting the resulting DB?

Besides that, when I feed it the following open-source GTFS, the load crashes with the traceback below: https://www.donneesquebec.ca/recherche/dataset/e82b9141-09d8-4f85-af37-d84937bc2503/resource/b7f43b2a-2557-4e3b-ba12-5a5c6d4de5b1/download/gtfssherbrooke.zip

Traceback (most recent call last):
  File "bin/gtfsdb-load", line 13, in <module>
    sys.exit(gtfsdb.scripts.gtfsdb_load())
  File "/Users/lavieestuntoucan/Documents/projets_perso/Start-up/Civilia/tech/STS/GTFS-rt/gtfsdb/gtfsdb/scripts.py", line 10, in gtfsdb_load
    database_load(args.file, kwargs)
  File "/Users/lavieestuntoucan/Documents/projets_perso/Start-up/Civilia/tech/STS/GTFS-rt/gtfsdb/gtfsdb/api.py", line 20, in database_load
    gtfs.load(db, kwargs)
  File "/Users/lavieestuntoucan/Documents/projets_perso/Start-up/Civilia/tech/STS/GTFS-rt/gtfsdb/gtfsdb/model/gtfs.py", line 34, in load
    cls.load(db, *load_kwargs)
  File "/Users/lavieestuntoucan/Documents/projets_perso/Start-up/Civilia/tech/STS/GTFS-rt/gtfsdb/gtfsdb/model/base.py", line 141, in load
    db.engine.execute(table.insert(), records)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 2055, in execute
    return connection.execute(statement, multiparams, **params)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 945, in execute
    return meth(self, multiparams, params)
  File "/Library/Python/2.7/site-packages/sqlalchemy/sql/elements.py", line 263, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 1053, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 1189, in _execute_context
    context)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 1393, in _handle_dbapi_exception
    exc_info
  File "/Library/Python/2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context
    context)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/default.py", line 470, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (sqlite3.ProgrammingError) You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. [SQL: u'INSERT INTO agency (agency_id, agency_name, agency_url, agency_timezone, agency_lang, agency_phone) VALUES (?, ?, ?, ?, ?, ?)'] [parameters: ('0', 'Soci\xc3\xa9t\xc3\xa9 de Transport de Sherbrooke', 'http://www.sts.qc.ca/', 'America/Montreal', 'FR', '819-564-2687')]

I am quite confused by the "You must not use 8-bit bytestrings" message. Can it not deal with any text file?

Thanks in advance, regards, Xavier
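
On the question of stopping a load midway: SQLite rolls back any transaction that was never committed, so an interrupted insert should leave the file readable, though tables loaded before the interruption stay in place and the dataset would be incomplete. The sketch below illustrates that behaviour with the standard sqlite3 module only. It does not reproduce how gtfsdb batches its inserts, and the `trips` table and `load_rows` helper are hypothetical names made up for the illustration.

```python
import sqlite3

# In-memory database for illustration; a real load would target a file such as gtfs.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (trip_id TEXT PRIMARY KEY)")
conn.commit()


def load_rows(conn, rows, fail_after=None):
    """Insert all rows in one transaction; roll back if the load is interrupted.

    fail_after is a hypothetical knob that simulates Ctrl-C after N rows.
    """
    try:
        for i, row in enumerate(rows):
            if fail_after is not None and i == fail_after:
                raise KeyboardInterrupt  # simulated interruption
            conn.execute("INSERT INTO trips (trip_id) VALUES (?)", (row,))
        conn.commit()
        return True
    except KeyboardInterrupt:
        # Discard the partial batch so no half-loaded rows remain.
        conn.rollback()
        return False


interrupted = load_rows(conn, ["t1", "t2", "t3"], fail_after=2)
count_after_interrupt = conn.execute("SELECT COUNT(*) FROM trips").fetchone()[0]

completed = load_rows(conn, ["t1", "t2", "t3"])
count_after_complete = conn.execute("SELECT COUNT(*) FROM trips").fetchone()[0]
```

After the simulated interruption the table is empty rather than half filled, and re-running the load from the start succeeds.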

fpurcell commented 6 years ago

I just tried gtfsdb with the Sherbrooke gtfs, and I do see both problems with sqlite. Things work fine with Postgres. (The UTF-8 issues look similar to issues with MS SQL Server ... maybe Python 3 will magically help.)
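
Python 3 would plausibly help because its csv module works on text rather than bytes, so every field is already a Unicode string by the time it reaches the database driver. A minimal sketch of reading a GTFS file that way follows, assuming the feed is UTF-8 encoded. The `read_rows` helper is a hypothetical name, not part of gtfsdb, and the sample bytes mirror the agency row from the error message.

```python
import csv
import io

# Raw bytes as they might appear in agency.txt inside the zip (UTF-8 encoded).
raw = (
    b"agency_id,agency_name,agency_url,agency_timezone,agency_lang,agency_phone\n"
    b"0,Soci\xc3\xa9t\xc3\xa9 de Transport de Sherbrooke,http://www.sts.qc.ca/,"
    b"America/Montreal,FR,819-564-2687\n"
)


def read_rows(binary_file, encoding="utf-8-sig"):
    """Decode a binary CSV stream and return its rows as dicts of text.

    "utf-8-sig" also strips a leading byte-order mark if the file has one.
    """
    text = io.TextIOWrapper(binary_file, encoding=encoding, newline="")
    return list(csv.DictReader(text))


rows = read_rows(io.BytesIO(raw))
agency_name = rows[0]["agency_name"]
```

Here `agency_name` comes back as the text "Société de Transport de Sherbrooke" instead of the escaped byte sequence shown in the traceback.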

XavierPrudent commented 6 years ago

Hello Frank,

Which gtfs did you use as an input?

Regards,

Xavier


--

Xavier Prudent

Data Scientist - Data Mining - Machine Learning

Web : www.xavierprudent.com http://www.xavierprudent.com Tel (Québec) : (514) 668 76 46 Skype : xavierprudent

fpurcell commented 6 years ago

I used the link above, Xavier.

bin/gtfsdb-load --database_url sqlite:///gtfs.db https://www.donneesquebec.ca/recherche/dataset/e82b9141-09d8-4f85-af37-d84937bc2503/resource/b7f43b2a-2557-4e3b-ba12-5a5c6d4de5b1/download/gtfssherbrooke.zip

sqlalchemy.exc.ProgrammingError: (pysqlite2.dbapi2.ProgrammingError) You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. [SQL: u'INSERT INTO agency (agency_id, agency_name, agency_url, agency_timezone, agency_lang, agency_phone) VALUES (?, ?, ?, ?, ?, ?)'] [parameters: ('0', 'Soci\xc3\xa9t\xc3\xa9 de Transport de Sherbrooke', 'http://www.sts.qc.ca/', 'America/Montreal', 'FR', '819-564-2687')]
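
The parameters in this error are byte strings: 'Soci\xc3\xa9t\xc3\xa9' is the UTF-8 encoding of "Société", and that is what the SQLite driver refuses under Python 2.7. One possible workaround, sketched below, is to decode every byte string to text before the insert. This is an illustration only, not the fix adopted by gtfsdb: the `to_text` helper is a hypothetical name, the feed is assumed to be UTF-8, and the table is reduced to plain TEXT columns.

```python
import sqlite3


def to_text(value, encoding="utf-8"):
    """Decode byte strings to text; leave every other value untouched."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# The parameters from the error message, still as undecoded byte strings.
params = (
    b"0",
    b"Soci\xc3\xa9t\xc3\xa9 de Transport de Sherbrooke",
    b"http://www.sts.qc.ca/",
    b"America/Montreal",
    b"FR",
    b"819-564-2687",
)
clean = tuple(to_text(p) for p in params)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE agency (agency_id TEXT, agency_name TEXT, agency_url TEXT, "
    "agency_timezone TEXT, agency_lang TEXT, agency_phone TEXT)"
)
conn.execute(
    "INSERT INTO agency (agency_id, agency_name, agency_url, agency_timezone, "
    "agency_lang, agency_phone) VALUES (?, ?, ?, ?, ?, ?)",
    clean,
)
stored_name = conn.execute("SELECT agency_name FROM agency").fetchone()[0]
```

Because `bytes` is an alias for `str` on Python 2.7, the same helper would turn the offending values into `unicode` objects there, which is the type the error message recommends.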

XavierPrudent commented 6 years ago

Hello Frank,

I see. This link has not been updated since 4 December, as the Sherbrooke Transportation Company changed their GTFS. I have already notified them. In the meantime, I invite you to use the enclosed GTFS as input.

Regards, Xavier


fpurcell commented 6 years ago

RE: trimet.org GTFS not working with SQLite:

Use the --ignore_blocks flag (it will skip creating a roll-up view of trips assigned to a block).

bin/gtfsdb-load --database_url sqlite:///gtfs.db --ignore_blocks http://developer.trimet.org/schedule/gtfs.zip

XavierPrudent commented 6 years ago

Hello Frank,

I will need a little more explanation than a simple copy and paste of the error and of the command line.

Xavier


XavierPrudent commented 6 years ago

Hello Frank,

I have been informed that the new GTFS has been uploaded, so you should not run into any problems now.

Regards,

Xavier
