google / transitfeed

A Python library for reading, validating, and writing transit schedule information in the GTFS format.
https://github.com/google/transitfeed/wiki
Apache License 2.0

MemoryError #432

Open muyuqiao opened 7 years ago

muyuqiao commented 7 years ago

I saw a similar issue here, but no solution has been found yet. Strangely, after it first happened, it also started happening with my old feeds, which had validated successfully before.

Another PC (same OS, Windows 10; same amount of memory; same version of the feed validator) works just fine.

The feed file is about 45 MB, and stop_times.txt has more than 7 million lines. Below is the content of transitfeedcrash.txt.

Yikes, the program threw an unexpected exception!

Hopefully a complete report has been saved to transitfeedcrash.txt, though if you are seeing this message we've already disappointed you once today. Please include the report in a new issue at https://github.com/google/transitfeed/issues or an email to the public group transitfeed@googlegroups.com. Sorry!


transitfeed version 1.2.15

File "feedvalidator.py", line 609, in main feed = C:\Temp\transitfeed\gtfs_feed_SGP.zip options = {'check_duplicate_trips': False, 'manual_entry': True, 'extension': None, 'error_types_ignore_list': None, 'memory_db': False, 'service_gap_interval': 13, 'latest_version': '', 'limit_per_type': 5, 'performance': None, 'output': 'validation-results.html'}

File "feedvalidator.py", line 704, in RunValidationFromOptions feed = C:\Temp\transitfeed\gtfs_feed_SGP.zip options = {'check_duplicate_trips': False, 'manual_entry': True, 'extension': None, 'error_types_ignore_list': None, 'memory_db': False, 'service_gap_interval': 13, 'latest_version': '', 'limit_per_type': 5, 'performance': None, 'output': 'validation-results.html'}

File "feedvalidator.py", line 507, in RunValidationOutputFromOptions feed = C:\Temp\transitfeed\gtfs_feed_SGP.zip options = {'check_duplicate_trips': False, 'manual_entry': True, 'extension': None, 'error_types_ignore_list': None, 'memory_db': False, 'service_gap_interval': 13, 'latest_version': '', 'limit_per_type': 5, 'performance': None, 'output': 'validation-results.html'}

File "feedvalidator.py", line 514, in RunValidationOutputToFilename feed = C:\Temp\transitfeed\gtfs_feed_SGP.zip output_file = <open file 'validation-results.html', mode 'w' at 0x02B47498> options = {'check_duplicate_trips': False, 'manual_entry': True, 'extension': None, 'error_types_ignore_list': None, 'memory_db': False, 'service_gap_interval': 13, 'latest_version': '', 'limit_per_type': 5, 'performance': None, 'output': 'validation-results.html'} output_filename = validation-results.html

File "feedvalidator.py", line 532, in RunValidationOutputToFile feed = C:\Temp\transitfeed\gtfs_feed_SGP.zip accumulator = <__main__.HTMLCountingProblemAccumulator object at 0x02AEC2B0> output_file = <open file 'validation-results.html', mode 'w' at 0x02B47498> problems = <transitfeed.problems.ProblemReporter object at 0x02B5DBB0> options = {'check_duplicate_trips': False, 'manual_entry': True, 'extension': None, 'error_types_ignore_list': None, 'memory_db': False, 'service_gap_interval': 13, 'latest_version': '', 'limit_per_type': 5, 'performance': None, 'output': 'validation-results.html'}

File "feedvalidator.py", line 589, in RunValidation feed = C:\Temp\transitfeed\gtfs_feed_SGP.zip problems = <transitfeed.problems.ProblemReporter object at 0x02B5DBB0> loader = <transitfeed.loader.Loader instance at 0x02B8B788> extension_module = <module 'transitfeed' from 'C:\Temp\transitfeed\library.zip\transitfeed\__init__.pyc'> options = {'check_duplicate_trips': False, 'manual_entry': True, 'extension': None, 'error_types_ignore_list': None, 'memory_db': False, 'service_gap_interval': 13, 'latest_version': '', 'limit_per_type': 5, 'performance': None, 'output': 'validation-results.html'} gtfs_factory = <transitfeed.gtfsfactory.GtfsFactory object at 0x02B71110>

File "transitfeed\loader.pyc", line 590, in Load self = <transitfeed.loader.Loader instance at 0x02B8B788>

File "transitfeed\loader.pyc", line 532, in _LoadStopTimes self = <transitfeed.loader.Loader instance at 0x02B8B788> stop_time_class = <class 'transitfeed.stoptime.StopTime'>

File "transitfeed\loader.pyc", line 285, in _ReadCSV deprecated = [] self = <transitfeed.loader.Loader instance at 0x02B8B788> required = ['trip_id', 'arrival_time', 'departure_time', 'stop_id', 'stop_sequence'] cols = ['trip_id', 'arrival_time', 'departure_time', 'stop_id', 'stop_sequence', 'stop_headsign', 'pickup_type', 'drop_off_type', 'shape_dist_traveled', 'timepoint'] file_name = stop_times.txt

File "transitfeed\loader.pyc", line 119, in _GetUtf8Contents file_name = stop_times.txt self = <transitfeed.loader.Loader instance at 0x02B8B788>

File "transitfeed\loader.pyc", line 387, in _FileContents file_name = stop_times.txt self = <transitfeed.loader.Loader instance at 0x02B8B788> results = None

File "zipfile.pyc", line 935, in read pwd = None name = stop_times.txt self = <zipfile.ZipFile object at 0x02B75930>

File "zipfile.pyc", line 630, in read self = <zipfile.ZipExtFile object at 0x0F7B6190> buf = n = -1

File "zipfile.pyc", line 684, in read1 lenreadbuffer = 0 self = <zipfile.ZipExtFile object at 0x0F7B6190> data = [raw compressed bytes omitted] nbytes = 46094671 n = 1073741824

MemoryError
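For context on where this fails: the last frames show `zipfile` reading all of stop_times.txt in a single call (`n = -1` in `read`), so the whole decompressed file has to fit in memory at once. The following is only a minimal sketch of the alternative, streaming the rows out of the archive one at a time. It is not transitfeed's loader; it assumes Python 3 (transitfeed 1.2.15 runs on Python 2), and `iter_stop_times` and `count_stop_times` are hypothetical helper names.

```python
import csv
import io
import zipfile


def iter_stop_times(feed_path, member="stop_times.txt"):
    """Yield one dict per CSV row, decompressing the member incrementally.

    Hypothetical helper for illustration; not part of transitfeed.
    """
    with zipfile.ZipFile(feed_path) as feed:
        # ZipFile.open returns a file-like object that decompresses on demand,
        # unlike ZipFile.read, which returns the whole member as one bytes object.
        with feed.open(member) as raw:
            # utf-8-sig drops a leading byte-order mark if the file has one.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            for row in csv.DictReader(text):
                yield row


def count_stop_times(feed_path):
    """Count rows without holding more than one row in memory at a time."""
    return sum(1 for _ in iter_stop_times(feed_path))
```

Calling `count_stop_times` on a feed zip is a quick way to confirm that the archive itself is readable, independent of the validator.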

matt-wirtz commented 7 years ago

Hi,

yes, I'm getting the same error using transitfeed version 1.2.15. The whole schedule is over 50 MB in size.

The last entries in the transitfeedcrash.txt file are:

File "transitfeed\loader.pyc", line 285, in _ReadCSV deprecated = [] self = <transitfeed.loader.Loader instance at 0x026D3DF0> required = ['trip_id', 'arrival_time', 'departure_time', 'stop_id', 'stop_sequence'] cols = ['trip_id', 'arrival_time', 'departure_time', 'stop_id', 'stop_sequence', 'stop_headsign', 'pickup_type', 'drop_off_type', 'shape_dist_traveled', 'timepoint'] file_name = stop_times.txt

File "transitfeed\loader.pyc", line 119, in _GetUtf8Contents file_name = stop_times.txt self = <transitfeed.loader.Loader instance at 0x026D3DF0>

File "transitfeed\loader.pyc", line 394, in _FileContents data_file = <open file 'gtfs\stop_times.txt', mode 'rb' at 0x027819C0> file_name = stop_times.txt self = <transitfeed.loader.Loader instance at 0x026D3DF0> results = None

MemoryError
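One thing that may be worth ruling out: the object addresses in these reports all fit in 32 bits, which would be consistent with a 32-bit Python process, although neither report says so. That is an inference, not a diagnosis. A 32-bit process has only a few gigabytes of address space no matter how much RAM is installed, so a large single read can fail even on a machine with plenty of memory. A small sketch for checking the interpreter width (`pointer_bits` and `describe_interpreter` are hypothetical names):

```python
import struct
import sys


def pointer_bits():
    """Return the pointer width of the running interpreter in bits."""
    # struct.calcsize("P") is the size in bytes of a C pointer (void *).
    return struct.calcsize("P") * 8


def describe_interpreter():
    """Return a short description such as 'Python 3.11, 64-bit'."""
    return "Python %d.%d, %d-bit" % (
        sys.version_info[0],
        sys.version_info[1],
        pointer_bits(),
    )


if __name__ == "__main__":
    print(describe_interpreter())
```

This only reports on the interpreter that runs the script, so it says nothing about a separately packaged validator executable.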

jodyswartz commented 6 years ago

I seem to be hitting the same issue as well.