What GTFS are you testing with?
ftp://ftp_public_datasets:5Adbe446c7@upload.canaltp.fr/data/fr-idf/prod_FR-IDF-OPEN_NAViTiA_RTNTFS_V1.zip (network.txt = agency.txt)
Do you want my model? Or perhaps a zip of my fork with data.zip (sent via your Yahoo mail)?
Well, that's got to be the largest GTFS I've ever seen. ;)
I think I know why the error is happening and have an idea of how to fix it.
As for all the other files in your feed, I'm trying to maintain compatibility with the GTFS spec only. You're always able to create a fork of this library and customize it for your custom GTFS.
OK thank you :)
I avoided the problem by upgrading Node from 0.10.* to 0.12.7 (very soon 4.0 ^^) and running it with the flag --max-old-space-size=30720 (I have 32 GB of RAM); node 0.10 is clamped to a maximum of 512 MB!
If it helps, the code consumes around 29 GB on my stop_times.txt, and the PostGIS database ends up near 4 GB.
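For anyone hitting the same wall, here's a tiny library-agnostic snippet (plain core Node) to watch the heap while stop_times.txt loads; the launch line in the comment is just an example:

```js
// Illustrative only: log memory use every few seconds to confirm the larger
// heap from --max-old-space-size is actually available and being used.
// Example launch (script name is a placeholder):
//   node --max-old-space-size=30720 import.js
setInterval(function () {
  var m = process.memoryUsage();
  console.log('rss ' + Math.round(m.rss / 1048576) + ' MB, ' +
              'heapUsed ' + Math.round(m.heapUsed / 1048576) + ' MB');
}, 5000).unref(); // unref() so the timer never keeps the process alive by itself
```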
Perhaps the callbacks? http://stackoverflow.com/questions/29880741/request-makes-process-out-of-memory-when-downloading-big-media-files
I'm pretty sure it's from the bulk inserter function getting all rows queued up while it synchronously inserts 1000 rows at a time.
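Roughly the pattern I mean (names like csvStream and bulkInsert are illustrative, not the actual code): the parser is never paused, so rows pile up in memory far faster than the 1000-row batches drain them.

```js
// Illustrative sketch of the problem, not the library's real code.
var queue = [];
var inserting = false;

csvStream.on('data', function (row) {
  queue.push(row);                    // every parsed row is buffered here
  if (!inserting) { insertNextBatch(); }
});

function insertNextBatch() {
  if (queue.length === 0) { inserting = false; return; }
  inserting = true;
  var batch = queue.splice(0, 1000);
  bulkInsert('stop_times', batch, insertNextBatch);  // hypothetical insert helper
}
// On a 1.3 GB stop_times.txt the queue grows much faster than it drains,
// so the process exhausts the heap before the inserts can catch up.
```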
I ended up creating a whole new helper library to solve this one: https://github.com/djstroky/db-streamer
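The gist of the fix (a sketch of the idea, not db-streamer's actual API) is backpressure: pause the parser while each batch is written and resume it afterwards, so only about one batch is ever held in memory.

```js
// Sketch only; csvStream and bulkInsert are the same hypothetical names as above.
var batch = [];

csvStream.on('data', function (row) {
  batch.push(row);
  if (batch.length >= 1000) {
    csvStream.pause();                           // backpressure on the parser
    bulkInsert('stop_times', batch, function (err) {
      if (err) { return csvStream.emit('error', err); }
      batch = [];
      csvStream.resume();                        // ask for the next rows
    });
  }
});

csvStream.on('end', function () {
  bulkInsert('stop_times', batch, function () {}); // flush the final partial batch
});
```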
Regarding that particular feed, there were some other problems as well.
Because of all that I didn't get to test loading the stop_times.txt table, but this fix should nonetheless solve the original issue.
Processing downloads/gtfs/stop_times.txt
FATAL ERROR: CALL_AND_RETRY_2 Allocation failed - process out of memory
Aborted (core dumped)
(database = postgres & stop_times.txt = 1.3 GB)