MontrealCorpusTools / PolyglotDB

Language data store and linguistic query API
MIT License

Running out of memory when importing corpus #99

Closed MichaelGoodale closed 5 years ago

MichaelGoodale commented 6 years ago

So, when I try to import the Spade-ICE-Can corpus, I get an out-of-memory error when I have about half a gig of RAM left.

ps-worker   | [2018-07-09 18:28:20,613: INFO/ForkPoolWorker-1] Finished loading phone relationships!
ps-worker   | [2018-07-09 18:28:20,614: INFO/ForkPoolWorker-1] Loading phone relationships...
ps-worker   | [2018-07-09 18:30:19,967: ERROR/ForkPoolWorker-1] Task pgdb.tasks.import_corpus_task[5a65e94b-2d24-4bb4-8409-df00755b5b52] raised unexpected: TransientError("There is not enough memory to perform the current task. Please try increasing 'dbms.memory.heap.max_size' in the neo4j configuration (normally in 'conf/neo4j.conf' or, if you you are using Neo4j Desktop, found through the user interface) or if you are running an embedded installation increase the heap by using '-Xmx' command line flag, and then restart the database.",)
ps-worker   | Traceback (most recent call last):
ps-worker   |   File "/site/env/lib/python3.6/site-packages/celery/app/trace.py", line 375, in trace_task
ps-worker   |     R = retval = fun(*args, **kwargs)
ps-worker   |   File "/site/env/lib/python3.6/site-packages/celery/app/trace.py", line 632, in __protected_call__
ps-worker   |     return self.run(*args, **kwargs)
ps-worker   |   File "/site/proj/pgdb/tasks.py", line 9, in import_corpus_task
ps-worker   |     corpus.import_corpus()
ps-worker   |   File "/site/proj/pgdb/models.py", line 528, in import_corpus
ps-worker   |     c.load(parser, self.source_directory)
ps-worker   |   File "/site/proj/PolyglotDB/polyglotdb/corpus/importable.py", line 129, in load
ps-worker   |     could_not_parse = self.load_directory(parser, path)
ps-worker   |   File "/site/proj/PolyglotDB/polyglotdb/corpus/importable.py", line 247, in load_directory
ps-worker   |     self.finalize_import(data, call_back, parser.stop_check)
ps-worker   |   File "/site/proj/PolyglotDB/polyglotdb/corpus/importable.py", line 68, in finalize_import
ps-worker   |     import_csvs(self, data, call_back, stop_check)
ps-worker   |   File "/site/proj/PolyglotDB/polyglotdb/io/importer/from_csv.py", line 196, in import_csvs
ps-worker   |     corpus_context.execute_cypher(s)
ps-worker   |   File "/site/proj/PolyglotDB/polyglotdb/corpus/base.py", line 98, in execute_cypher
ps-worker   |     results = session.run(statement, **parameters)
ps-worker   |   File "/site/env/lib/python3.6/site-packages/neo4j/v1/api.py", line 325, in run
ps-worker   |     self._connection.fetch()
ps-worker   |   File "/site/env/lib/python3.6/site-packages/neo4j/bolt/connection.py", line 290, in fetch
ps-worker   |     return self._fetch()
ps-worker   |   File "/site/env/lib/python3.6/site-packages/neo4j/bolt/connection.py", line 330, in _fetch
ps-worker   |     response.on_failure(summary_metadata or {})
ps-worker   |   File "/site/env/lib/python3.6/site-packages/neo4j/v1/result.py", line 70, in on_failure
ps-worker   |     raise CypherError.hydrate(**metadata)
ps-worker   | neo4j.exceptions.TransientError: There is not enough memory to perform the current task. Please try increasing 'dbms.memory.heap.max_size' in the neo4j configuration (normally in 'conf/neo4j.conf' or, if you you are using Neo4j Desktop, found through the user interface) or if you are running an embedded installation increase the heap by using '-Xmx' command line flag, and then restart the database.
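For reference, the heap limit the error mentions is set in Neo4j's config file. A minimal sketch of the relevant settings, assuming a stock Neo4j 3.x install (the values below are just an example, not a recommendation; pick them based on how much RAM the machine actually has free):

```
# conf/neo4j.conf — example values only
# Raise the JVM heap so large import transactions have room to run.
dbms.memory.heap.initial_size=2g
dbms.memory.heap.max_size=2g
# The page cache is separate from the heap; together they should still
# leave memory free for the OS and any other services on the machine.
dbms.memory.pagecache.size=1g
```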
james-tanner commented 5 years ago

FYI this also happens when trying to import the Switchboard corpus.

mmcauliffe commented 5 years ago

Just to double-check something: did this happen when running the non-Dockerized version?

mmcauliffe commented 5 years ago

@james-tanner OK, so I've updated Neo4j in iscan-spade-server to the latest version, which includes some performance improvements that may be related to this error. Do you think you could try running the import again on Oka and see if the same thing happens? Be sure to run the reset_database script after pulling the new changes from iscan-spade-server.

mmcauliffe commented 5 years ago

OK, I think I've figured out a solution. You don't need to test on Oka; it's still an issue even with the updated Neo4j. I'm revising some Cypher statements in a way that seems to get around the issue.
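For anyone hitting this in the meantime, one common way to keep a large CSV import from exhausting the Neo4j heap is to commit in batches with USING PERIODIC COMMIT rather than creating everything in a single transaction. A rough sketch of that pattern for Neo4j 3.x — the file name and node properties here are made up for illustration, and this is not necessarily the exact change going into PolyglotDB:

```cypher
// Sketch only — not the actual PolyglotDB import statement.
// USING PERIODIC COMMIT flushes the transaction every 2000 rows,
// so the whole CSV never has to be held in the heap at once.
USING PERIODIC COMMIT 2000
LOAD CSV WITH HEADERS FROM "file:///phones.csv" AS row
CREATE (:phone {label: row.label,
                begin: toFloat(row.begin),
                end: toFloat(row.end)});
```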

mmcauliffe commented 5 years ago

@james-tanner This issue should be resolved now with the latest version of PolyglotDB. For iscan-spade-server, update polyglotdb via pip install -r requirements.txt -U and it should fetch the newest version. Double-check whether it's fixed when you try Switchboard.

james-tanner commented 5 years ago

Still getting the same error on Oka after following these instructions for both SOTC and Switchboard.

james-tanner commented 5 years ago

This is after pulling the latest changes to the repo & updating with pip install -r requirements.txt -U.

mmcauliffe commented 5 years ago

Did you restart the Celery instance after updating? If not, could you try again after restarting it?

james-tanner commented 5 years ago

@mmcauliffe Just tried this and still fails for both Switchboard and SOTC.

james-tanner commented 5 years ago

This is now fine on a non-Docker machine with 8 GB of memory. @MichaelGoodale, is this still an issue on your machine, or can this be closed?

msonderegger commented 5 years ago

@MichaelGoodale? Pinging him on Slack too.

MichaelGoodale commented 5 years ago

Whoops, I guess I didn't see this notification. I haven't tried re-importing on my laptop yet, but I'll try today and see if it makes a difference. It only has 4 GB of RAM, though, so if it doesn't work I don't know whether it's that big a deal.