findie / wikidata-neo4j-importer

Imports WikiData JSON dumps into Neo4j in a meaningful way.

import fails with error on "There is not enough memory to perform the current task." #12

Open kuczera opened 5 years ago

kuczera commented 5 years ago

My settings:

dbms.memory.heap.initial_size=8G
dbms.memory.heap.max_size=24G
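(Side note, not from the original thread: Neo4j's memory-configuration guidance generally recommends setting the initial and max heap to the same value, so the JVM never has to grow the heap mid-run. A sketch, assuming 24G is what the machine can actually spare:

     dbms.memory.heap.initial_size=24G
     dbms.memory.heap.max_size=24G

This alone will not fix an OutOfMemoryError caused by a single oversized transaction, but it removes one variable.)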

Generating and linking claims (6) : 837.788ms
Started linking :Quantity to :Item
Linkied :Quantity to :Item : 15552434.502ms
Started linking :GlobeCoordinate to :Item
Structure {
  signature: 127,
  fields: [
    {
      code: 'Neo.TransientError.General.OutOfMemoryError',
      message: "There is not enough memory to perform the current task. Please try increasing 'dbms.memory.heap.max_size' in the neo4j configuration (normally in 'conf/neo4j.conf' or, if you you are using Neo4j Desktop, found through the user interface) or if you are running an embedded installation increase the heap by using '-Xmx' command line flag, and then restart the database."
    }
  ]
}
exiting in 5 sec

legraphista commented 5 years ago

This is strange. Can you try to run this query manually?

MATCH (q:GlobeCoordinate)
     WHERE q.globe STARTS WITH 'http'
WITH
     q,
     TRIM(SPLIT(q.globe,'/')[-1]) AS itemId
MATCH (e:Entity) WHERE e.id = itemId
MERGE (q)-[:GLOBE_TYPE]->(e)
REMOVE q.globe

If this fails too, could you try inserting a LIMIT 1000 (or another value) between the WITH and the second MATCH, and re-run the query until no more changes are committed to the db?
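The batched variant being suggested would look like the following (a sketch; the LIMIT value is illustrative and should be tuned to whatever fits in the heap):

    MATCH (q:GlobeCoordinate)
         WHERE q.globe STARTS WITH 'http'
    WITH
         q,
         TRIM(SPLIT(q.globe,'/')[-1]) AS itemId
    LIMIT 1000
    MATCH (e:Entity) WHERE e.id = itemId
    MERGE (q)-[:GLOBE_TYPE]->(e)
    REMOVE q.globe

Because REMOVE q.globe strips the property that the WHERE clause matches on, each run processes a fresh batch; repeating the query until it reports zero updated properties drains the whole set without holding everything in one transaction.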

kuczera commented 5 years ago

Without a limit, the following error comes up: There is not enough memory to perform the current task. Please try increasing 'dbms.memory.heap.max_size' in the neo4j configuration (normally in 'conf/neo4j.conf' or, if you you are using Neo4j Desktop, found through the user interface) or if you are running an embedded installation increase the heap by using '-Xmx' command line flag, and then restart the database.

kuczera commented 5 years ago

Running this query several times works:

    MATCH (q:GlobeCoordinate)
         WHERE q.globe STARTS WITH 'http'
    WITH
         q,
         TRIM(SPLIT(q.globe,'/')[-1]) AS itemId
    LIMIT 10000000
    MATCH (e:Entity) WHERE e.id = itemId
    MERGE (q)-[:GLOBE_TYPE]->(e)
    REMOVE q.globe;
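(If the APOC plugin is installed, which is an assumption and not something the importer requires, the manual re-running can be automated with apoc.periodic.iterate, which commits in batches instead of one giant transaction. A sketch; batchSize is illustrative:

    CALL apoc.periodic.iterate(
      "MATCH (q:GlobeCoordinate) WHERE q.globe STARTS WITH 'http'
       RETURN q, TRIM(SPLIT(q.globe,'/')[-1]) AS itemId",
      "MATCH (e:Entity) WHERE e.id = itemId
       MERGE (q)-[:GLOBE_TYPE]->(e)
       REMOVE q.globe",
      {batchSize: 10000, parallel: false}
    );

The first argument streams the candidate rows, the second runs per batch in its own transaction, so peak heap usage stays bounded by the batch size rather than the full result set.)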

I hope this helps to improve the code. Or do I have to do it manually every time?

kuczera commented 5 years ago

sorry for closing ;-)