neo4j-contrib / neo4j-etl

Data import from relational databases to Neo4j.
https://neo4j.com/developer/neo4j-etl/

Neo4jETL 1.4.2 import runs forever? #67

Closed innerforce closed 4 years ago

innerforce commented 4 years ago

Neo4j Desktop 1.2.4, Neo4j ETL 1.4.2, Neo4j DB 3.5.14 (local instance). The source DB is Postgres on a remote server.

Batch import mode; all nodes and relationships are kept at the defaults shown by the wizard. Our Postgres database is small.

The system appears to hang somewhere after 2020-02-17 15:51. The CPU fan of my MacBook keeps running loud, though... Please help, thank you :)

The last lines from debug.log that I can get from the terminal of the database instance:

2020-02-17 15:51:54.010+0000 INFO [o.n.k.i.a.s.ConstraintIndexCreator] Starting constraint creation: Index( UNIQUE, :label[14](property[0]) ).
2020-02-17 15:51:54.011+0000 INFO [o.n.k.i.a.i.IndexPopulationJob] Index population started: [:SourceTable(_row_id_) [provider: {key=native-btree, version=1.0}]]
2020-02-17 15:51:54.036+0000 INFO [o.n.k.i.a.i.IndexPopulationJob] Completed node store scan. Flushing all pending updates.
BatchingMultipleIndexPopulator{activeTasks=0, executor=java.util.concurrent.ThreadPoolExecutor@5d0b7cb3[Running, pool size = 1, active threads = 0, queued tasks = 0, completed tasks = 1], batchedUpdates = [0 updates], queuedUpdates = 0}
2020-02-17 15:51:54.064+0000 INFO [o.n.k.i.a.i.IndexPopulationJob] Index created. Starting data checks. Index [:SourceTable(_row_id_) [provider: {key=native-btree, version=1.0}]] is POPULATING.
2020-02-17 15:51:54.064+0000 INFO [o.n.k.i.a.i.IndexPopulationJob] TIME/PHASE Final: WRITE[totalTime=45ms], FLIP[totalTime=5ms]
2020-02-17 15:51:54.064+0000 INFO [o.n.k.i.a.i.IndexPopulationJob] Shutting down executor.
BatchingMultipleIndexPopulator{activeTasks=0, executor=java.util.concurrent.ThreadPoolExecutor@5d0b7cb3[Running, pool size = 2, active threads = 0, queued tasks = 0, completed tasks = 2], batchedUpdates = [], queuedUpdates = 0}
2020-02-17 15:51:54.064+0000 INFO [o.n.k.i.a.s.ConstraintIndexCreator] Constraint Index( UNIQUE, :label[14](property[0]) ) populated, starting verification.
2020-02-17 15:51:54.065+0000 INFO [o.n.k.i.a.s.ConstraintIndexCreator] Constraint Index( UNIQUE, :label[14](property[0]) ) verified.
2020-02-17 15:51:54.069+0000 INFO [o.n.k.i.a.i.IndexingService] Constraint IndexRule[id=99, descriptor=Index( UNIQUE, :label[14](property[0]) ), provider={key=native-btree, version=1.0}, owner=null] is ONLINE.
2020-02-17 15:51:54.078+0000 INFO [o.n.k.i.a.s.ConstraintIndexCreator] Starting constraint creation: Index( UNIQUE, :label[15](property[3]) ).
2020-02-17 15:51:54.079+0000 INFO [o.n.k.i.a.i.IndexPopulationJob] Index population started: [:RunPlan(id) [provider: {key=native-btree, version=1.0}]]
2020-02-17 15:51:54.107+0000 INFO [o.n.k.i.a.i.IndexPopulationJob] Completed node store scan. Flushing all pending updates.
BatchingMultipleIndexPopulator{activeTasks=0, executor=java.util.concurrent.ThreadPoolExecutor@628b9652[Running, pool size = 1, active threads = 0, queued tasks = 0, completed tasks = 1], batchedUpdates = [0 updates], queuedUpdates = 0}
2020-02-17 15:51:54.135+0000 INFO [o.n.k.i.a.i.IndexPopulationJob] Index created. Starting data checks. Index [:RunPlan(id) [provider: {key=native-btree, version=1.0}]] is POPULATING.
2020-02-17 15:51:54.135+0000 INFO [o.n.k.i.a.i.IndexPopulationJob] TIME/PHASE Final: WRITE[totalTime=49ms], FLIP[totalTime=4ms]
2020-02-17 15:51:54.136+0000 INFO [o.n.k.i.a.i.IndexPopulationJob] Shutting down executor.
BatchingMultipleIndexPopulator{activeTasks=0, executor=java.util.concurrent.ThreadPoolExecutor@628b9652[Running, pool size = 2, active threads = 0, queued tasks = 0, completed tasks = 2], batchedUpdates = [], queuedUpdates = 0}
2020-02-17 15:51:54.136+0000 INFO [o.n.k.i.a.s.ConstraintIndexCreator] Constraint Index( UNIQUE, :label[15](property[3]) ) populated, starting verification.
2020-02-17 15:51:54.136+0000 INFO [o.n.k.i.a.s.ConstraintIndexCreator] Constraint Index( UNIQUE, :label[15](property[3]) ) verified.
2020-02-17 15:51:54.141+0000 INFO [o.n.k.i.a.i.IndexingService] Constraint IndexRule[id=106, descriptor=Index( UNIQUE, :label[15](property[3]) ), provider={key=native-btree, version=1.0}, owner=null] is ONLINE.
2020-02-17 16:18:10.158+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Checkpoint triggered by "Scheduled checkpoint for time threshold" @ txId: 89 checkpoint started...
2020-02-17 16:18:10.194+0000 INFO [o.n.k.i.s.c.CountsTracker] Rotated counts store at transaction 89 to [/Users/jia0001h/Library/Application Support/Neo4j Desktop/Application/neo4jDatabases/database-8bf6601d-bdf0-4942-bb3e-de0179af5102/installation-3.5.14/data/databases/graph.db/neostore.counts.db.b], from [/Users/jia0001h/Library/Application Support/Neo4j Desktop/Application/neo4jDatabases/database-8bf6601d-bdf0-4942-bb3e-de0179af5102/installation-3.5.14/data/databases/graph.db/neostore.counts.db.a].
2020-02-17 16:18:10.243+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Checkpoint triggered by "Scheduled checkpoint for time threshold" @ txId: 89 checkpoint completed in 85ms
2020-02-17 16:18:10.245+0000 INFO [o.n.k.i.t.l.p.LogPruningImpl] No log version pruned, last checkpoint was made in version 0
2020-02-17 16:30:32.547+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=104, gcTime=2, gcCount=1}
2020-02-17 16:31:36.552+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=105, gcTime=0, gcCount=0}
2020-02-17 16:31:45.854+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=104, gcTime=0, gcCount=0}
2020-02-17 16:32:40.103+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=102, gcTime=0, gcCount=0}
2020-02-17 16:32:57.900+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=103, gcTime=0, gcCount=0}
2020-02-17 16:34:33.585+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=105, gcTime=0, gcCount=0}
2020-02-17 16:35:46.132+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=108, gcTime=0, gcCount=0}
2020-02-17 16:36:32.481+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=109, gcTime=0, gcCount=0}
2020-02-17 16:36:40.760+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=117, gcTime=0, gcCount=0}
2020-02-17 16:36:41.896+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=104, gcTime=0, gcCount=0}
2020-02-17 16:37:56.178+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=100, gcTime=0, gcCount=0}
2020-02-17 16:38:13.396+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=101, gcTime=0, gcCount=0}
2020-02-17 16:38:32.808+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=115, gcTime=0, gcCount=0}

innerforce commented 4 years ago

FYI, the Neo4j ETL UI "see logs" view displayed the messages below. "Creating relationships of type TASK" is the last entry, and nothing has appeared since.

- Running ETL on Neo4j 3.5.14 - ENTERPRISE
- Exporting from RDBMS to CSV...
- CSV directory: /var/folders/_r/1j4mt0bd0ts6hbz_771v5ydxb064kl/T/csv-002
- Writing CSV headers for node NODE_fwrev2.pharostable_ae1f80fa-dfcf-4c53-8eb8-b17cb35f98c2
- Writing CSV headers for node NODE_fwrev2.loadeddelivery_f9f6e2e9-27bd-4b21-b378-30553a1a8d9e
- Writing CSV headers for node NODE_fwrev2.tabledependency_f7deae2d-7c4b-49dc-8dab-bed9c459e514
- Writing CSV headers for node NODE_fwrev2.nexttasklock_748af99e-c8bf-48d6-906e-83339b0c76e4
- Writing CSV headers for node NODE_fwrev2.rawvaultmapping_c2aabffb-eb2a-4cd7-bed0-87ee99ed35cf
- Writing CSV headers for node NODE_fwrev2.configuration_a075ad51-86af-4552-8a75-6cadfd1d65fd
- Writing CSV headers for node NODE_fwrev2.schemaversion_93183e4c-85cf-4dbd-9d10-f8070dd67f89
- Writing CSV headers for node NODE_fwrev2.schedule_d328e8ab-e1e7-4413-b0dd-06ef0e295abc
- Writing CSV headers for node NODE_fwrev2.transformationviewcolumn_3b4f771e-abf0-4ba3-b2de-fae87eabd95d
- Writing CSV headers for node NODE_fwrev2.sourcetablecolumn_9067a325-7078-40cf-9146-7459756608ed
- Writing CSV headers for node NODE_fwrev2.layer_1400b851-1b8c-4b26-8d8e-8f4590abe63a
- Writing CSV headers for node NODE_fwrev2.transformationview_f138c53c-19a6-4d47-a5c7-7f7cbad477b3
- Writing CSV headers for node NODE_fwrev2.source_fcaecec7-3238-4062-810e-e1f7d5d41989
- Writing CSV headers for relationship REL_source_c5617484-370a-4139-b89e-c80d8acb0aad
- Writing CSV headers for relationship REL_runplan_b2fa13ed-c082-4739-b1ea-b38943c39cfe
- Writing CSV headers for relationship REL_sourcetable_bebcf651-4438-4b51-9c60-0febcdfe492d
- Writing CSV headers for node NODE_fwrev2.sourcetable_a65acb1b-ac3a-4f7e-9b61-bb9213af49a3
- Writing CSV headers for relationship REL_layer_ae282286-b6f3-48bf-80b9-f2b058534ca2
- Writing CSV headers for node NODE_fwrev2.runplan_bab214eb-3a88-4982-bc6d-ab757e301c83
- Writing CSV headers for relationship REL_task_99c2b685-ee21-4245-a7c2-c0697158a6f9
- Writing CSV headers for relationship REL_layer_17be2f7e-05e9-43b2-bea2-cd10348daaea
- Writing CSV headers for node NODE_fwrev2.task_48c91312-0ff4-4905-a4a8-3f2635e8355f
Export time: 9.389 (s)
- Creating Neo4j store from CSV...
- Direct driver instance 220309324 created for server address localhost:7687
- Start Importing CSV nodes...
- Creating constraints on nodes...
- Created constraints
- Creating nodes...
- Creating node with label PharoTable
- Creating node with label LoadDelivery
- Creating node with label Schedule
- Creating node with label TableDependency
- Creating node with label NextTaskLock
- Creating node with label Configuration
- Creating node with label RawVaultMap
- Creating node with label SourceTableColumn
- Creating node with label TransformationViewColumn
- Creating node with label SchemaVersion
- Creating node with label Layer
- Creating node with label TransformationView
- Creating node with label Task
- Creating node with label Source
- Creating node with label SourceTable
- Creating node with label RunPlan
- Start Importing CSV relationships...
- Creating relationships of type TASK

conker84 commented 4 years ago

neo4j-etl-ui-1.4.2-release.tar.gz

Hi @innerforce, can you please try with the file above?

To test it, you must provide the path where the file is placed, with the file:// prefix, in the area highlighted in red:

[Screenshot: Schermata 2020-02-13 alle 18 41 23]

I look forward to your feedback.

innerforce commented 4 years ago

Thank you conker84, I found the problem later: I have to shut down the Neo4j DB so that the ETL tool will run further :)

conker84 commented 4 years ago

Shutting down the db should not be part of the online import process. Are you sure that it worked?

innerforce commented 4 years ago

Yes, it worked.

innerforce commented 4 years ago

I am now doing the same with admin import from the command line, reusing the CSV files the ETL tool created (as the ETL tool does not work with Neo4j 4.0, and it is mentioned that it is not going to be supported in the future).

I got this error using admin import; any hints? Why am I not allowed to have the same value in different columns of a CSV row (they are not primary keys)?

unexpected error: Duplicate header entries found, first "mycolvalueincsv":string, other (and conflicting) "mycolvalueincsv":string
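
The wording of the message ("Duplicate header entries") reads as though it concerns the CSV header line, where two columns carry the same name, rather than repeated values inside a data row. A minimal sketch for checking a header line for repeated column names follows. It assumes comma-delimited headers in the `name:type` form used by admin import and that only the name part before the colon is compared. The function name and the sample header are hypothetical, made up for illustration.

```python
import csv
import io
from collections import Counter


def duplicate_header_names(header_line, delimiter=","):
    """Return the column names that occur more than once in a header line."""
    fields = next(csv.reader(io.StringIO(header_line), delimiter=delimiter))
    # Keep only the name part before the optional ":type" suffix.
    names = [field.split(":", 1)[0].strip() for field in fields]
    # Fields without a name (for example ":START_ID") are skipped.
    counts = Counter(name for name in names if name)
    return [name for name, count in counts.items() if count > 1]


# Hypothetical header line; these column names are not from the real export.
sample = "id:ID(Task),mycol:string,status:string,mycol:string"
print(duplicate_header_names(sample))
```

If this reports a name, renaming one of the two columns in the header file may be enough to get past the error. That is an assumption to verify against the import documentation for the Neo4j version in use.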

jexp commented 4 years ago

Did you try it again with a current version (1.5.x) of the etl-tool?