graphaware / neo4j-uuid

GraphAware Runtime Module that assigns a UUID to all nodes (and relationships) in the graph transparently
103 stars 22 forks source link

Uuid plugins hangs write query after upgrade #55

Closed dmiranda2791 closed 4 years ago

dmiranda2791 commented 5 years ago

We can't create or update new data after we upgraded the the database to v3.5.9 and consequently the plugins.

We are using:

After a log of time of executing a query with a create statement I see this exceptions in my neo4j.log.

If I remove graphaware uuid plugin jar and its configurations it all works.

image

I'd appreciate any help :)

luanne commented 5 years ago

Thanks for reporting this, will look into it

luanne commented 5 years ago

@dmiranda2791 which version of Neo4j did you upgrade from, and was it Enterprise or Community? How did you upgrade? Did you configure the plugins first, start Neo4j, change password, shutdown, copy your database? Or another way?

luanne commented 5 years ago

Closing based on https://community.neo4j.com/t/slow-writes-after-upgrade/9883/14?u=luanne_misquitta

dmiranda2791 commented 5 years ago

@luanne, we are experiencing this issue again :(.

Using Neo4j Enterprise we upgraded from v3.1.1 to 3.3.9 then to 3.5.9, we wanted to upgrade to the latest so Neo4j's upgrade guide suggested to go through v3.3.9 first.

This is our plugins config

# Graphaware framework
dbms.unmanaged_extension_classes=com.graphaware.server=/graphaware

# Graphaware uuid
com.graphaware.runtime.enabled=true

#UIDM becomes the module ID:
com.graphaware.module.UIDM.1=com.graphaware.module.uuid.UuidBootstrapper
# uuids on relationships too
com.graphaware.module.UIDM.relationship=com.graphaware.runtime.policy.all.IncludeAllBusinessRelationships

Plugins in used:

image

Our server instance has 30 GB RAM, I have set different configurations of heap and page cache but no luck so far. I have tried with 18GB Heap and 8Gb for page cache and also 8GB for heap and 18GB for page cache and even no setup at all. Our db size is about 14GB in the new version.

On each start I see a message on neo4j.log saying UUID Module will initialize, then a lot of GC messages appear on the debug.log. Similar to these

2019-09-11 16:33:11.966+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=109, gcTime=115, gcCount=1}
2019-09-11 16:34:57.481+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=104, gcTime=121, gcCount=1}
2019-09-11 16:35:31.614+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=108, gcTime=117, gcCount=1}
2019-09-11 16:36:10.861+0000 WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=115, gcTime=116, gcCount=1}

CPU Usage of the java command stays greater than 100% for as long as the process run. When I try to stop it using sudo service neo4j stop. A huge amount of these exceptions appear on the debug.log

2019-09-11 17:57:08.859+0000 WARN [c.g.t.e.s.KeepCalmAndCarryOn] An exception occurred while executing transaction This database is shutdown.
org.neo4j.graphdb.DatabaseShutdownException: This database is shutdown.
    at org.neo4j.kernel.impl.coreapi.CoreAPIAvailabilityGuard.assertDatabaseAvailable(CoreAPIAvailabilityGuard.java:56)
    at org.neo4j.graphdb.facade.spi.ClassicCoreSPI.beginTransaction(ClassicCoreSPI.java:185)
    at org.neo4j.kernel.impl.factory.GraphDatabaseFacade.beginTransactionInternal(GraphDatabaseFacade.java:691)
    at org.neo4j.kernel.impl.factory.GraphDatabaseFacade.beginTransaction(GraphDatabaseFacade.java:375)
    at org.neo4j.kernel.impl.factory.GraphDatabaseFacade.beginTx(GraphDatabaseFacade.java:363)
    at com.graphaware.tx.executor.single.SimpleTransactionExecutor.doExecuteInTransaction(SimpleTransactionExecutor.java:66)
    at com.graphaware.tx.executor.single.SimpleTransactionExecutor.executeInTransaction(SimpleTransactionExecutor.java:58)
    at com.graphaware.tx.executor.batch.IterableInputBatchTransactionExecutor.processQueue(IterableInputBatchTransactionExecutor.java:103)
    at com.graphaware.tx.executor.batch.IterableInputBatchTransactionExecutor.doExecute(IterableInputBatchTransactionExecutor.java:76)
    at com.graphaware.tx.executor.batch.DisposableBatchTransactionExecutor.execute(DisposableBatchTransactionExecutor.java:35)
    at com.graphaware.module.uuid.UuidModule.initialize(UuidModule.java:121)
    at com.graphaware.runtime.manager.ProductionTxDrivenModuleManager.initialize(ProductionTxDrivenModuleManager.java:57)
    at com.graphaware.runtime.manager.BaseTxDrivenModuleManager.initializeIfAllowed(BaseTxDrivenModuleManager.java:128)
    at com.graphaware.runtime.manager.BaseTxDrivenModuleManager.handleNoMetadata(BaseTxDrivenModuleManager.java:72)
    at com.graphaware.runtime.manager.BaseTxDrivenModuleManager.handleNoMetadata(BaseTxDrivenModuleManager.java:39)
    at com.graphaware.runtime.manager.BaseModuleManager.loadMetadata(BaseModuleManager.java:143)
    at com.graphaware.runtime.manager.BaseModuleManager.loadMetadata(BaseModuleManager.java:125)
    at com.graphaware.runtime.TxDrivenRuntime.loadMetadata(TxDrivenRuntime.java:130)
    at com.graphaware.runtime.ProductionRuntime.loadMetadata(ProductionRuntime.java:80)
    at com.graphaware.runtime.BaseGraphAwareRuntime.startModules(BaseGraphAwareRuntime.java:154)
    at com.graphaware.runtime.TxDrivenRuntime.startModules(TxDrivenRuntime.java:146)
    at com.graphaware.runtime.ProductionRuntime.startModules(ProductionRuntime.java:70)
    at com.graphaware.runtime.BaseGraphAwareRuntime.start(BaseGraphAwareRuntime.java:134)
    at com.graphaware.runtime.bootstrap.RuntimeKernelExtension.lambda$start$0(RuntimeKernelExtension.java:117)
    at java.lang.Thread.run(Thread.java:748)
2019-09-11 17:57:08.859+0000 WARN [c.g.t.e.b.IterableInputBatchTransactionExecutor] Rolled back transaction for batch number 678751
2019-09-11 17:57:08.859+0000 WARN [c.g.t.e.b.IterableInputBatchTransactionExecutor] Throwing away the head of the queue as the transaction seems to have failed before polling...

I have no other option than sudo kill -9 <id> to try a new configuration or solution.

dmiranda2791 commented 5 years ago

I decided to change the AWS Instance type and after about 2 hours it let me do write queries. I had dbms.logs.debug.level=DEBUG and I saw a lot of transactions being started and committed.

I was not monitoring the first time so I decided to take a risk and restarted the db instance. The whole initialization process started for the module again.

There were a lot of transactions being started and committed. I saw to long running sequences, I guess one for nodes and one for relationships. The first sequence finished and the second one started immediately. Both sequences printed a message at the end:

I guess this re-initialization should not start on each startup of the DB, I did not changed any module specific configuration before restart; even so, the initialization started.

My goal now is to find a way to avoid re-initialization on each start if possible, so that we can restart the instance with confidence. I guess that the more data you have, the longer this process takes.

Here's a summary of neo4j.log

2019-09-11 22:07:34.252+0000 INFO  ======== Neo4j 3.5.9 ========
2019-09-11 22:07:34.260+0000 INFO  Starting...
2019-09-11 22:07:35.311+0000 INFO  Initiating metrics...
2019-09-11 22:07:37.016+0000 INFO  GraphAware Runtime enabled, bootstrapping...
2019-09-11 22:07:37.042+0000 INFO  Bootstrapping module with order 1, ID UIDM, using com.graphaware.module.uuid.UuidBootstrapper
2019-09-11 22:07:37.109+0000 INFO  Relationship Inclusion Policy set to com.graphaware.common.policy.inclusion.composite.CompositeRelationshipInclusionPolicy@612003a1
2019-09-11 22:07:37.118+0000 INFO  Registering module UIDM with GraphAware Runtime.
2019-09-11 22:07:37.119+0000 INFO  GraphAware Runtime bootstrapped, starting the Runtime...
2019-09-11 22:07:37.384+0000 INFO  Starting GraphAware...
2019-09-11 22:07:37.385+0000 INFO  Loading module metadata...
2019-09-11 22:07:37.386+0000 INFO  Loading metadata for module UIDM
2019-09-11 22:07:37.558+0000 INFO  Module UIDM seems to have been registered before, metadata loaded successfully.
2019-09-11 22:07:37.558+0000 INFO  Module UIDM seems to have changed configuration since last run, will try to re-initialize...
2019-09-11 22:07:37.559+0000 INFO  InitializeUntil set to 9223372036854775807 and it is 1568239657559. Will re-initialize.
2019-09-11 22:07:37.579+0000 DEBUG Starting a transaction for batch number 1
2019-09-11 22:07:39.695+0000 DEBUG Committed transaction for batch number 1
...
2019-09-11 22:30:30.560+0000 DEBUG Starting a transaction for batch number 8430
2019-09-11 22:30:30.575+0000 DEBUG Committed transaction for batch number 8430
2019-09-11 22:30:30.575+0000 DEBUG Successfully executed 8429061 (out of 8429061 ) steps in 8430 batches
2019-09-11 22:30:30.576+0000 DEBUG Starting a transaction for batch number 1
2019-09-11 22:30:32.577+0000 DEBUG Committed transaction for batch number 1
...
019-09-11 23:59:16.288+0000 DEBUG Successfully executed 8616949 (out of 8616949 ) steps in 8617 batches
2019-09-11 23:59:16.290+0000 INFO  Module metadata loaded.
2019-09-11 23:59:16.290+0000 INFO  Starting transaction-driven modules...
2019-09-11 23:59:16.290+0000 INFO  Transaction-driven modules started.
2019-09-11 23:59:16.290+0000 INFO  There are no timer-driven runtime modules. Not scheduling any tasks.
2019-09-11 23:59:16.290+0000 INFO  GraphAware started.
2019-09-11 23:59:16.290+0000 INFO  GraphAware Runtime automatically started.
2019-09-11 23:59:16.300+0000 INFO  Mounting GraphAware Framework at /graphaware
2019-09-11 23:59:16.308+0000 INFO  Will try to scan the following packages: {com.**.graphaware.**,org.**.graphaware.**,net.**.graphaware.**}
2019-09-11 23:59:17.331+0000 DEBUG Initializing com.graphaware.server.tx.LongRunningTransaction
2019-09-11 23:59:17.347+0000 DEBUG Initializing com.graphaware.server.tx.LongRunningTransaction
2019-09-11 23:59:18.063+0000 DEBUG Initializing com.graphaware.server.tx.LongRunningTransaction
2019-09-11 23:59:18.187+0000 DEBUG Initializing com.graphaware.server.tx.LongRunningTransaction
2019-09-11 23:59:18.474+0000 DEBUG Initializing com.graphaware.server.tx.LongRunningTransaction
2019-09-11 23:59:18.490+0000 DEBUG Initializing com.graphaware.server.tx.LongRunningTransaction
2019-09-11 23:59:18.609+0000 INFO  Remote interface available at http://localhost:7474/