Closed AlanSimmons closed 3 months ago
@AlanSimmons A more fine-grained and reliable approach is to monitor the startup logging. Have start.sh print all the logging details from log/debug.log, and wait until you see lines like these:
2024-08-01 16:30:57.823+0000 INFO [o.n.b.p.c.c.n.SocketNettyConnector] Bolt enabled on 0.0.0.0:7687.
2024-08-01 16:30:57.823+0000 INFO [o.n.b.BoltServer] Bolt server started
2024-08-01 16:30:57.823+0000 INFO [o.n.s.A.ServerComponentsLifecycleAdapter] Starting web server
2024-08-01 16:30:59.164+0000 INFO [o.n.s.CommunityNeoWebServer] Remote interface available at http://localhost:7474/
2024-08-01 16:30:59.164+0000 INFO [o.n.s.A.ServerComponentsLifecycleAdapter] Web server started.
2024-08-01 16:30:59.169+0000 INFO [o.n.g.f.DatabaseManagementServiceFactory] id: 8B844E5E83F71F8E06C685C8B86755E3D9E2E14EF1B170FA76C4246768E05127
Then grep the log for Java errors marked with Caused by:. If there are NO errors, change to read-only mode.
Even though the log says the server has started, that does NOT guarantee there are no Java errors preventing the normal lifecycle, as in this trace:
at org.neo4j.kernel.impl.api.index.IndexingService.dontRebuildIndexesInReadOnlyMode(IndexingService.java:361) ~[neo4j-kernel-5.11.0.jar:5.11.0]
at org.neo4j.kernel.impl.api.index.IndexingService.lambda$start$3(IndexingService.java:303) ~[neo4j-kernel-5.11.0.jar:5.11.0]
at org.neo4j.kernel.impl.api.index.IndexMapReference.modify(IndexMapReference.java:48) ~[neo4j-kernel-5.11.0.jar:5.11.0]
at org.neo4j.kernel.impl.api.index.IndexingService.start(IndexingService.java:281) ~[neo4j-kernel-5.11.0.jar:5.11.0]
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:348) ~[neo4j-common-5.11.0.jar:5.11.0]
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:92) ~[neo4j-common-5.11.0.jar:5.11.0]
at org.neo4j.kernel.database.AbstractDatabase.start(AbstractDatabase.java:162) ~[neo4j-kernel-5.11.0.jar:5.11.0]
... 11 more
Suppressed: org.neo4j.kernel.lifecycle.LifecycleException: Exception during graceful attempt to stop partially started component. Please use non suppressed exception to see original component failure.
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:355) ~[neo4j-common-5.11.0.jar:5.11.0]
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:92) ~[neo4j-common-5.11.0.jar:5.11.0]
at org.neo4j.kernel.database.AbstractDatabase.start(AbstractDatabase.java:162) ~[neo4j-kernel-5.11.0.jar:5.11.0]
at org.neo4j.dbms.database.DatabaseLifecycles.startDatabase(DatabaseLifecycles.java:123) [neo4j-5.11.0.jar:5.11.0]
at org.neo4j.dbms.database.DatabaseLifecycles.initialiseDefaultDatabase(DatabaseLifecycles.java:89) [neo4j-5.11.0.jar:5.11.0]
at org.neo4j.dbms.database.DatabaseLifecycles$DefaultDatabaseStarter.start(DatabaseLifecycles.java:185) [neo4j-5.11.0.jar:5.11.0]
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:348) [neo4j-common-5.11.0.jar:5.11.0]
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:92) [neo4j-common-5.11.0.jar:5.11.0]
at org.neo4j.graphdb.facade.DatabaseManagementServiceFactory.startDatabaseServer(DatabaseManagementServiceFactory.java:263) [neo4j-5.11.0.jar:5.11.0]
at org.neo4j.graphdb.facade.DatabaseManagementServiceFactory.build(DatabaseManagementServiceFactory.java:208) [neo4j-5.11.0.jar:5.11.0]
at org.neo4j.server.CommunityBootstrapper.createNeo(CommunityBootstrapper.java:38) [neo4j-5.11.0.jar:5.11.0]
at org.neo4j.server.NeoBootstrapper.start(NeoBootstrapper.java:187) [neo4j-5.11.0.jar:5.11.0]
at org.neo4j.server.NeoBootstrapper.start(NeoBootstrapper.java:99) [neo4j-5.11.0.jar:5.11.0]
at org.neo4j.server.CommunityEntryPoint.main(CommunityEntryPoint.java:30) [neo4j-5.11.0.jar:5.11.0]
Caused by: java.lang.NullPointerException: Cannot invoke "org.neo4j.scheduler.JobHandle.cancel()" because "this.usageReportJob" is null
at org.neo4j.kernel.impl.api.index.IndexingService.stop(IndexingService.java:444) ~[neo4j-kernel-5.11.0.jar:5.11.0]
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:353) ~[neo4j-common-5.11.0.jar:5.11.0]
... 13 more
2024-08-01 16:28:54.416+0000 INFO [o.n.b.p.c.c.n.SocketNettyConnector] Bolt enabled on 0.0.0.0:7687.
2024-08-01 16:28:54.417+0000 INFO [o.n.b.BoltServer] Bolt server started
2024-08-01 16:28:54.417+0000 INFO [o.n.s.A.ServerComponentsLifecycleAdapter] Starting web server
2024-08-01 16:28:55.363+0000 INFO [o.n.s.CommunityNeoWebServer] Remote interface available at http://localhost:7474/
2024-08-01 16:28:55.363+0000 INFO [o.n.s.A.ServerComponentsLifecycleAdapter] Web server started.
2024-08-01 16:28:55.367+0000 INFO [o.n.g.f.DatabaseManagementServiceFactory] id: 8B844E5E83F71F8E06C685C8B86755E3D9E2E14EF1B170FA76C4246768E05127
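The check described above could be sketched roughly like this in start.sh. This is only a minimal sketch: the function name startup_log_ok is hypothetical, the "Web server started" marker is taken from the sample log lines, and treating any "Caused by:" line as a failed startup is an assumption that may be too strict if the log also holds traces from earlier runs.

```shell
#!/bin/sh
# Hypothetical helper: decide from the debug log whether startup was clean.
# $1 = path to the log file (e.g. log/debug.log)
# Returns 0 = started with no "Caused by:" lines,
#         1 = startup marker not seen yet,
#         2 = startup marker seen, but Java errors are present.
startup_log_ok() {
    log_file="$1"

    # The marker text comes from the sample log lines in this thread.
    grep -q "Web server started" "$log_file" || return 1

    # Assumption: any "Caused by:" line means the lifecycle did not start cleanly.
    if grep -q "Caused by:" "$log_file"; then
        return 2
    fi
    return 0
}
```

start.sh could call startup_log_ok log/debug.log in a loop while it returns 1, abort on 2, and only switch to read-only mode on 0.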
Statement of Problem
The start.sh script called by the Dockerfile for the neo4j server appears to switch the server to read-only mode before the server has finished building indexes.
This seems to be an issue for Linux, but not Mac.
Likely solution
Add a wait loop that confirms that all indexes are complete before progressing.
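A wait loop along these lines might work as a starting point. It is a sketch under several assumptions: cypher-shell is on the PATH, NEO4J_USER and NEO4J_PASSWORD are set by the caller, the function names are hypothetical, and an index counts as complete once its state in SHOW INDEXES is ONLINE. An index in a FAILED state never becomes ONLINE, so in that case the loop simply runs out of attempts.

```shell
#!/bin/sh
# Hypothetical helper: prints how many indexes are not yet ONLINE.
# Assumes cypher-shell is on PATH and NEO4J_USER / NEO4J_PASSWORD are set.
count_pending_indexes() {
    cypher-shell -u "$NEO4J_USER" -p "$NEO4J_PASSWORD" --format plain \
        "SHOW INDEXES YIELD state WHERE state <> 'ONLINE' RETURN count(*) AS pending" \
        | tail -n 1
}

# Polls until no index is pending or the attempts run out.
# $1 = maximum number of attempts (default 60)
# $2 = seconds to sleep between attempts (default 5)
# Returns 0 when all indexes are ONLINE, 1 on timeout.
wait_for_indexes() {
    max_attempts="${1:-60}"
    interval="${2:-5}"
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # A failed query (server not ready yet) is treated as "still pending".
        pending="$(count_pending_indexes 2>/dev/null)" || pending=""
        if [ "$pending" = "0" ]; then
            return 0
        fi
        sleep "$interval"
        attempt=$((attempt + 1))
    done
    return 1
}
```

In start.sh this would run before the switch to read-only mode, for example wait_for_indexes 60 5 || exit 1, so the script stops instead of restarting the server with incomplete indexes.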