DanielALS closed this issue 5 years ago.
Can you make sure there are no port conflicts in the appliances.xml? If possible, please attach a copy of your appliances.xml.
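A quick way to look for clashes inside the file itself is sketched below. It is only a sketch: the element names (`appliance`, `identity`, `cluster_inetport`, `mgmt_url`, `engine_url`, `etl_url`, `retrieval_url`) are assumed from the commonly documented appliances.xml layout, and `find_endpoint_conflicts` is a hypothetical helper, not part of the archiver. It only reports host:port pairs claimed by more than one appliance entry; it cannot see ports held by unrelated processes.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.parse import urlsplit

# Assumed element names; adjust them if your appliances.xml differs.
URL_TAGS = ("mgmt_url", "engine_url", "etl_url", "retrieval_url")


def find_endpoint_conflicts(xml_text):
    """Return {(host, port): [identities]} for endpoints used by 2+ appliances."""
    root = ET.fromstring(xml_text)
    claimed = defaultdict(set)  # (host, port) -> set of appliance identities

    for appliance in root.iter("appliance"):
        identity = (appliance.findtext("identity") or "?").strip()

        # cluster_inetport is expected to look like "host:port".
        inetport = (appliance.findtext("cluster_inetport") or "").strip()
        if inetport:
            host, _, port = inetport.rpartition(":")
            claimed[(host.lower(), int(port))].add(identity)

        # Web application URLs; entries without an explicit port are skipped.
        for tag in URL_TAGS:
            url = (appliance.findtext(tag) or "").strip()
            if url:
                parts = urlsplit(url)
                if parts.port is not None:
                    claimed[(parts.hostname, parts.port)].add(identity)

    # Only endpoints shared between different appliance entries are conflicts.
    return {ep: sorted(ids) for ep, ids in claimed.items() if len(ids) > 1}
```

Running it on the file contents, for example `find_endpoint_conflicts(open("appliances.xml").read())`, returns an empty dict when no two entries share an endpoint, so nothing with real addresses needs to be posted.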
Thanks, I'm going to have our network people check for conflicts. I don't feel comfortable posting the IP addresses here. I did diff the potentially problematic XML against the production copy, and they are identical (which is good).
Assuming I can pin down the final error, would it be possible to come up with a more detailed exception message? I'll try my hand at a PR, and I won't feel bad if you reject it.
Cheers
No worries. Make sure the ports do not conflict with anything else. There is not much information as to why Hz (Hazelcast) did not start, but most of the time this has to do with port conflicts and the like.
I can't find any port conflicts. The host I'm running my appliance on also runs Phoebus. Do clustered Appliances need to use the exact same snapshot? That might be an issue.
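One way to test for a conflict with another process on the same host is to try binding the port while the appliance is stopped. The snippet below is a minimal sketch using only the Python standard library; `can_bind` is a hypothetical helper name. It says nothing about firewalls or routing between cluster members.

```python
import socket


def can_bind(host, port):
    """Return True if a TCP listener can currently be opened on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            # Binding fails with OSError if another process already holds the port.
            sock.bind((host, port))
        except OSError:
            return False
    return True
```

For example, `can_bind("0.0.0.0", 16670)` checks the cluster port base that appears in the log from the original report. Run it on the appliance host with the appliance shut down, otherwise the appliance itself will be holding the port.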
"Do clustered Appliances need to use the exact same snapshot?" I would say yes. I tend to upgrade the underlying clustering jars a little more frequently than the rest, and the internal protocols (specific to clustering) do change between versions of the jar, so I would lean towards using the same version across the cluster.
I installed the exact same snapshot and Tomcat versions. Clustering now works.
I didn't test matching the AA snapshot and matching the Tomcat version separately, though. I suppose it's somewhat obvious as a best practice, but if you forget to check, it could be a gotcha.
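To make that check harder to forget, the jars deployed on two appliances can be compared by checksum. This is a rough sketch under stated assumptions: `jar_digests` and `diff_jars` are hypothetical helpers, and the directory you point them at (for instance a webapp's WEB-INF/lib) depends on your install.

```python
import hashlib
from pathlib import Path


def jar_digests(lib_dir):
    """Map each *.jar file name in lib_dir to the SHA-256 of its contents."""
    return {
        path.name: hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(Path(lib_dir).glob("*.jar"))
    }


def diff_jars(digests_a, digests_b):
    """Return (only_in_a, only_in_b, same_name_different_content)."""
    names_a, names_b = set(digests_a), set(digests_b)
    only_a = sorted(names_a - names_b)
    only_b = sorted(names_b - names_a)
    changed = sorted(
        name for name in names_a & names_b if digests_a[name] != digests_b[name]
    )
    return only_a, only_b, changed
```

Since cluster members usually live on different hosts, one option is to run `jar_digests` on each host, save the result (for example as JSON), and feed both results to `diff_jars` on one machine. Three empty lists mean the jar sets match.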
I'm trying to add another AA instance, on a new host, to an existing cluster of AAs. The original two appliances work fine and talk to each other.
I performed the new install using the "single-machine" install script, with a modified appliances.xml which contains all other instances.
Any help troubleshooting the addition of a new cluster instance is appreciated. There might also be an opportunity to include more details in the exception message.
Cheers
The following exception shows up in my arch.log file a few minutes after starting the appliance via the start script.

10055 [Startup executor] INFO config.org.epics.archiverappliance.config.DefaultConfigService - Post startup for MGMT
10407 [Startup executor] INFO config.org.epics.archiverappliance.config.DefaultConfigService - Setting my cluster port base to 16670 and using interface X.X.196.50 # redacted the real IP
312753 [Startup executor] ERROR org.epics.archiverappliance.mgmt.MgmtPostStartup - Exception running post startup on the management app
org.epics.archiverappliance.config.exception.ConfigException: Exception adding member to cluster
    at org.epics.archiverappliance.config.DefaultConfigService.postStartup(DefaultConfigService.java:530)
    at org.epics.archiverappliance.mgmt.MgmtPostStartup.run(MgmtPostStartup.java:44)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Node failed to start!
    at com.hazelcast.instance.HazelcastInstanceImpl.<init>(HazelcastInstanceImpl.java:140)
    at com.hazelcast.instance.HazelcastInstanceFactory.constructHazelcastInstance(HazelcastInstanceFactory.java:196)
    at com.hazelcast.instance.HazelcastInstanceFactory.newHazelcastInstance(HazelcastInstanceFactory.java:175)
    at com.hazelcast.instance.HazelcastInstanceFactory.newHazelcastInstance(HazelcastInstanceFactory.java:125)
    at com.hazelcast.core.Hazelcast.newHazelcastInstance(Hazelcast.java:57)
    at org.epics.archiverappliance.config.DefaultConfigService.postStartup(DefaultConfigService.java:528)
    ... 8 more
312756 [Startup executor] INFO config.org.epics.archiverappliance.config.DefaultConfigService - Webapp is not in correct state for postStartup MGMT. It is in POST_STARTUP_RUNNING
312756 [Startup executor] INFO config.org.epics.archiverappliance.mgmt.MgmtPostStartup - Finished post startup for the mgmt webapp
312756 [Startup executor] INFO config.org.epics.archiverappliance.config.DefaultConfigService - Webapp is not in correct state for postStartup MGMT. It is in POST_STARTUP_RUNNING
312756 [Startup executor] INFO config.org.epics.archiverappliance.mgmt.MgmtPostStartup - Finished post startup for the mgmt webapp