Closed by rvalle 4 years ago
@rvalle,
Is the address the same one that you configured in the global setting 'host'?
And does it resolve to the same address on the hypervisor and on the management server? Are any other options added to the command?
Yes! I can see the setting now. Thanks!
Hi @DaanHoogland
After properly configuring host and management.network.cidr, I can't get the CloudStack management server to start.
I am getting the following exception when I restart after the changes:
2020-03-09 10:50:03,280 INFO [c.c.c.ClusterManagerImpl] (main:null) (logid:) Start configuring cluster manager : ClusterManagerImpl
2020-03-09 10:50:03,280 INFO [c.c.c.ClusterManagerImpl] (main:null) (logid:) Cluster node IP : 10.71.0.254
2020-03-09 10:50:03,297 INFO [c.c.c.ClusterManagerImpl] (main:null) (logid:) Trying to connect to 10.71.0.254
2020-03-09 10:52:10,555 ERROR [c.c.c.ClusterManagerImpl] (main:null) (logid:) Unable to ping management server at 10.71.0.254:9090 due to ConnectException
java.net.ConnectException: Connection timed out
at sun.nio.ch.Net.connect0(Native Method)
at sun.nio.ch.Net.connect(Net.java:454)
at sun.nio.ch.Net.connect(Net.java:446)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:645)
at com.cloud.cluster.ClusterManagerImpl.pingManagementNode(ClusterManagerImpl.java:1140)
at com.cloud.cluster.ClusterManagerImpl.pingManagementNode(ClusterManagerImpl.java:1109)
at com.cloud.cluster.ClusterManagerImpl.checkConflicts(ClusterManagerImpl.java:1187)
....
at org.apache.cloudstack.ServerDaemon.start(ServerDaemon.java:186)
at org.apache.cloudstack.ServerDaemon.main(ServerDaemon.java:103)
2020-03-09 10:52:10,565 INFO [c.c.c.ClusterManagerImpl] (main:null) (logid:) Detected that another management node with the same IP 10.71.0.254 is considered as running in DB, however it is not pingable, we will continue cluster initialization with this management server node
2020-03-09 10:52:10,565 INFO [c.c.c.ClusterManagerImpl] (main:null) (logid:) Cluster manager is configured.
There seems to be some kind of confusion. I think the manager is trying to connect to itself before it has finished the startup process.
But then it mentions "another manager".
Any idea what could be going wrong?
Perhaps I should specify the actual manager IP during the install process.
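The failing step in the log is a plain TCP connect to 10.71.0.254:9090 that times out. A minimal sketch for reproducing that check by hand from the management server (or from a hypervisor) could look like this. The function name `can_connect` and the 3-second timeout are illustrative choices, not anything CloudStack defines; the address and port are the ones from the log.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (including timeouts and refusals) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call, using the address and port from the log above:
#   can_connect("10.71.0.254", 9090)
```

A `False` result here while the management server is still starting would be expected if nothing is listening on 9090 yet; a timeout rather than a refusal may instead point at routing or firewalling on that interface.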
Please check this:
2020-03-09 10:52:10,565 INFO [c.c.c.ClusterManagerImpl] (main:null) (logid:) Detected that another management node with the same IP 10.71.0.254 is considered as running in DB, however it is not pingable, we will continue cluster initialization with this management server node
It might be a remnant from a prior run, or the old server might actually still be running.
@DaanHoogland Yes, I saw it. I actually think that the admin web UI showed two entries for the management server even before I changed the IP.
Note that I am writing an Ansible playbook to install a CloudStack cluster, so I re-create the whole thing from scratch again and again.
I don't know what decides how many management servers there are, or which one is "me", during the setup process, but it seems to get confused by my network setup, as I have several network adapters.
I am assuming that the mshost table holds the management servers, and I can see only one entry there:
mysql> select id,msid,service_ip,service_port,state from mshost;
+----+-----------------+-------------+--------------+-------+
| id | msid            | service_ip  | service_port | state |
+----+-----------------+-------------+--------------+-------+
|  1 | 209984346422944 | 10.71.0.254 |         9090 | Up    |
+----+-----------------+-------------+--------------+-------+
1 row in set (0.00 sec)
For some reason the management server thinks that this entry is not "me".
Perhaps, after modifying the host IP in the global config, the management server does not shut down properly when the service is restarted. The state should definitely not be Up.
Also, is 9090 the right port? I access the management server on the default port, 8080.
Another question is whether it is possible to launch the setup process in a way that makes it choose the right IP for the management server. I cannot see how that IP is selected.
After reading the installation guide I would have thought that this:
[root@manager ~]# ping $(hostname --fqdn)
PING manager.mgmt_net (10.71.0.254) 56(84) bytes of data.
64 bytes from manager.mgmt_net (10.71.0.254): icmp_seq=1 ttl=64 time=0.046 ms
64 bytes from manager.mgmt_net (10.71.0.254): icmp_seq=2 ttl=64 time=0.112 ms
would be enough for the setup to select the right IP for the manager, but perhaps it is not.
Any ideas?
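The `ping $(hostname --fqdn)` check above shows what one resolver lookup returns. With several network adapters it can also help to list every IPv4 address the name resolves to. The sketch below does only that; it does not claim to reproduce how the management server picks its own IP, which this thread does not establish. `resolve_ipv4` is a hypothetical helper name.

```python
import socket


def resolve_ipv4(hostname: str) -> list:
    """Return the distinct IPv4 addresses `hostname` resolves to, in resolver order."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses = []
    for info in infos:
        ip = info[4][0]  # sockaddr is (address, port) for AF_INET
        if ip not in addresses:
            addresses.append(ip)
    return addresses


# Example call on the management server:
#   resolve_ipv4(socket.getfqdn())
```

If this returns more than one address, or an address on a different network than 10.71.0.254, the name resolution is ambiguous even though the ping looks fine.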
9090 sounds right
8080 is the web interface, not the service. There are several ports in use and I always (try to) forget which one is for what.
Can you try to update the state to Down?
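For reference, the suggested change amounts to a single UPDATE on the mshost table shown earlier. The sketch below runs that statement against an in-memory SQLite table that only mimics the five columns from the query output, so it can be tried without touching a real installation. SQLite as a stand-in and the `mark_down` helper are assumptions for illustration; on a real deployment the statement would be run in MySQL, ideally with the management service stopped and a backup taken first.

```python
import sqlite3

# Stand-in for the real MySQL table: only the columns seen in the query output.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mshost ("
    "id INTEGER PRIMARY KEY, msid INTEGER, service_ip TEXT, "
    "service_port INTEGER, state TEXT)"
)
conn.execute(
    "INSERT INTO mshost VALUES (1, 209984346422944, '10.71.0.254', 9090, 'Up')"
)


def mark_down(connection, service_ip):
    """Set state to 'Down' for rows with this service_ip still marked 'Up'.

    Returns the number of rows changed.
    """
    cursor = connection.execute(
        "UPDATE mshost SET state = 'Down' WHERE service_ip = ? AND state = 'Up'",
        (service_ip,),
    )
    connection.commit()
    return cursor.rowcount


changed = mark_down(conn, "10.71.0.254")
rows = conn.execute("SELECT id, service_ip, state FROM mshost").fetchall()
```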
I am testing a bit more. I am seeing a lot of instability and am not sure why yet.
I believe issue #3954 is getting in the way of my testing. Normally I reboot and reapply the roles to create the cluster again (an idempotency test) before concluding the setup of the cluster. I am going to disable that step to check this properly.
@DaanHoogland yes, confirmed. It was #3954 that got my manager broken before attempting to change the host and management.network.cidr global values.
I re-tested without performing any reboot as part of the cluster installation process and changed these parameters with no problem. The management server starts with the new parameters, and there is only one. All seems OK to me.
Hello. How can I change the management IP? I changed it in db.properties, but the SSVM was created with the old IP.
My management server is connected to several networks. When adding a host, I can see in the logs that the wrong IP address is being used in the agent setup command.
How does the management server determine its own IP?
I have also configured the global management network CIDR with management.network.cidr, yet the IP being passed to the host is still the wrong one.
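When a server sits on several networks, a quick way to see which of its addresses fall inside the configured management CIDR is a membership test with Python's `ipaddress` module. This is only an operator-side sanity check under assumed values: the thread never states the CIDR, so `10.71.0.0/24` and the second address below are placeholders, and `addresses_in_cidr` is a hypothetical helper.

```python
import ipaddress


def addresses_in_cidr(addresses, cidr):
    """Return the addresses from `addresses` that belong to the network `cidr`."""
    # strict=False accepts a CIDR whose host bits are set, e.g. "10.71.0.254/24".
    network = ipaddress.ip_network(cidr, strict=False)
    return [a for a in addresses if ipaddress.ip_address(a) in network]


# Placeholder values: one address on another network, one on the management network.
candidates = ["192.168.1.10", "10.71.0.254"]
matching = addresses_in_cidr(candidates, "10.71.0.0/24")
```

If the address that appears in the agent setup command is not in `matching`, the value of `host` and the CIDR setting disagree with each other, which is worth ruling out before looking further.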