Closed: ThyLAW closed this issue 3 years ago
discussion at #4238
The UEFI apparently doesn't matter, and I don't care about Secure Boot. The connection between the agent and libvirtd is what is causing the crash. The error is similar to #4659, #4050, and #3509. Nothing seems to fix it. Restarting the agent, management server, and libvirtd gives me:
/ elper] (main:null) (logid:) Default Builder inited.
.Agent] (Agent-Handler-2:) (logid:) Reconnecting to host:localhost
NioClient] (Agent-Handler-2:) (logid:) Connecting to localhost:8250
NioConnection] (Agent-Handler-2:) (logid:) Unable to connect to remote: is there a server running on port 8250
Unsure what it could be.
Yes, there are some other tickets related to this, but as far as I can see it is not a duplicate (I might be wrong).
Check your host to see if anything is listening on that port, @ThyLAW. I understand you are trying to get a hyperconverged environment up. Check that nothing else is listening on the ports you need.
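As a rough sketch of that check, the snippet below tries a plain TCP connection to each port and reports whether something accepted it. The function name `port_is_open` and the port list are illustrative assumptions (8250 is the agent port from the log above). It only shows that some process is listening, not which one; `netstat -tulnp` is still needed to see the owning PID.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and tries each address in turn
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts
        return False


if __name__ == "__main__":
    # Illustrative port list; adjust to the ports your setup needs
    for port in (8250, 8080, 9090):
        state = "open" if port_is_open("localhost", port) else "closed or unreachable"
        print(f"localhost:{port} is {state}")
```

If the agent reports "Unable to connect to remote" while this shows 8250 as open, the listener is there and the problem is more likely in the handshake or agent setup than in the port itself.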
I get the following:
netstat -tulnp | grep (8250, 9090, 8080)
tcp6 0 0 ::8250 :::* LISTEN 220183/java
netstat -tulnp | grep (8787, 8096, 3922)
nothing
netstat -tulnp | grep 22
tcp6 0 0 ::22 :::* LISTEN 1148/sshd
Java should mean CS right?
CS means Java, but not necessarily the other way around. I think 220183 is the PID.
So I just verified that it is CloudStack running on those ports. To check whether libvirtd works at all, I created a virtual machine and ran it, and it seemed fine, meaning that libvirtd is working and I am able to use nested virtualization. This has now been narrowed down to some properties of libvirtd, some properties of the CloudStack agent, or the connection between them.
Here are the most recent error logs (journalctl -xe):
Feb 22 16:42:48 cloud.upbcist.priv polkitd[805]: Unregistered Authentication Agent for unix-process:40535:2843629 (system bus name :1.4803, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8) (disconnected fr
Feb 22 16:42:48 cloud.upbcist.priv java[40541]: log4j:WARN No appenders could be found for logger (com.cloud.agent.AgentShell).
Feb 22 16:42:48 cloud.upbcist.priv java[40541]: log4j:WARN Please initialize the log4j system properly.
Feb 22 16:42:48 cloud.upbcist.priv java[40541]: log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Feb 22 16:42:48 cloud.upbcist.priv java[40541]: INFO [cloud.agent.AgentShell] (main:) (logid:) Agent started
Feb 22 16:42:48 cloud.upbcist.priv java[40541]: INFO [cloud.agent.AgentShell] (main:) (logid:) Implementation Version is 4.15.0.0
Feb 22 16:42:48 cloud.upbcist.priv java[40541]: INFO [cloud.agent.AgentShell] (main:) (logid:) agent.properties found at /etc/cloudstack/agent/agent.properties
Feb 22 16:42:48 cloud.upbcist.priv java[40541]: INFO [cloud.agent.AgentShell] (main:) (logid:) Defaulting to using properties file for storage
Feb 22 16:42:48 cloud.upbcist.priv java[40541]: INFO [cloud.agent.AgentShell] (main:) (logid:) Defaulting to the constant time backoff algorithm
Feb 22 16:42:48 cloud.upbcist.priv java[40541]: INFO [cloud.utils.LogUtils] (main:) (logid:) log4j configuration found at /etc/cloudstack/agent/log4j-cloud.xml
Feb 22 16:42:48 cloud.upbcist.priv java[40541]: INFO [cloud.agent.AgentShell] (main:) (logid:) Using default Java settings for IPv6 preference for agent connection
Feb 22 16:42:48 cloud.upbcist.priv java[40541]: INFO [cloud.agent.Agent] (main:) (logid:) id is 0
Feb 22 16:42:48 cloud.upbcist.priv java[40541]: ERROR [kvm.resource.LibvirtComputingResource] (main:) (logid:) uefi properties file not found due to: Unable to find file uefi.properties.
Feb 22 16:42:48 cloud.upbcist.priv java[40541]: INFO [kvm.resource.LibvirtConnection] (main:) (logid:) No existing libvirtd connection found. Opening a new one
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: INFO [kvm.resource.LibvirtComputingResource] (main:) (logid:) No libvirt.vif.driver specified. Defaults to BridgeVifDriver.
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: INFO [kvm.resource.LibvirtComputingResource] (main:) (logid:) iscsi session clean up is disabled
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: INFO [cloud.agent.Agent] (main:) (logid:) Agent [id = 0 : type = LibvirtComputingResource : zone = default : pod = default : workers = 5 : host = localhost : port = 8250
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: INFO [utils.nio.NioClient] (main:) (logid:) Connecting to localhost:8250
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: INFO [utils.nio.Link] (main:) (logid:) Conf file found: /etc/cloudstack/agent/agent.properties
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: WARN [utils.nio.Link] (main:) (logid:) Failed to load keystore, using trust all manager
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: INFO [utils.nio.NioClient] (main:) (logid:) SSL: Handshake done
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: INFO [utils.nio.NioClient] (main:) (logid:) Connected to localhost:8250
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: INFO [utils.linux.KVMHostInfo] (Agent-Handler-1:) (logid:) Could not read cpuinfo_max_freq, falling back on libvirt
Feb 22 16:42:49 cloud.upbcist.priv sudo[40641]: root : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/grep InitiatorName= /etc/iscsi/initiatorname.iscsi
Feb 22 16:42:49 cloud.upbcist.priv systemd[1]: Created slice User Slice of root.
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: INFO [kvm.storage.LibvirtStorageAdaptor] (Agent-Handler-1:) (logid:) Attempting to create storage pool 4462e20b-ace8-417b-9125-8be5cf4cff4b (Filesystem) in libvirt
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: libvirt: Domain Config error : invalid connection pointer in virConnectGetVersion
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: ERROR [kvm.resource.LibvirtConnection] (Agent-Handler-1:) (logid:) Connection with libvirtd is broken: invalid connection pointer in virConnectGetVersion
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: INFO [kvm.storage.LibvirtStorageAdaptor] (Agent-Handler-1:) (logid:) Found existing defined storage pool 4462e20b-ace8-417b-9125-8be5cf4cff4b, using it.
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: INFO [kvm.storage.LibvirtStorageAdaptor] (Agent-Handler-1:) (logid:) Trying to fetch storage pool 4462e20b-ace8-417b-9125-8be5cf4cff4b from libvirt
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: INFO [cloud.serializer.GsonHelper] (Agent-Handler-1:) (logid:) Default Builder inited.
Feb 22 16:42:49 cloud.upbcist.priv java[36735]: WARN [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-27:ctx-96bb3605) (logid:9e19cd8f) Unable to create attache for agent: Seq 0-0: { Cmd , MgmtId: -1, via: 0, Ver: v1, Flags: 1, [{"com.
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: INFO [cloud.agent.Agent] (Agent-Handler-2:) (logid:) Proccess agent startup answer, agent id = 0
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: INFO [cloud.agent.Agent] (Agent-Handler-2:) (logid:) Set agent id 0
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: INFO [cloud.agent.Agent] (Agent-Handler-2:) (logid:) Startup Response Received: agent id = 0
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: INFO [cloud.agent.Agent] (AgentShutdownThread:) (logid:) Stopping the agent: Reason = sig.kill
Feb 22 16:42:49 cloud.upbcist.priv java[36735]: WARN [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-13:null) (logid:) Throwing away a request because it came through as the first command on a connect: Seq 0--1: { Cmd , MgmtId: -1, vi
Feb 22 16:42:49 cloud.upbcist.priv java[36735]: WARN [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-14:null) (logid:) Throwing away a request because it came through as the first command on a connect: Seq 0-1: { Cmd , MgmtId: -1, via
Feb 22 16:42:50 cloud.upbcist.priv libvirtd[36629]: 2021-02-22 21:42:50.926+0000: 36629: error : virNetSocketReadWire:1806 : End of file while reading data: Input/output error
Feb 22 16:42:50 cloud.upbcist.priv systemd[1]: cloudstack-agent.service: main process exited, code=exited, status=1/FAILURE
Feb 22 16:42:50 cloud.upbcist.priv java[36735]: INFO [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-15:null) (logid:) Connection from /127.0.0.1 closed but no cleanup was done.
Feb 22 16:42:50 cloud.upbcist.priv systemd[1]: Unit cloudstack-agent.service entered failed state.
Feb 22 16:42:50 cloud.upbcist.priv systemd[1]: cloudstack-agent.service failed.
Please let me know if you can figure anything out.
This one is a known issue but shouldn't hamper you; I've seen it in working environments.
Feb 22 16:42:48 cloud.upbcist.priv java[40541]: ERROR [kvm.resource.LibvirtComputingResource] (main:) (logid:) uefi properties file not found due to: Unable to find file uefi.properties.
This one is not good, but I'm not sure whether it is related:
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: libvirt: Domain Config error : invalid connection pointer in virConnectGetVersion
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: ERROR [kvm.resource.LibvirtConnection] (Agent-Handler-1:) (logid:) Connection with libvirtd is broken: invalid connection pointer in virConnectGetVersion
And later, it looks like the agent tries to identify itself as a known agent with id 0. I think the initialisation of the agent went wrong:
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: INFO [cloud.agent.Agent] (Agent-Handler-2:) (logid:) Startup Response Received: agent id = 0
Feb 22 16:42:49 cloud.upbcist.priv java[40541]: INFO [cloud.agent.Agent] (AgentShutdownThread:) (logid:) Stopping the agent: Reason = sig.kill
Here it seems the unknown agent tried to do business as usual:
Feb 22 16:42:49 cloud.upbcist.priv java[36735]: WARN [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-13:null) (logid:) Throwing away a request because it came through as the first command on a connect: Seq 0--1: { Cmd , MgmtId: -1, vi
Feb 22 16:42:49 cloud.upbcist.priv java[36735]: WARN [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-14:null) (logid:) Throwing away a request because it came through as the first command on a connect: Seq 0-1: { Cmd , MgmtId: -1, via
Feb 22 16:42:50 cloud.upbcist.priv libvirtd[36629]: 2021-02-22 21:42:50.926+0000: 36629: error : virNetSocketReadWire:1806 : End of file while reading data: Input/output error
I think your best way forward is to clean the agent machine and re-add it to CloudStack. Check the host table in the DB as well, to make sure there is no broken entry for the agent.
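To make this kind of triage quicker on a long journal dump, here is a minimal sketch that pulls out only the WARN and ERROR entries and keeps the PID, so agent lines (java[40541] above) can be told apart from management-server lines (java[36735]). The function name `problems` and both regular expressions are my own assumptions about the line layout seen in this thread (one journal entry per line); they are not part of CloudStack.

```python
import re

# Assumed layout: "Feb 22 16:42:49 host proc[pid]: message"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<proc>[^\[\s]+)\[(?P<pid>\d+)\]:\s(?P<msg>.*)$"
)
# Assumed message prefix: "LEVEL [logger.name] ..."
LEVEL_RE = re.compile(r"^(?P<level>DEBUG|INFO|WARN|ERROR)\s+\[(?P<logger>[^\]]+)\]")


def problems(lines, levels=("WARN", "ERROR")):
    """Return (timestamp, pid, level, logger) for entries at the given levels."""
    found = []
    for line in lines:
        entry = LINE_RE.match(line.strip())
        if entry is None:
            continue  # not a journal entry in the assumed layout
        level = LEVEL_RE.match(entry["msg"])
        if level is not None and level["level"] in levels:
            found.append((entry["ts"], int(entry["pid"]), level["level"], level["logger"]))
    return found


if __name__ == "__main__":
    sample = [
        "Feb 22 16:42:48 cloud.upbcist.priv java[40541]: INFO [cloud.agent.Agent] (main:) (logid:) id is 0",
        "Feb 22 16:42:49 cloud.upbcist.priv java[40541]: ERROR [kvm.resource.LibvirtConnection] (Agent-Handler-1:) (logid:) Connection with libvirtd is broken: invalid connection pointer in virConnectGetVersion",
    ]
    for hit in problems(sample):
        print(hit)
```

Entries without a level prefix, such as the raw "libvirt: Domain Config error" line, are skipped by this sketch, so it complements reading the log rather than replacing it.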
I rolled the machine back to its pre-configuration state. I can now reach port 8250 via "nc -v 192.168.236.2 8250", which was refused on the other version. I went in and generated UUIDs, changed NIC names, etc., and set it up how it should be. It then gave me the SSL handshake error referenced in #4659. I disabled the ca.auth setting in global settings and rebooted management, which eliminated that error. However, it is now crashing once again with the same errors. I am going to try installing this on Ubuntu. I just have no idea what the issue could be.
Thanks for the update, @ThyLAW. I hope you fare well on this. Not many of us are using Hyper-V, and it would be great if you could share some wisdom on that.
(First off sorry for posting on these forums w/ this issue, I didn't know the mail client existed) Update in case someone eventually sees this:
I got it working; however, I completely scrapped Hyper-V. I believed the workaround for nested virtualization had worked, but I don't think the NAT network was set up correctly (as shown in my documentation), as I could ping the management server from an external network. I would also recommend NOT following the Quick Installation Guide. It is needed for installing SQL, but that is it. It completely leaves out the VLAN and the cloudbr1 bridge that are needed, which results in the "can not find nic name" error. Go through the full documentation.
I have since reinstalled on VMware ESXi, using the NAT network provided there. It has run relatively smoothly and I now have it fully configured.
I am happy to have finally succeeded. I am currently writing more in-depth, start-to-finish documentation, as I have found no such resource online, and I only managed to complete this through hundreds of web searches and scouring the documentation and forums. I hope to publish that documentation online so that beginner users such as myself can get involved with CloudStack.
Thank you, I am closing this issue.
Thanks @ThyLAW, any changes to the Quick Installation Guide to propose?
@ThyLAW As I am in much the same situation, I would like to know if you have had the chance to publish your results for solving some of the issues you mentioned?
Hello @Hudratronium, I did not fix my issues with Hyper-V; I could not get it to work. I instead did a complete reinstall using VMware and succeeded. I had to reach out to my employer for permission to share my documentation online. For any use beyond just reading it, please contact me. Please also contact me if there is anything you think should be added, fixed, or removed.
Please note that it is not entirely finished, as we are still working with CloudStack. Also note that this is for a NAT network, on only one computer. For having separate agents there will be some changes that need to be applied to NFS. We will be doing a production setup like that this month.
As for the issues, I did not fix them. It was just broken. I am not sure why; it was probably an issue with my NAT settings, which may have been solved by moving to VMware.
Hope it helps,
[Apache Cloudstack Installation Documentation.docx](https://github.com/ThyLAW/ApacheCloudstackDocumentation/tree/main)
@ThyLAW Thank you very much. I will take a closer look at it!
No problem! Just FYI, I'm going to host this on my GitHub now, so the link will change to that.
ISSUE TYPE
COMPONENT NAME
CLOUDSTACK VERSION
CONFIGURATION
NAT network from the host gives IP 192.168.236.2, gateway .1, netmask 255.255.255.0. cloudbr0 is given the IP; eth0 has no statically assigned IP.
OS / ENVIRONMENT
Hyper-V Gen 2 VM, CentOS 7.9, KVM. Agent, management server, and storage all on the same VM. Latest version of agent/CloudStack/etc.
SUMMARY
The CloudStack agent was working until I attempted the basic configuration of CloudStack. Everything was up and running fine before that. The agent then failed with SSL errors; I disabled auth in global settings, which fixed those, but the agent is still down. Now it complains that it cannot find uefi.properties and cannot connect to libvirtd, which is working. I went to the directory where uefi.properties should exist, but the file is not there.
I also found that it automatically assigned my public and private NICs wrong, and I had to change those to cloudbr0. I also had to generate UUIDs for the agent.properties file, which wasn't required in the quick installation, but they don't exist otherwise. I installed this twice, with similar issues on both VMs.
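For anyone hitting the same missing-UUID problem, a minimal sketch of filling them in could look like the following. The key names `guid` and `local.storage.uuid` are an assumption on my part about which entries need a value; check your own /etc/cloudstack/agent/agent.properties before relying on them. The helper `fill_missing_uuids` is hypothetical and only transforms text; it does not touch the file itself.

```python
import uuid


def fill_missing_uuids(properties_text, keys=("guid", "local.storage.uuid")):
    """Return properties text where each key in `keys` has a UUID value.

    Existing non-empty values are kept; empty or missing keys get a fresh
    random UUID. The default key names are assumptions, not confirmed
    against any particular CloudStack version.
    """
    lines = properties_text.splitlines()
    seen = set()
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#") or "=" not in stripped:
            continue  # skip comments and lines that are not key=value
        key, _, value = stripped.partition("=")
        key = key.strip()
        if key in keys:
            seen.add(key)
            if not value.strip():
                lines[i] = f"{key}={uuid.uuid4()}"
    for key in keys:
        if key not in seen:
            lines.append(f"{key}={uuid.uuid4()}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = "host=localhost\nport=8250\nguid=\n"
    print(fill_missing_uuids(example), end="")
```

Back up the real file before writing any result over it, and restart the agent afterwards so it picks up the new values.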
STEPS TO REPRODUCE
This is the exact way I set this up, up until I disabled SSL in global config: New Apache Documentatino.docx
EXPECTED RESULTS
ACTUAL RESULTS