kubernetes-statefulset-agent-using-pvc: ctmping fails from the server

anshul07 commented 3 years ago

Hi,

I am following the tutorial kubernetes-statefulset-agent-using-pvc to run Control-M agents on Kubernetes and have followed all the steps mentioned in the README.

I can successfully run the command kubectl exec -it statefulset-agent-0 -- tcsh -c ag_ping, but when I try ctmping -HOSTID statefulset-agent-0 on the server, it fails with the error "unknown host", as the statefulset-agent- doesn't exist. Is there some sort of configuration (networking) that is required?

As we are using ConnectionInitiation=AgentToServer, does it mean that every time the server and agent need to communicate, the agent will open the connection, and hence the server doesn't need to know the location of the agent as long as the agent is able to reach the server?

anshul07 commented 3 years ago

@codytrey could you please help with this?

codytrey commented 3 years ago

Hi @anshul07

Sorry I couldn't reply sooner (I've been out sick). You are correct that in this mode the server does not need to know the name of the Control-M/Agent. However, a new connection is not necessarily opened each time the agent needs to send data to the server, nor does the server have to wait until the agent reopens the connection, because the agent should keep the connection active (i.e. persistent mode).

It is expected that ctmping -HOSTID statefulset-agent-0 will fail, as ctmping will always try to resolve the HOSTID. For this configuration, this is okay and can be ignored since the ag_ping is successful.
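
To see this for yourself, here is a minimal sketch that checks whether the agent's name resolves on the machine where you run it (for example, the Control-M/Server host). It assumes a Linux host with getent available; can_resolve and the AGENT_HOST variable are hypothetical names used for illustration and are not part of the tutorial.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given host name resolves on this machine.
# Assumes getent is available (typical on Linux with glibc).
can_resolve() {
    getent hosts "$1" >/dev/null 2>&1
}

# Pod name used in this thread; override AGENT_HOST for other agents.
AGENT_HOST="${AGENT_HOST:-statefulset-agent-0}"

if can_resolve "$AGENT_HOST"; then
    echo "$AGENT_HOST resolves on this machine"
else
    echo "$AGENT_HOST does not resolve here, so ctmping -HOSTID $AGENT_HOST is expected to fail"
fi
```

If the name does not resolve on the server but kubectl exec -it statefulset-agent-0 -- tcsh -c ag_ping succeeds from the pod, that matches the situation described above.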

Please let me know if that helps; if so, please close the issue. If you have additional questions, depending on what they are, I may need to ask you to open a case with us on the BMC Support Central page so that we can discuss in more detail.

Thanks! Cody

anshul07 commented 3 years ago

Thanks, Cody, for your reply. It seems it is trying to validate communication from server to agent as well, as the startup script is failing for us. I tried telnet to port 7005 of our server and it worked fine. I am using Automation API 9.20.100. Is it possible that the issue lies there?

Also, in the server logs there are many entries for "unable to resolve host".

Also, I'm curious: if the agent's port 7006 is not being used, why do we need to define it in stateful.yaml?

info:    Making SSL trust all certificates and all hostnames
info:    OnPrem provision delivery mode
info:    setting server to agent port: 7006
info:    setting agent to server port: 7005
info:    setting agent name (alias): statefulset-agent-0
info:    setting primary Control-M Server: 10.228.158.22
info:    setting authorized Control-M Server host
info:    setting agent communication type to persistent
info:    agent configuration ended. restarting agent
info:    adding newly active agent to Control-M Server
info:    agent - server connection can't be validated from both sides (connection timeout)
error:   Error setting up the image
error:   agent - server connection can't be validated from both sides (connection timeout) (7189)
debug:   setup failed: exit code: 21 for '"/home/controlm/bmcjava/bmcjava-V2/bin/java" -jar /home/controlm/.ctm/control-m.services.provision-9.20.100.jar -image "" -agent_tag "" -server https://10.228.158.10:8446/automation-api -action setup -environment prod -ctms "CTM_DEV_22_2" -name "statefulset-agent-0" -port "7006" -cert 0 -file "agent_configuration.json"'
Running in agent container

Suryadevaraj commented 3 years ago

Hi @codytrey, facing a similar issue as @anshul07. Any update on this?

info: Making SSL trust all certificates and all hostnames
info: Annotation fields specified: {subject: 'test', description: 'test'}
info: OnPrem provision delivery mode
info: setting server to agent port: 7006
info: setting agent to server port: 7005
info: setting agent name (alias): statefulset-agent-0
info: setting primary Control-M Server: **
info: setting authorized Control-M Server host
info: setting agent communication type to persistent
info: agent configuration ended. restarting agent
info: adding newly active agent to Control-M Server
info: agent - server connection can't be validated from both sides (connection timeout)
error: Error setting up the image
error: agent - server connection can't be validated from both sides (connection timeout) (7189)

admbm96 commented 1 year ago

We have worked around this issue by having the agent set up to use the FQDN and then adding the actual Control-M Server name into CTMPERMHOSTS using ctmcfg in our container_agent_startup.sh. We provision the agent in the Dockerfile, and then when the pod is started, container_agent_startup.sh runs to add to CTM/

echo 'Updating CTMPERMHOSTS file'
ctmcfg -table CONFIG -action update -parameter CTMPERMHOSTS -value "$CTM_SERVER|$CTM_SERVER-HA|$CTM_SERVER.dtl.int|$CTM_SERVER-HA.dtl.int"
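
For anyone adapting this, here is a rough sketch of how the host list could be built from a server name and domain before calling ctmcfg. The build_permhosts helper, the CTM_DOMAIN variable, and the default values are assumptions for illustration only; the ctmcfg call is the one from the comment above and is skipped when ctmcfg is not on the PATH (i.e. outside the agent container).

```shell
#!/bin/sh
# Hypothetical helper: build the pipe-separated host list for CTMPERMHOSTS.
# $1 = short Control-M/Server name, $2 = DNS domain (both are assumptions).
build_permhosts() {
    server="$1"
    domain="$2"
    # Short name, HA name, and the fully qualified form of each.
    printf '%s|%s-HA|%s.%s|%s-HA.%s\n' \
        "$server" "$server" "$server" "$domain" "$server" "$domain"
}

# Placeholder defaults; replace with your own server name and domain.
CTM_SERVER="${CTM_SERVER:-ctmserver}"
CTM_DOMAIN="${CTM_DOMAIN:-dtl.int}"

PERMHOSTS="$(build_permhosts "$CTM_SERVER" "$CTM_DOMAIN")"
echo "CTMPERMHOSTS value: $PERMHOSTS"

# Only update the agent configuration when ctmcfg is actually available.
if command -v ctmcfg >/dev/null 2>&1; then
    echo 'Updating CTMPERMHOSTS file'
    ctmcfg -table CONFIG -action update -parameter CTMPERMHOSTS -value "$PERMHOSTS"
else
    echo 'ctmcfg not found; skipping update'
fi
```

Keeping the list construction in one function makes it easier to review which host names the agent will accept before the value is written.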
