Open rjams opened 5 years ago
"not running" is a valid response from the monitoring action of the agent.
It means:
The agent successfully retrieved the server from the API
The agent successfully retrieved the floating IP from the API
The server id in the floating IP did not match the server's id.
So the API is telling the agent that the IP address is not assigned to the server it expects.
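As a rough illustration of that check, here is a minimal Python sketch. It is not the agent's actual code: the function name `monitor` and the dict shapes are assumptions made for illustration, with the floating IP carrying a `server` field that holds the id of the server it is assigned to (or nothing if unassigned). The return values are the standard OCF exit codes.

```python
from typing import Optional

# Standard OCF resource agent exit codes.
OCF_SUCCESS = 0
OCF_NOT_RUNNING = 7


def monitor(server: dict, floating_ip: dict) -> int:
    """Hypothetical monitor check: compare the server id stored in the
    floating IP with the id of the server the agent believes it is."""
    # Assumed shape: floating_ip["server"] is the assigned server id or None.
    assigned_to: Optional[int] = floating_ip.get("server")
    if assigned_to == server["id"]:
        return OCF_SUCCESS
    # Both lookups succeeded, but the API says the IP belongs elsewhere.
    return OCF_NOT_RUNNING
```

In this sketch, failures while fetching the server or the floating IP would be raised before `monitor` is called, so "not running" can only come from an id mismatch.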
Errors while retrieving either the server or the floating IP are reported as different errors. So the only way I can see the fault lying in the agent is if it identified the wrong server as its own representation in the API. Currently the agent iterates over the servers in the API and checks whether each server's public IP address is present on the machine it is running on. I can't see a false positive happening here.
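The identification step described above could be sketched like this in Python. Again, this is only an illustration under assumptions, not the agent's real implementation: `find_own_server`, the `public_ip` key, and passing the local addresses in as a set are all hypothetical.

```python
from typing import Iterable, Optional, Set


def find_own_server(servers: Iterable[dict], local_ips: Set[str]) -> Optional[dict]:
    """Hypothetical lookup: return the first API server whose public IP
    is configured on the local machine, or None if there is no match."""
    for server in servers:
        # Assumed shape: each server dict has a "public_ip" string.
        if server.get("public_ip") in local_ips:
            return server
    return None


# Example: the machine has 203.0.113.10 configured on one of its interfaces.
servers = [
    {"id": 1, "public_ip": "203.0.113.9"},
    {"id": 2, "public_ip": "203.0.113.10"},
]
own = find_own_server(servers, {"127.0.0.1", "203.0.113.10"})
```

A false positive here would require another server's public IP to be configured on this machine, which is why a mismatch in the API data looks like the more likely cause.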
Since it happens only rarely and without a discernible trigger, I am currently guessing that it is in fact the Hetzner API that is returning wrong data. I am open to other ideas, though.
The ocf-resource runs fine, sometimes for a week without any trouble. But then this error suddenly occurred.
When this error shows up, it occurs more than once. What is the problem? Hetzner said everything is running without a problem.