openbaton / generic-vnfm

Repository containing the source code of the generic VNFM
Apache License 2.0

CentOS 7 Zabbix Agent Installation Failure #51

Closed: wittling closed this issue 6 years ago

wittling commented 6 years ago

I have done the debugging to figure out why this is failing, as I explained on Gitter.

At least on my CentOS 7 VMs (uname output below):

Linux 367-dfl2-cent7.localdomain 3.10.0-514.16.1.el7.x86_64 #1 SMP Wed Apr 12 15:04:24 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

In the script user-data.sh, the following code is being used to determine the release:

yum install -y /lsb-release
OS_DISTRIBUTION_RELEASE_MAJOR=$( lsb_release -a | grep "Release:" | awk -F'\t' '{ print $2 }' | awk -F'.' '{ print $1 }' )

I could not find the lsb-release package in the yum repository (I searched for lsb-release, lsb_release and lsb), and when I tested the install, it failed. This has a cascading effect: the release is not set, the ensuing repository call fails, and the calls to yum -y install zabbix zabbix-agent PARTLY fail.
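One way to sidestep the lsb_release dependency is to read the release from /etc/os-release, which is present on systemd-based distributions such as CentOS 7. The following is only a sketch under that assumption, not what user-data.sh does; the function name os_major_release and its optional path argument are hypothetical and exist so the parsing can be tried against a sample file.

```shell
#!/usr/bin/env bash
# Sketch: derive the major release number from an os-release style file
# instead of calling lsb_release. The function name and the optional path
# argument are illustrative and are not part of user-data.sh.
os_major_release() {
    local file="${1:-/etc/os-release}"
    local version_id
    # VERSION_ID looks like VERSION_ID="7" on CentOS 7 or VERSION_ID="16.04" on Ubuntu.
    version_id=$(sed -n 's/^VERSION_ID=//p' "$file" | tr -d '"' | head -n 1)
    # Keep only the part before the first dot.
    printf '%s\n' "${version_id%%.*}"
}
```

It could then be used as OS_DISTRIBUTION_RELEASE_MAJOR=$(os_major_release), with no extra package to install first.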

The zabbix install works because there is a zabbix package in the default repository. But this installs the server (which I am not sure is even necessary), and the version (at the moment) is always 2.2 of the server. The zabbix-agent install fails, and so does the ensuing configuration of its conf file.

Net effect is that the VM just has an instance of zabbix - unconfigured and not enabled or configured to start, which is probably a good thing because there could be (and should be) other zabbix servers. But the zabbix agent is never installed, never configured, and therefore never registers as an active host to the server, which of course causes everything depending on Zabbix to fail (monitoring, FM, ASE, Slicing, et al).

I will need to fix this in my testing and can fork off and propose a fix to this, if you'd like.

wittling commented 6 years ago

In testing this today, I am realizing that I see code in both the CloudInit as well as the user-data.sh script.

IMO you really should not be installing Zabbix in both of these - just one or the other. Personally, if it were me, I would install the ems service in the CloudInit, and leave the installation and configuration of Zabbix up to the user-data script.

There are some things I have found that I need to do to get Zabbix Agent working smoothly on CentOS. First, I install version 3.2 of the package (CentOS 7). I also at times need to restart the agent (the first time) to avoid an inexplicable Host Not Found error, which always goes away after a restart. It is these quirks of Zabbix that I think make it more effective to install and configure it in a script rather than in CloudInit.

lorenzotomasini commented 6 years ago

Hi @wittling

regarding the issue related with lsb-release:

In the script user-data.sh, the following code is being used to determine the release:

yum install -y /lsb-release
OS_DISTRIBUTION_RELEASE_MAJOR=$( lsb_release -a | grep "Release:" | awk -F'\t' '{ print $2 }' | awk -F'.' '{ print $1 }' )

can you confirm that the fix made by @flaviomu solved that issue?


regarding user-data and CloudInit:

I am realizing that I see code in both the CloudInit as well as the user-data.sh script.

I don't understand where you see that code twice. The Generic VNFM pushes the user-data.sh script to the cloud-init of the chosen image, in your case a clean CentOS 7. So the cloud-init of the VM runs that user data, in which we need to determine whether it's an Ubuntu or CentOS VM in order to run different install scripts. We use the user data mainly to install the EMS.
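The Ubuntu-versus-CentOS decision described above could be sketched as follows. This is an assumption about one possible approach, not the logic in user-data.sh: it reads the ID field of /etc/os-release, and the function name os_family and the returned labels are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: classify the distribution from the ID field of an os-release
# style file. The function name and the "debian"/"redhat" labels are
# illustrative; user-data.sh may decide this differently.
os_family() {
    local file="${1:-/etc/os-release}"
    local id
    # ^ID= matches only the ID line, not ID_LIKE= or VERSION_ID=.
    id=$(sed -n 's/^ID=//p' "$file" | tr -d '"' | head -n 1)
    case "$id" in
        ubuntu|debian)      echo "debian" ;;
        centos|rhel|fedora) echo "redhat" ;;
        *)                  echo "unknown" ;;
    esac
}
```

A caller could branch on the result, for example running an apt-get based install for "debian" and a yum based install for "redhat".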

Zabbix agent is installed only if the monitoring IP is set. I agree that this part should be improved, so that different monitoring systems can be enabled by default and these operations done somewhere else. Even if it seems a simple change, it's a complex decision to make, and it will surely be taken into account in the next releases.

wittling commented 6 years ago

I didn't mean I saw the "same" code twice in the literal sense. I was referring to the fact that the cloud-config.yaml file installs the zabbix-agent as part of cloud-init.

https://github.com/openbaton/generic-vnfm/blob/master/src/main/resources/cloud-config.yaml#L7

Cloud-init sees that directive and installs those packages, right? Although I am not sure which version it installs. THEN - in user-data.sh it has logic to examine the major version of CentOS and install zabbix (in your earlier version at least) and zabbix-agent. Installing after previous installation can have some issues if you don't check for existing / prior installations of the package and decide to either leave them alone or remove and reinstall.
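The check for a prior installation mentioned above could look something like this on an RPM-based system. It is a minimal sketch, assuming rpm and yum are available; the function name ensure_package is hypothetical, and the "leave it alone" policy is just one of the two options described.

```shell
#!/usr/bin/env bash
# Sketch: install a package only if it is not already present.
# ensure_package is an illustrative name; the policy here is to leave an
# existing installation untouched rather than remove and reinstall it.
ensure_package() {
    local pkg="$1"
    # rpm -q exits with 0 when the package is installed, non-zero otherwise.
    if rpm -q "$pkg" >/dev/null 2>&1; then
        echo "$pkg already installed, leaving it alone"
    else
        yum install -y "$pkg"
    fi
}
```

Calling ensure_package zabbix-agent would then be safe to run whether or not an earlier step already installed the agent.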

Then I see the same sed replacement logic (and zabbix agent restart) in the cloud-config.yaml file that I see in the user-data.sh, so are we not running sed replaces twice - once at Cloud Init which happens initially and then again when user-data.sh script is run?

So yesterday when I was testing this, I took the zabbix-agent package out of the yaml file, along with the sed replace. Then I used your user-data.sh script, except that I tweaked it a bit to ONLY install zabbix agent (not zabbix). I also changed it to install version 3.2 of the zabbix agent instead of 3.0. And I changed the sed replacement a little, because the strings you are replacing also appear in the comments of that file in addition to the actual parameter, so the sed I used takes a little more care to adjust just the actual parameter and not the commented lines.
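The exact sed used is not shown in this thread. As a rough sketch of the idea, anchoring the pattern at the start of the line changes only the active parameter and skips commented examples such as "# Server=". The function name set_conf_param and its arguments are hypothetical, and the value is assumed not to contain "|", "&" or backslashes.

```shell
#!/usr/bin/env bash
# Sketch: set a parameter in a zabbix_agentd.conf-style file, touching only
# the active "Key=value" line and leaving commented lines alone.
# Function and argument names are illustrative.
set_conf_param() {
    local file="$1" key="$2" value="$3"
    # ^key= anchors the match at the start of the line, so "# key=..." is
    # skipped, and the trailing "=" keeps Server from matching ServerActive.
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "${file}.tmp" &&
        mv "${file}.tmp" "$file"
}
```

For example, set_conf_param /etc/zabbix/zabbix_agentd.conf Server 10.0.0.5 would rewrite the Server= line only; the path and address here are placeholders.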

But it does appear that I have it working nicely now. Today I plan to compile new versions of FMS and AES and give it a full test.

lorenzotomasini commented 6 years ago

I didn't mean I saw the "same" code twice in the literal sense. I was referring to the fact that the cloud-config.yaml file installs the zabbix-agent as part of cloud-init. https://github.com/openbaton/generic-vnfm/blob/master/src/main/resources/cloud-config.yaml#L7 Cloud-init sees that directive and installs those packages, right?

Nope, this file is not being used, as you can see from the code. We put it there when we tried to move to a full cloud config instead of user data, in order to remove these OS-dependent instructions. Unfortunately we were not able to make it fully OS-independent, but we also did not want to waste the work done. This file is just a copy of the user data that will be reused when we are finally able to move to cloud config. Regarding the version it installs, I would look in the cloud config documentation, because I don't know.

But it does appear that I have it working nicely now. Today I plan to compile new versions of FMS and AES and give it a full test.

I will then close this issue