BaldMansMojo / check_vmware_esx

check_vmware_esx: fork of check_vmware_api.pl
GNU General Public License v2.0

Sessionfile does not get refreshed #22

Closed by widhalmt 10 years ago

widhalmt commented 10 years ago

Hi!

I'm using the sessionfile for various checks at a customer's site. Sometimes the session seems to become invalid, and then all checks go critical because they can't log in any more.

I haven't looked at the code, but shouldn't the plugin drop the sessionfile when it can't log in and use the login data provided to create a new sessionfile?
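The fallback asked about here could look roughly like the following. This is a minimal Python sketch of the idea, not the plugin's actual Perl code: `SessionInvalid`, `load_session`, `login` and `save_session` are hypothetical names standing in for whatever the VMware SDK really provides.

```python
import os


class SessionInvalid(Exception):
    """Hypothetical error raised when a saved session is rejected."""


def connect(sessionfile, load_session, login, save_session):
    """Reuse a saved session if possible, otherwise log in again.

    load_session, login and save_session are hypothetical callables
    standing in for the SDK; they are not real API names.
    """
    if os.path.exists(sessionfile):
        try:
            return load_session(sessionfile)
        except SessionInvalid:
            # Stale session: drop the file and fall through to a fresh login.
            os.remove(sessionfile)
    session = login()
    save_session(session, sessionfile)
    return session
```

With something like this, a stale sessionfile would cost one extra login instead of leaving every check critical until someone deletes the file by hand.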

I'm using the "$HOSTADDRESS$" macro as the name for the sessionfile because the VMware guy complained that there are far too many sessions when I stick to the documentation and use a combination of hostname and servicename. Could that be the cause of the issue?

Thanks for your help. Kind regards, Thomas

BaldMansMojo commented 10 years ago

Hi Thomas,

thinking about the problem led me to the conclusion that using a sessionfile is a bad idea, although it looked good at first. Let's drill down into the problem. The VMware Perl SDK was originally written for little command-line tools like the samples. In that case a sessionfile makes sense: you have a series of actions that can be handled in one session instead of many sessions. And in that case you need only one sessionfile.

But if you look at monitoring, you have parallel actions instead of serial actions, and every action means a session. So you can't use the same sessionfile for one host, because (as mentioned) the actions run in parallel and each session must be unique. (I think the original plugin, with a fixed sessionfile name handed over via the command line, wouldn't work either.)

So your VMware guy is right. You have lots of sessions. 15 checks per host in a smaller installation of 20 vSphere hosts means 300 sessions. A nightmare when done via the vCenter.
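The arithmetic behind that number, as a small Python sketch. The host names, check names and the `host_check` file naming are made up for illustration; only the counts (15 checks, 20 hosts) come from the comment above.

```python
def session_files(hosts, checks, per_service=True):
    """Return the set of sessionfile names a naming scheme produces."""
    if per_service:
        # One file per host/service combination (hypothetical name format).
        return {f"{host}_{check}" for host in hosts for check in checks}
    # One file per host, e.g. when only $HOSTADDRESS$ is used as the name.
    return set(hosts)


hosts = [f"esx{n:02d}" for n in range(1, 21)]     # 20 hypothetical hosts
checks = [f"check{n:02d}" for n in range(1, 16)]  # 15 hypothetical checks

print(len(session_files(hosts, checks)))                     # 300
print(len(session_files(hosts, checks, per_service=False)))  # 20
```

Naming the file after the host alone cuts the count to 20, but then parallel checks against the same host share one file, which is exactly the case described above as not working.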

And totally unusable with mod_gearman and distributed worker processes.

I think it's best to kick out the sessionfile login or place a harsh warning in the help. The only way to use the sessionfile (I think) is to put all the checks in a shell script (started by cron), submit the results as passive check results to Nagios/Icinga/whatever, and let the script do the serialization.
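A rough sketch of such a serializing wrapper, written in Python rather than shell for brevity. It assumes the results are submitted with the Nagios `PROCESS_SERVICE_CHECK_RESULT` external command; the function names are hypothetical, and the plugin command lines and the already-opened command file (`sink`) have to be supplied by the caller.

```python
import subprocess
import time


def passive_result(host, service, return_code, output, now=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line."""
    timestamp = int(time.time() if now is None else now)
    # Keep only the first line of plugin output so the command stays on one line.
    stripped = output.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    return (f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{return_code};{first_line}\n")


def run_serially(checks, sink):
    """Run checks one after another and submit each result passively.

    checks is a list of (host, service, argv) tuples, where argv is the
    full plugin command line. sink is any writable text file object,
    assumed here to be the opened Nagios external command file.
    """
    for host, service, argv in checks:
        proc = subprocess.run(argv, capture_output=True, text=True)
        sink.write(passive_result(host, service, proc.returncode, proc.stdout))
        sink.flush()
```

Because the loop runs one check at a time, all checks for a host could share a single sessionfile. The price is that scheduling moves out of the monitoring system and into cron.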

But I do not think that's feasible. What do you think?

Kind regards, Martin

widhalmt commented 10 years ago

Thanks for the background information. Unfortunately, I'm not enough of a VMware guy to know the internals of ESXi.

I used the sessionfile to reduce load and log events on the server in the first place. I had implemented the checks without a sessionfile but was told by the VMware people that this causes a login on the hosts every time a check is done, and every login gets an extra event in the logfiles. Since they are not using logstash (yet ;-) ), they tell me they can't use the logs as much as they would like because they get flooded with login entries.

So I tried using the sessionfile. For quite some time the VMware folks were happy, but now and then all checks fail and one has to delete the sessionfile. Since I could not talk to the VMware admins today, I had to rely on information from the monitoring team, who were not totally sure whether there had been a reboot of the vCenter every time this happened. In that case the reboot would be the cause of the stale session, not vice versa. At least it seems to happen more often on a testing ESXi farm, where the vCenter can be rebooted at any time, than on production servers.

To sum up: the sessionfile, even with one file for every check on the vCenter, works quite well. There is just one thing that breaks it from time to time and that I still cannot figure out. I will try to find out whether some change happens whenever it breaks.

BaldMansMojo commented 10 years ago

Maybe there is a way to check whether the session is still in use or not. But for this I have to dive into the depths of the Perl SDK implementation. I think it's based on libwww-perl. But I don't know when I will have the time to do it, because my colleague here at work has just become a father, so he is at home for some weeks and I have to do his share of the work too. :-(

BaldMansMojo commented 10 years ago

Fixed. See README and history. Martin