Closed tolecnal closed 2 years ago
Log lines and error messages are from the vspheredb module, not vsphere - but vspheredb has no version 1.1.1. Could you please re-check your versions, and also let me know the version of your incubator module?
The web error (Unable to connect to unix domain socket) on the daemon page tells us that the daemon isn't running, which is confirmed by your log lines. They say: Got no proc dir (/proc/1504) for remote node.
Is SELinux or anything similar active? Is the web UI running on the very same host/container, or do you somehow connect from outside a dedicated container?
The following MariaDB error suggests that it terminated in an unclean way, which shouldn't happen. But let's address those errors step by step, we'll track this down.
Ah yes, I guess I was a bit cross-eyed - we are talking about the vspheredb module, and the currently installed version is 1.4.0.
As for incubator, it is running version 0.18.0, and director is running 1.10.0.
The server is set up with AppArmor, which I thought might be interfering, but I disabled AppArmor and tested without it running, and got the same errors. Everything is running on the same server, no containers. Icinga2 and icingaweb2 are installed using the official Icinga repositories, and all modules have been installed using the git method.
Just upgraded both modules, and the unix daemon error persists. However I am now seeing a new error in regards to the database:
06:57:44: [configwatch] Sending DB Config failed: SQLSTATE[HY000] [2002] No such file or directory in /usr/share/icingaweb2/library/vendor/Zend/Db/Adapter/Pdo/Abstract.php(145)
I performed an apt reinstall icingaweb2, which fixed the error about Abstract.php. Strange that it would be missing.
The new logs from syslog look like this:
Oct 12 05:45:51 icinga icinga-vspheredb[14292]: [configwatch] DB configuration loaded
Oct 12 05:45:51 icinga icinga-vspheredb[14292]: [db] sending DB config to child process
Oct 12 05:45:51 icinga icinga-vspheredb[14292]: [db] Running DB cleanup (this could take some time)
Oct 12 05:45:51 icinga icinga-vspheredb[14292]: [db] DB has been cleaned up
Oct 12 05:45:51 icinga icinga-vspheredb[14292]: [localdb] ready
Oct 12 05:45:51 icinga icinga-vspheredb[14292]: [api] launching server 1: vCenterId=1: https://username@some.vspherehost.example.com
Oct 12 05:45:51 icinga icinga-vspheredb[14292]: [api some.vspherehost.example.com (id=1)] Logged out
Oct 12 05:45:52 icinga icinga-vspheredb[14292]: [api pcc-some.vspherehost.example.com (id=1)] Cookies changed, storing new ones
Oct 12 05:46:16 icinga icingacli[14292]: ERROR: RuntimeException in /usr/share/icingaweb2/modules/incubator/vendor/gipfl/socket/src/UnixSocketInspection.php:64 with message: Got no proc dir (/proc/14077) for remote node
Oct 12 05:46:16 icinga systemd[1]: icinga-vspheredb.service: Main process exited, code=exited, status=1/FAILURE
Oct 12 05:46:16 icinga mariadbd[1166]: 2022-10-12 5:46:16 646 [Warning] Aborted connection 646 to db: 'vspheredb' user: 'vspheredb' host: 'localhost' (Got an error reading communication packets)
Oct 12 05:46:16 icinga systemd[1]: icinga-vspheredb.service: Failed with result 'exit-code'.
Oct 12 05:46:16 icinga systemd[1]: icinga-vspheredb.service: Consumed 2.313s CPU time.
There is clearly something wrong with accessing your /proc filesystem, at least that's what I'm able to read from this message, combined with your description (no SELinux/AppArmor). But please don't ask me why this happens :D
To catch this error, please apply the following patch:
--- a/library/Vspheredb/Daemon/RemoteApi.php
+++ b/library/Vspheredb/Daemon/RemoteApi.php
@@ -100,7 +100,16 @@ class RemoteApi implements EventEmitterInterface
         $jsonRpc = new JsonRpcConnection(new StreamWrapper($connection));
         $jsonRpc->setLogger($this->logger);
 
-        $peer = UnixSocketInspection::getPeer($connection);
+        try {
+            $peer = UnixSocketInspection::getPeer($connection);
+        } catch (Exception $e) {
+            $jsonRpc->setHandler(new FailingPacketHandler(Error::forException($e)));
+            $this->loop->addTimer(3, function () use ($connection) {
+                $connection->close();
+            });
+            return;
+        }
+
         if (!$this->isAllowed($peer)) {
             $jsonRpc->setHandler(new FailingPacketHandler(new Error(Error::METHOD_NOT_FOUND, sprintf(
                 '%s is not allowed to control this socket',
This will not fix access to your proc filesystem, but at least the daemon will continue to run when the web tries to access its socket.
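For readers who want to see what kind of check can produce "Got no proc dir (/proc/<pid>) for remote node", here is a minimal Python sketch. It assumes, based only on the error message, that the inspection resolves the PID of the process on the other end of the Unix socket and then looks for its /proc/<pid> directory. I have not verified how gipfl's UnixSocketInspection does it internally. The function names peer_credentials and peer_proc_dir_visible are made up for illustration, and SO_PEERCRED is Linux-only.

```python
import os
import socket
import struct


def peer_credentials(conn):
    """Return (pid, uid, gid) of the peer of a connected Unix socket (Linux only)."""
    # struct ucred holds three integers: pid, uid, gid
    size = struct.calcsize("3i")
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, size)
    return struct.unpack("3i", raw)


def peer_proc_dir_visible(pid):
    """True if this process can see /proc/<pid> for the given PID."""
    # If /proc hides other users' processes, this is False even though
    # the peer process exists and is connected.
    return os.path.isdir("/proc/%d" % pid)
```

If the daemon and the web server run as different users and /proc hides foreign processes, peer_proc_dir_visible() would return False for a perfectly valid peer, which matches the symptom seen in this thread.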
Applied the patch, and that does indeed ensure that the daemon is kept running, even though it fails.
I decided to dig further, as it is clear that something is blocking access to the /proc file system. I retraced my steps and remembered that I had applied some OS hardening. Going through the steps that hardening takes, one of them is adding the option hidepid=2 to /etc/fstab. This option limits access to the /proc file system: with hidepid=2, users can only see the /proc/<pid> directories of their own processes. Further information can be found here: https://linux-audit.com/linux-system-hardening-adding-hidepid-to-proc/
I then removed this option from /etc/fstab and rebooted the system. The module was now able to access the /proc file system, so one step closer to something else :)
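To check for this condition without reading /etc/fstab by hand, one can parse the mount table. The following is a small Python sketch and not part of the module. The helper names proc_mount_options and hidepid_enabled are hypothetical. It assumes the usual /proc/mounts layout (device, mount point, type, options), and that hidepid may appear either as a number or, on newer kernels, as a name such as "invisible".

```python
def proc_mount_options(mounts_text):
    """Parse /proc/mounts-style text and return the options of the /proc mount as a dict."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass
        if len(fields) >= 4 and fields[1] == "/proc" and fields[2] == "proc":
            options = {}
            for item in fields[3].split(","):
                key, _, value = item.partition("=")
                options[key] = value
            return options
    return {}


def hidepid_enabled(options):
    """True if the parsed options contain a hidepid value other than 0/off."""
    return options.get("hidepid", "0") not in ("0", "off")


# Usage on a live system:
#   with open("/proc/mounts") as fh:
#       print(hidepid_enabled(proc_mount_options(fh.read())))
```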
However, I now saw that it was not able to access the socket file, as it said it no longer existed. I thought that /etc/tmpfiles.d/icinga-vspheredb.conf was supposed to sort this out. So I manually created a new socket and confirmed with file that it was indeed a socket file, and not a regular file. I restarted vspheredb and monitored the log files, where we now see this:
Oct 12 07:23:26 icinga systemd[1]: Starting Icinga vSphereDB Daemon...
Oct 12 07:23:26 icinga systemd[1]: Started Icinga vSphereDB Daemon.
Oct 12 07:23:26 icinga icingacli[4319]: ERROR: ErrorException in /usr/share/php/Icinga/Application/ClassLoader.php:303 with message: require(/usr/share/icingaweb2/modules/vspheredb/library/Vspheredb/Daemon/RemoteApi.php): Failed to open stream: Permission denied
Oct 12 07:23:26 icinga systemd[1]: icinga-vspheredb.service: Main process exited, code=exited, status=1/FAILURE
Oct 12 07:23:26 icinga systemd[1]: icinga-vspheredb.service: Failed with result 'exit-code'.
Oct 12 07:23:57 icinga systemd[1]: icinga-vspheredb.service: Scheduled restart job, restart counter is at 33.
Oct 12 07:23:57 icinga systemd[1]: Stopped Icinga vSphereDB Daemon.
Oct 12 07:23:57 icinga systemd[1]: Starting Icinga vSphereDB Daemon...
Oct 12 07:23:57 icinga systemd[1]: Started Icinga vSphereDB Daemon.
Oct 12 07:23:57 icinga icingacli[4371]: ERROR: ErrorException in /usr/share/php/Icinga/Application/ClassLoader.php:303 with message: require(/usr/share/icingaweb2/modules/vspheredb/library/Vspheredb/Daemon/RemoteApi.php): Failed to open stream: Permission denied
Oct 12 07:23:57 icinga systemd[1]: icinga-vspheredb.service: Main process exited, code=exited, status=1/FAILURE
Oct 12 07:23:57 icinga systemd[1]: icinga-vspheredb.service: Failed with result 'exit-code'.
I have verified that the socket file has the same permissions as defined in /etc/tmpfiles.d/icinga-vspheredb and also tested with chmod 0777, without that making any difference.
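The same check that file and ls perform on the socket can be scripted. Below is a minimal Python sketch, where describe_socket is a hypothetical helper name. Note that it only reports the file type, mode and ownership. It says nothing about whether a process is actually listening on the socket.

```python
import os
import stat


def describe_socket(path):
    """Return file type, permission bits and ownership for the given path."""
    st = os.lstat(path)
    return {
        "is_socket": stat.S_ISSOCK(st.st_mode),   # True only for Unix socket files
        "mode": oct(stat.S_IMODE(st.st_mode)),    # permission bits as an octal string
        "uid": st.st_uid,
        "gid": st.st_gid,
    }
```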
... and as it turned out, when I used git apply <patch>, it had reverted the permissions of the file vspheredb/library/Vspheredb/Daemon/RemoteApi.php to 0640, which was too restrictive. Fixing the permissions on the file now yields a healthy vSphereDB Daemon Status.
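A quick way to spot this kind of problem after patching is to list the files in the module directory that are not readable by "other". This is a Python sketch with a hypothetical helper name, files_not_world_readable. It assumes the daemon user reads the module files through the "other" permission bits. If access is granted through group membership instead, the check would need to look at the group bit and the file's group.

```python
import os
import stat


def files_not_world_readable(root):
    """Yield (path, octal mode) for files under root that lack the 'other' read bit."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mode = stat.S_IMODE(os.lstat(path).st_mode)
            if not mode & stat.S_IROTH:
                yield path, oct(mode)
```

Running it against /usr/share/icingaweb2/modules/vspheredb would have flagged RemoteApi.php with mode 0o640 in the situation described above.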
However the fact that the socket file disappeared on reboot and was not recreated is a bit worrying.
After two reboots since creating the socket file manually, we seem to be on safe ground.
So to summarize, the original issue was caused by OS hardening where the /proc file system was mounted with the option hidepid=2, which effectively restricts permissions to the file system. This can be overridden if so desired by limiting access to a specific group, or disabled completely by removing it from /etc/fstab.
The issue can be closed - but you might consider including your patch into master for future users with similar problems.
Thank you @tolecnal for letting me know. The patch catching this error condition has been pushed and will be released with the next version.
Description
After installing the latest version of the vsphere module (ref: v1.1.1), I am able to complete the database setup and initial installation, and add a vsphere host which is able to retrieve information about the deployed hosts.
However when I go into vSphere Daemon Status it complains that it is unable to access the Unix Domain Socket. I have verified that the user running the module through icingacli has permissions on the socket file descriptor, and even gave it global read/write permissions (ref: 0666).
Even with global permissions, the daemon status page states the following:
If I look at the logs, in this case syslog I also see the following:
It seems to be an issue with handling of the unix domain socket through the IPL module, however I can't make out what the issue is, not knowing the IPL module.
Tried with both nginx and Apache as the front end.
System information:
Ubuntu 22.04.1 LTS
PHP 8.1.2
MariaDB 10.6.7-2ubuntu1.1
Icinga2 2.13.5-1.jammy
Icingaweb2 2.11.1-1.jammy