Closed jnc74743 closed 10 months ago
Using CCM as a method to obtain VM status leads to permission issues within condor_startd:
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: WN_HEALTHCHECK: Uncaught exception!!! Calling stack is:
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: WN_HEALTHCHECK: LC::Exception::throw_error called at /usr/lib/perl/EDG/WP4/CCM/CacheManager.pm line 14
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: Return from pipe Handler
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: Calling pipe Handler <Standard Error Handler> for Pipe end=65538 <Standard Error>
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: WN_HEALTHCHECK: 1
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: WN_HEALTHCHECK: EDG::WP4::CCM::CacheManager::_check_type called at /usr/lib/perl/EDG/WP4/CCM/CacheManager.pm line 116
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: WN_HEALTHCHECK: EDG::WP4::CCM::Cache
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: Return from pipe Handler
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: Calling pipe Handler <Standard Error Handler> for Pipe end=65538 <Standard Error>
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: WN_HEALTHCHECK: Manager::new called at /usr/lib/perl/EDG/WP4/CCM/Options.pm line 192
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: WN_HEALTHCHECK: EDG::WP4::CCM::Options::setCCMConfig called at /usr/lib/p
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: Return from pipe Handler
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: Calling pipe Handler <Standard Error Handler> for Pipe end=65538 <Standard Error>
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: WN_HEALTHCHECK: erl/EDG/WP4/CCM/Options.pm line 225
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: WN_HEALTHCHECK: EDG::WP4::CCM::Options::getCCMConfig called at /usr/lib/perl/EDG/WP4/CCM/CLI.pm line 108
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: WN_HEALTHCHECK:
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: Return from pipe Handler
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: Calling pipe Handler <Standard Error Handler> for Pipe end=65538 <Standard Error>
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: WN_HEALTHCHECK: EDG::WP4::CCM::CLI::action_show called at /usr/lib/perl/EDG/WP4/CCM/Options.pm line 381
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: WN_HEALTHCHECK: EDG::WP4::CCM::Options::action called
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: Return from pipe Handler
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: Calling pipe Handler <Standard Error Handler> for Pipe end=65538 <Standard Error>
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: WN_HEALTHCHECK: at /usr/sbin/ccm line 50
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: WN_HEALTHCHECK: *** No permission for data directory (directory /var/lib/ccm/data)
Jan 08 16:51:19 host-172-16-114-104 condor_startd[19007]: Return from pipe Handler
Jan 08 16:51:19 host-172-16-114-104 condor_master[18968]: enter Daemons::CheckForNewExecutable
Jan 08 16:51:19 host-172-16-114-104 condor_master[18968]: Time stamp of running /usr/sbin/condor_master: 1695903252
I propose we either revert to parsing the motd or fix the CCM permissions.
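To confirm the permission problem before choosing, something like the following could be run as the condor user. This is only a rough sketch: the directory path comes from the log above, while the helper name `can_read_dir` is made up for illustration. Note that `os.access` checks against the real UID/GID of the process.

```python
import os


def can_read_dir(path: str) -> bool:
    """Return True if `path` is a directory the current user can list and traverse."""
    # R_OK is needed to list entries, X_OK to traverse into the directory.
    return os.path.isdir(path) and os.access(path, os.R_OK | os.X_OK)
```

Calling `can_read_dir("/var/lib/ccm/data")` as the condor user should return `False` if the health check is failing for the reason the log suggests.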
virt-what then?
Unfortunately, one of the dependencies for virt-what (libvirt) requires root privileges. We shouldn't give the condor user this level of permission, so I think querying the metadata endpoint for OpenStack objects should work for what we need.
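A minimal sketch of what the metadata query might look like, assuming the VM can reach the standard OpenStack metadata service at the link-local address `169.254.169.254`. The function name, the timeout value, and the injectable `opener` parameter are illustrative choices, not part of the existing health check. No root privileges are needed for this request.

```python
import json
import urllib.request

# Standard link-local address of the OpenStack metadata service (assumed reachable).
METADATA_URL = "http://169.254.169.254/openstack/latest/meta_data.json"


def fetch_openstack_metadata(url=METADATA_URL, timeout=2.0,
                             opener=urllib.request.urlopen):
    """Return the instance metadata as a dict, or None if it cannot be obtained."""
    try:
        with opener(url, timeout=timeout) as response:
            data = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers URLError and socket timeouts;
        # ValueError covers malformed JSON and bad encodings.
        return None
    # Anything other than a JSON object is treated as "no usable metadata".
    return data if isinstance(data, dict) else None
```

A health check could then treat a `None` result as "not an OpenStack VM, or metadata unreachable" and otherwise read fields such as `uuid` from the returned dict. The short timeout keeps the check from blocking condor_startd when the endpoint is not available.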
fatal_ext: spelling mistake