quattor / template-library-standard

Apache License 2.0

cvmfs/client feature can set /software/components/profile/env to undefined in certain circumstances #45

Closed apdibbo closed 9 years ago

apdibbo commented 9 years ago

When neither of the two variables in this if statement (features/cvmfs/client.pan#L317) exists, the following occurs on build:

Buildfile: /opt/aquilon/etc/build.xml
-verify.object.profile:
delete.object.dep:
compile.object.profile:
     [panc] 1/1 template(s) being processed
     [panc] validation error [/var/quattor/cfg/domains/workerplay/profiles/vm47.nubes.stfc.ac.uk.tpl]
     [panc] element at /{ components, { accounts, { active, true, dependencies, { pre, [ spma ] }, dispatch, true, groups, { nagios, { gid, 201 } }, kept_groups, { adm, , amanda, , amandabackup, , apache, , apacheds, , bin, , cimsrvr, , daemon, , dhcp, , dhcpd, , dip, , disk, , exim, , floppy, , ftp, , games, , gdm, , gopher, , haldaemon, , ident, , kmem, , ldap, , lemon, , lock, , lp, , mail, , mailnull, , man, , mem, , mysql, , nagios, , news, , nfsnobody, , nginx, , nobody, , nscd, , ntp, , oprofile, , pcap, , pegasus, , plugdev, , postdrop, , postfix, , postgres, , quagga, , radiusd, , radvd, , root, , rpc, , rpcuser, , rpm, , saslauth, , services, , sindes, , slocate, , smmsp, , squid, , sshd, , stap-server, , stapdev, , stapusr, , sys, , tcpdump, , tomcat, , tty, , usb, , users, , utmp, , uucp, , uuidd, , vboxusers, , vcsa, , webalizer, , wheel, , wine, , xfs,  }, kept_users, { abrt, , adm, , amanda, , amandabackup, , apache, , apacheds, , avahi, , avahi-autoipd, , bin, , daemon, , dbus, , dhcp, , dhcpd, , dialout, , distcache, , exim, , ftp, , games, , gdm, , gopher, , haldaemon, , halt, , hpglview, , ident, , ldap, , lemon, , lp, , mail, , mailnull, , man, , mysql, , nagios, , named, , news, , nfsnobody, , nginx, , nobody, , nscd, , nslcd, , ntp, , operator, , oprofile, , pcap, , pegasus, , postfix, , postgres, , quagga, , radiusd, , radvd, , root, , rpc, , rpcuser, , rpm, , saslauth, , screen, , services, , shutdown, , sindes, , smmsp, , squid, , sshd, , stap-server, , sync, , tcpdump, , tomcat, , tss, , utempter, , uucp, , uuidd, , vcsa, , webalizer, , xfs,  }, preserved_accounts, dyn_user_group, remove_unknown, false, rootpwd, $6$PDQDq/SO$UIclajVXXXLYlYWywVIfRxaRPvCRUfKTf7FsYOrzOZY3pQuWWISxITZf6EOCET3QjPazO0CpVyFVZVFcK7OiG0, shadowpwd, true, users, { nagios, { comment, nagios, groups, [ nagios ], homeDir, /home/tier1/nagios, shell, /bin/sh, uid, 201 } }, version, 14.10.0 }, altlogrotate, { active, true, configDir, /etc/logrotate.d, configFile, 
/etc/logrotate.conf, dependencies, { pre, [ spma ] }, dispatch, true, entries, { cernvmfs-fsck, { compress, true, create, true, frequency, weekly, ifempty, true, missingok, true, pattern, /var/log/cvmfs-fsck.log, rotate, 2 }, fetch-crl-cron, { compress, true, create, true, frequency, monthly, ifempty, true, missingok, true, pattern, /var/log/fetch-crl-cron.ncm-cron.log, rotate, 12 }, global, { compress, true, create, true, frequency, weekly, global, true, include, /etc/logrotate.d, rotate, 4 }, wtmp, { create, true, createparams, { group, utmp, mode, 0664, owner, root }, frequency, monthly, global, true, pattern, /var/log/wtmp, rotate, 1 } }, version, 14.10.0 }, authconfig, { active, true, dependencies, { pre, [ spma ] }, dispatch, true, enableforcelegacy, false, method, { files, { enable, true }, nis, { domain, csf, enable, true, servers, [ nis0.gridpp.rl.ac.uk, nis1.gridpp.rl.ac.uk, nis2.gridpp.rl.ac.uk ] } }, passalgorithm, sha512, safemode, false, startstop, true, usecache, true, usemd5, true, useshadow, true }, autofs, { active, true, dispatch, true, maps, { cvmfs, { enabled, true, mapname, /etc/auto.cvmfs, mountpoint, /cvmfs, preserve, true, type, program }, home-tier1, { enabled, true, entries, { _2a, { location, nfs1.gridpp.rl.ac.uk:/home/tier1/&, options, -rw,intr,rsize=8192,wsize=8192,actimeo=60,addr=130.246.183.1 } }, mapname, /etc/auto.home-tier1, mountpoint, /home/tier1, options, , preserve, false, type, file }, misc, { enabled, true, entries, { kickstart, { location, install.gridpp.rl.ac.uk:/kickstart, options, -ro,soft,rsize=8192,wsize=8192 } }, mapname, /etc/auto.misc, mountpoint, /misc, options, , preserve, false, type, file }, rutherford, { enabled, true, entries, { _2a, { location, &.data.rl.ac.uk:/rutherford/&, options, -intr,retrans=5,timeo=20,rw,hard,nosuid,rsize=8192,wsize=8192 } }, mapname, /etc/auto.rutherford, mountpoint, /rutherford, options, , preserve, false, type, file }, stage, { enabled, true, entries, { _2a, { location, 
&.stage.rl.ac.uk:/stage/&, options, -intr,retrans=5,timeo=20,rw,hard,nosuid,rsize=8192,wsize=8192,acregmin=30,acregmax=180,acdirmin=30,acdirmax=180 } }, mapname, /etc/auto.stage, mountpoint, /stage, options, , preserve, false, type, file } }, preserveMaster, false }, ccm, { active, true, cache_root, /var/lib/ccm, configFile, /etc/ccm.conf, debug, 0, dependencies, { pre, [ spma ] }, dispatch, true, force, 0, get_timeout, 30, lock_retries, 3, lock_wait, 30, profile, http://aquilon.gridpp.rl.ac.uk/profiles/vm47.nubes.stfc.ac.uk.json, retrieve_retries, 3, retrieve_wait, 30, version, 14.10.0, world_readable, 0 }, cdp, { active, true, configFile, /etc/cdp-listend.conf, dependencies, { pre, [ spma ] }, dispatch, true, fetch, /usr/sbin/ccm-fetch, fetch_smear, 30, version, 14.10.0 }, chkconfig, { active, true, dependencies, { pre, [ spma ] }, dispatch, true, service, { acpid, { on, , startstop, true }, atd, { on, , startstop, true }, auditd, { on, , startstop, true }, autofs, { on, , startstop, true }, cdp-listend, { on, , startstop, true }, crond, { on, , startstop, true }, cups, { off,  }, cvmfs, { on, , startstop, false }, edac, { on, , startstop, true }, fetch_2dcrl_2dcron, { on, , startstop, true }, irqbalance, { on, , startstop, true }, kudzu, { off,  }, lldpd, { on, , startstop, true }, ncm-cdispd, { on, , startstop, true }, network, { on, , startstop, true }, nrpe, { on, , startstop, true }, ntpd, { on,  }, psacct, { on,  }, rpcidmapd, { on, , startstop, true }, rsyslog, { on, , startstop, true }, sshd, { on, , startstop, true }, yum, { off, 2345 } } }, cron, { active, true, dependencies, { pre, [ spma ] }, dispatch, true, entries, [ { command, /usr/local/bin/healthcheck_systemtemp, frequency, * * * * *, name, healthcheck_systemtemp, user, root }, { command, sleep 340 && /usr/sbin/pakiti2-client, frequency, @reboot, name, pakiti2-startup, user, root }, { command, /usr/sbin/magdb-discover, name, magdb-discover, timing, { hour, 13, minute, 30, smear, 30 }, user, root 
}, { command, /usr/sbin/fetch-crl  --no-check-certificate --loc /etc/grid-security/certificates -out /etc/grid-security/certificates -a 24 --quiet, frequency, AUTO 3,9,15,21 * * *, name, fetch-crl-cron, user, root }, { command, /usr/local/bin/cvmfs_fsck, comment, Cron to check and fix cvmfs cache integrity, env, {  }, frequency, 00 09 * * *, name, cvmfs-fsck, user, root } ], securitypath, /etc, version, 14.10.0 }, dirperm, { active, true, dependencies, { pre, [ spma ] }, dispatch, true, paths, [ { owner, atlasprg:atlas, path, /pool/atlas/recovery/, perm, 0775, type, d }, { owner, atlasprg:atlas, path, /etc/atlas/, perm, 0775, type, d }, { owner, root:root, path, /etc/cms/PhEDEx, perm, 0755, type, d }, { owner, root:root, path, /etc/cms/JobConfig, perm, 0755, type, d }, { owner, cvmfs:cvmfs, path, /pool/cache/cvmfs2/, perm, 0755, type, d } ], version, 14.10.0 }, filecopy, { active, true, dependencies, { pre, [ spma ] }, dispatch, true, forceRestart, false, services, { _2fetc_2fatlas_2fsetup_2esh_2elocal, { backup, false, config, # This file is for local ATLAS overrides, please modify this via quattor
     [panc]
     [panc] , forceRestart, false, owner, atlasprg:atlas, perms, 0755 }, _2fetc_2fbash_5fcompletion_2ed_2fquattor, { backup, true, config, # Bash completion for unified quattor command wrapper

jouvin commented 9 years ago

Which variable? The line you mention is not an `if` statement... Can you provide an extract of the faulty code?

apdibbo commented 9 years ago

The code snippet is:

'/software/components/profile/env' = {
    if (is_defined(VO_ATLAS_LOCAL_AREA)) {
        SELF['ATLAS_LOCAL_AREA'] = VO_ATLAS_LOCAL_AREA;
    };

jouvin commented 9 years ago

Why do you say the error is triggered by this piece of code? The error is a validation error, and validation happens at a stage where it is not possible to say which line is responsible for the misconfiguration... From the error you mention, it's hard to say that the two are connected. The error looks a bit surprising to me, in particular "element at /{ components, { accounts", which suggests that the components are not under /software/components but under /components...

apdibbo commented 9 years ago

jrha advised me to troubleshoot this by removing and replacing sections of the standard template to narrow down which section of code was responsible.

jouvin commented 9 years ago

Adding @jrha. I don't have enough context to help here, but normally a validation error gives the path that triggered the error and its value. Here is an example of a validation error that I intentionally created (changing the 'perms' value to a number instead of a string):

     [panc] validation error [/scratch/jouvin/quattor/cdb/cfg/clusters/ipno/umd-3.0/profiles/ipngrid80.in2p3.fr.pan]
     [panc] validation requires type of 'string' but element is of type 'long'
     [panc] element path: '/software/components/filecopy/services/_2fetc_2fcvmfs_2fdefault_2elocal/perms'
     [panc] element value: 644
     [panc] type: 'string' [?:?]
     [panc] type: 'structure_filecopy' [/scratch/jouvin/quattor/cdb/cfg/quattor/14.10.0/components/filecopy/schema.pan:38.27-48.6]
     [panc] type: 'component_filecopy' [/scratch/jouvin/quattor/cdb/cfg/quattor/14.10.0/components/filecopy/schema.pan:51.27-55.1]
     [panc] path '/software/components/filecopy' bound to type component_filecopy in [/scratch/jouvin/quattor/cdb/cfg/quattor/14.10.0/components/filecopy/schema.pan:57.40-57.57]

In your case the error message looks a bit strange, but the path suggests that the problem is with /, meaning the whole profile... I have the feeling that something weird happened in the way you constructed the profile, leading to this. But I don't see what the link could be with the template you are mentioning.

jrha commented 9 years ago

This is the code that is at fault:

    if (is_defined(VO_ATLAS_LOCAL_AREA)) {
        SELF['ATLAS_LOCAL_AREA'] = VO_ATLAS_LOCAL_AREA;
    }

It checks if VO_ATLAS_LOCAL_AREA is defined, but assumes that ATLAS_LOCAL_AREA is a key of SELF.

If ATLAS_LOCAL_AREA is undefined, very-bad-things-happen™, which as far as I can tell works like this:

  1. SELF['ATLAS_LOCAL_AREA'] is not defined
  2. SELF['ATLAS_LOCAL_AREA'] is therefore equivalent to something like SELF[undef]
  3. '/software/components/profile/env' is set to VO_ATLAS_LOCAL_AREA
  4. Validation error is triggered.

jrha commented 9 years ago

Scrap that: if '/software/components/profile/env' is undefined, it triggers this error. Apparently you cannot modify a non-existent nlist/dict. Possibly the compiler could be more helpful.

jrha commented 9 years ago

Adding '/software/components/profile/env' ?= dict(); before this block fixes this problem.
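
For illustration, here is a minimal sketch of that workaround applied to the snippet quoted earlier in this thread. It is a template fragment, not a complete template. The closing `SELF;` and `};` lines are an assumption, because the quoted excerpt is cut off before the end of the block; the real block in features/cvmfs/client.pan may contain further statements.

```pan
# Conditional assignment: give the path an empty dict only if it has
# no value yet, so an existing env dict is left untouched.
'/software/components/profile/env' ?= dict();

'/software/components/profile/env' = {
    # Add the entry only when the site has defined the variable.
    if (is_defined(VO_ATLAS_LOCAL_AREA)) {
        SELF['ATLAS_LOCAL_AREA'] = VO_ATLAS_LOCAL_AREA;
    };
    # Assumed ending (not shown in the excerpt): return the dict,
    # which is now defined even when the if branch is skipped.
    SELF;
};
```

With the default in place, the block should return an empty dict rather than an undefined value when VO_ATLAS_LOCAL_AREA is not set.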

jrha commented 9 years ago

Reproduce with:

#'/software/components/profile/env/' ?= dict(); # Uncomment to fix horrible validation error.

'/software/components/profile/env/' = {
    if (is_defined(NONEXISTENT_VARIABLE)) {
        SELF['UNICORNS'] = NONEXISTENT_VARIABLE;
    };
    SELF;
};