Open jmchilton opened 6 years ago
Maybe to add to this: custom site-specific metric plugins loaded from a directory?
We have some extra facets of information that we could collect that wouldn't be generally useful. As a specific example, we store a 'build tag' in /etc/vgcn-release (indicating the version of the VM that the job is running on) that we could collect. It might be useful to have this tagged to the job for debugging purposes.
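To make the idea concrete, here is a rough sketch of what such a site-specific collector might do. It is plain Python and not Galaxy's plugin API: the function names `read_build_tag` and `build_tag_metric` and the `build_tag` key are hypothetical, and it assumes the release file holds the tag as a single line of text.

```python
from pathlib import Path
from typing import Dict, Optional


def read_build_tag(path: str = "/etc/vgcn-release") -> Optional[str]:
    """Return the first non-empty line of the release file, or None if unreadable.

    Assumption: the file stores the build tag as one line of plain text.
    """
    try:
        text = Path(path).read_text(encoding="utf-8")
    except OSError:
        # File missing or unreadable on this node: report nothing.
        return None
    for line in text.splitlines():
        line = line.strip()
        if line:
            return line
    return None


def build_tag_metric(path: str = "/etc/vgcn-release") -> Dict[str, str]:
    """Wrap the tag in a dict; the 'build_tag' key is a hypothetical metric name."""
    tag = read_build_tag(path)
    return {"build_tag": tag} if tag is not None else {}
```

Reading the file directly would not depend on the job's environment variables at all, which is the main reason to sketch it this way.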
@erasche For your specific example, you can probably use the <env /> plugin if you export that information as an environment variable.
@nsoranzo huh, interesting idea. I'll test that! Thanks
Ah, bit by an old 'bug'.
Looks like that's not an option in my case: HTCondor cleans the environment before running a job. It's something we've noticed earlier, and it causes problems for us in other cases as well. For example, HTCondor overrides the TMPDIR settings, which led to us manually patching the upload tool. Please excuse the exasperated commit message.
Example of the environment a Condor job gets by default (though Galaxy does set the option to pass through its environment):
BATCH_SYSTEM=HTCondor
OMP_NUM_THREADS=1
PWD=/data/dnb01/condor-galaxy
SHLVL=1
TEMP=/var/lib/condor/execute/dir_3563
TMP=/var/lib/condor/execute/dir_3563
TMPDIR=/var/lib/condor/execute/dir_3563
_=/usr/bin/env
_CHIRP_DELAYED_UPDATE_PREFIX=Chirp
_CONDOR_ANCESTOR_1303=1890:1539771006:3869518212
_CONDOR_ANCESTOR_1890=3563:1539784048:1775810939
_CONDOR_ANCESTOR_3563=3564:1539784048:1021639213
_CONDOR_CHIRP_CONFIG=/var/lib/condor/execute/dir_3563/.chirp.config
_CONDOR_JOB_AD=/var/lib/condor/execute/dir_3563/.job.ad
_CONDOR_JOB_IWD=/data/dnb01/condor-galaxy
_CONDOR_JOB_PIDS=
_CONDOR_MACHINE_AD=/var/lib/condor/execute/dir_3563/.machine.ad
_CONDOR_SCRATCH_DIR=/var/lib/condor/execute/dir_3563
_CONDOR_SLOT=slot1_1
Versus the environment of the machine if you just log in normally:
HISTCONTROL=ignoredups
HISTSIZE=1000
HOME=/home/centos
HOSTNAME=vgcnbwc-training-beta-0.novalocal
LANG=en_US.UTF-8
LESSOPEN=||/usr/bin/lesspipe.sh %s
LOGNAME=centos
LS_COLORS=
MAIL=/var/spool/mail/centos
PATH=/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/centos/.local/bin:/home/centos/bin
PWD=/home/centos
SELINUX_LEVEL_REQUESTED=
SELINUX_ROLE_REQUESTED=
SELINUX_USE_CURRENT_RANGE=
SHELL=/bin/bash
SHLVL=1
SSH_CLIENT=132.230.68.5 8078 22
SSH_CONNECTION=132.230.68.5 8078 10.5.68.18 22
SSH_TTY=/dev/pts/1
USER=centos
_=/usr/bin/env
VGCN_RELEASE=CentOS 7.5.1804 VGGP vggp-v31-j95-9c1a332fb4d7-master
This environment cleaning broke the hostname plugin for us, so we had to make some strange changes to it.
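For anyone wanting to check their own setup, a small Python sketch like the one below can show which variables are present in a login shell but missing from the job. The helper names `parse_env` and `missing_keys` are made up for illustration, and the sample strings are abbreviated from the two `env` dumps above.

```python
from typing import Dict, List


def parse_env(dump: str) -> Dict[str, str]:
    """Parse `env`-style KEY=VALUE lines into a dict, splitting on the first '='."""
    env: Dict[str, str] = {}
    for line in dump.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that contain no '=' at all
            env[key.strip()] = value
    return env


def missing_keys(login_dump: str, job_dump: str) -> List[str]:
    """Return the sorted variable names set at login but absent in the job."""
    return sorted(set(parse_env(login_dump)) - set(parse_env(job_dump)))


# Abbreviated samples taken from the dumps in this thread.
LOGIN = (
    "HOME=/home/centos\n"
    "PWD=/home/centos\n"
    "SHLVL=1\n"
    "VGCN_RELEASE=CentOS 7.5.1804 VGGP vggp-v31-j95-9c1a332fb4d7-master\n"
)
JOB = (
    "BATCH_SYSTEM=HTCondor\n"
    "PWD=/data/dnb01/condor-galaxy\n"
    "SHLVL=1\n"
    "TMPDIR=/var/lib/condor/execute/dir_3563\n"
)

if __name__ == "__main__":
    print(missing_keys(LOGIN, JOB))
```

Feeding it the full output of `env` from both contexts lists everything a metric plugin could not rely on inside a job.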
Two job metrics Trello cards existed and had good points that never seemed to make it to GitHub.
https://trello.com/c/uAcQYz5I/1606-job-metrics-reports-and-interpretation covered integration of job metrics with the reports app.
https://trello.com/c/XsQdqliU/1607-job-metrics-technical-enhancements
Newer Ideas: