Open ddbox opened 2 years ago
The glidein submitted ran and jobs matched to it, but it went held at the end of the job for reasons I have not yet figured out. I don't think they are tied to the bug fix but would like to hold the ticket open until I figure that out.
Steve, have you resolved the hold problem? Do you still consider this an open issue?
Yes, that problem is resolved; this ticket can be closed.
Email from Steve Timm:
When I previously reported that the NERSC configurations were correct in glideinWMS 3.9.4, there were two minor issues that I missed. It may be that they were once OK and have since regressed.
This is what job.condor looks like under glideinWMS 3.9.4 for a NERSC entry.
There are two problems. First, in the environment line you have $ENV(X509_USER_PROXY_BASENAME:) instead of $ENV(X509_USER_PROXY_BASENAME). Second, for both transfer_Input_files and encrypt_Input_files you have $ENV(X509_USER_PROXY":/dev/null) when it should be $ENV(X509_USER_PROXY:/dev/null).
Once these are fixed manually, the job is submitted correctly.
Please investigate.
[root@fermifactory01 entry_CMSHTPC_T3_US_NERSC_Cori_SL7]# more job.condor
File: job.condor
# +SciTokensFile = "$ENV(SCITOKENS_FILE)"
environment = "JOB_TOKENS='/var/lib/gwms-factory/server-credentials/entry_CMSHTPC_T3_US_NERSC_Cori_SL7/tokens.tgz' X509_USER_PROXY=$ENV(X509_USER_PROXY_BASENAME:)"
Universe = grid
Grid_Resource = batch slurm $ENV(GRID_RESOURCE_OPTIONS) --rgahp-glite ~/bosco_cori_haswell_sl7_htc9/glite $ENV(GLIDEIN_REMOTE_USERNAME)@cori.nersc.gov
Executable = glidein_startup.sh
copy_to_spool = True
Arguments = $ENV(GLIDEIN_ARGUMENTS)
transfer_Input_files = $ENV(IDTOKENS_FILE),/var/lib/gwms-factory/server-credentials/entry_CMSHTPC_T3_US_NERSC_Cori_SL7/tokens.tgz,$ENV(X509_USER_PROXY":/dev/null)
encrypt_Input_files = $ENV(IDTOKENS_FILE),/var/lib/gwms-factory/server-credentials/entry_CMSHTPC_T3_US_NERSC_Cori_SL7/tokens.tgz,$ENV(X509_USER_PROXY":/dev/null)
Transfer_Executable = True
transfer_Output_files =
WhenToTransferOutput = ON_EXIT
stream_output = False
stream_error = False
+NodeNumber = 1
+remote_cerequirements = strcat("GlideinEntryName == \"", GlideinEntryName, "\"")
+GlideinFactory = "$ENV(FACTORY_NAME)"
+GlideinName = "$ENV(GLIDEIN_NAME)"
+GlideinEntryName = "$ENV(GLIDEIN_ENTRY_NAME)"
+GlideinEntrySubmitFile = "$ENV(GLIDEIN_ENTRY_SUBMIT_FILE)"
+GlideinClient = "$ENV(GLIDEIN_CLIENT)"
+GlideinFrontendName = "$ENV(GLIDEIN_FRONTEND_NAME)"
+GlideinCredentialIdentifier = "$ENV(GLIDEIN_CREDENTIAL_ID)"
+GlideinSecurityClass = "$ENV(GLIDEIN_SEC_CLASS)"
+GlideinWebBase = "$ENV(GLIDEIN_WEB_URL)"
+GlideinLogNr = "$ENV(GLIDEIN_LOGNR)"
+GlideinWorkDir = "$ENV(GLIDEIN_STARTUP_DIR)"
+GlideinSlotsLayout = "$ENV(GLIDEIN_SLOTS_LAYOUT)"
+GlideinMaxWalltime = $ENV(GLIDEIN_MAX_WALLTIME)
+fename = "$ENV(GLIDEIN_USER)"
+GlideinProxyURL = "http://frontiercache.nersc.gov:3128/"
periodic_remove = (isUndefined(GlideinSkipIdleRemoval)==True || GlideinSkipIdleRemoval==False) && (JobStatus==1 && isInteger($ENV(GLIDEIN_IDLE_LIFETIME)) && $ENV(GLIDEIN_IDLE_LIFETIME)>0 && (time() - QDate)>$ENV(GLIDEIN_IDLE_LIFETIME)) || (JobStatus==2 && ((time() - EnteredCurrentStatus) > (GlideinMaxWalltime + 12*60*60)))
Notification = Never
+Owner = undefined
Log = /var/log/gwms-factory/client/user_$ENV(GLIDEIN_USER)/glidein_gfactory_instance_fermifactory01/entry_CMSHTPC_T3_US_NERSC_Cori_SL7/condor_activity_$ENV(GLIDEIN_LOGNR)_$ENV(GLIDEIN_CLIENT).log
Output = /var/log/gwms-factory/client/user_$ENV(GLIDEIN_USER)/glidein_gfactory_instance_fermifactory01/entry_CMSHTPC_T3_US_NERSC_Cori_SL7/job.$(Cluster).$(Process).out
Error = /var/log/gwms-factory/client/user_$ENV(GLIDEIN_USER)/glidein_gfactory_instance_fermifactory01/entry_CMSHTPC_T3_US_NERSC_Cori_SL7/job.$(Cluster).$(Process).err
Queue $ENV(GLIDEIN_COUNT)
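The two problems described in the email (a trailing colon with no default value, and a stray double quote inside the macro) can be checked for mechanically in a generated submit file. The following is a minimal sketch, not part of glideinWMS: the function names `find_suspicious_macros` and `fix_macros` are hypothetical, and the script treats a `$ENV(NAME:)` reference with nothing after the colon as an error only because the email reports it as one. It assumes `$ENV(...)` references contain no nested parentheses.

```python
import re

# One $ENV(...) reference; assumes no nested parentheses inside the macro.
ENV_MACRO = re.compile(r"\$ENV\(([^()]*)\)")


def find_suspicious_macros(text):
    """Return (line_number, macro, reason) tuples for malformed-looking macros."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in ENV_MACRO.finditer(line):
            body = match.group(1)
            if '"' in body:
                findings.append((lineno, match.group(0), "quote inside macro"))
            if body.endswith(":"):
                findings.append((lineno, match.group(0), "colon with no default"))
    return findings


def fix_macros(text):
    """Return text with the two reported problems removed from every macro."""
    def _clean(match):
        body = match.group(1).replace('"', "")  # drop a stray quote
        if body.endswith(":"):                  # drop a colon with no default
            body = body[:-1]
        return "$ENV(" + body + ")"
    return ENV_MACRO.sub(_clean, text)


if __name__ == "__main__":
    # Two shortened lines modeled on the listing above (paths omitted).
    sample = (
        'environment = "X509_USER_PROXY=$ENV(X509_USER_PROXY_BASENAME:)"\n'
        'transfer_Input_files = $ENV(IDTOKENS_FILE),$ENV(X509_USER_PROXY":/dev/null)\n'
    )
    for lineno, macro, reason in find_suspicious_macros(sample):
        print(f"line {lineno}: {macro} ({reason})")
    print(fix_macros(sample))
```

Running the check against the generated job.condor before submission would flag the environment, transfer_Input_files, and encrypt_Input_files lines; the fix function only mirrors the manual edit described in the email and does not address whatever generates the malformed macros.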