glideinWMS / glideinwms

The glideinWMS Project
http://tinyurl.com/glideinwms
Apache License 2.0

New minor issue in job.condor config, glideinwms 3.9.4 #89

Open ddbox opened 2 years ago

ddbox commented 2 years ago

Email from Steve Timm:

When I previously reported that the NERSC configurations were correct in glideinWMS 3.9.4 there were two minor issues that I missed. In fact it may be that they were once ok and there is now a reversion.

This is what job.condor looks like under glideinWMS 3.9.4 for a NERSC entry.

There are 2 problems. First, the environment line has $ENV(X509_USER_PROXY_BASENAME:) instead of $ENV(X509_USER_PROXY_BASENAME) (line 5 in the output below). Second, both transfer_Input_files and encrypt_Input_files have $ENV(X509_USER_PROXY":/dev/null) when it should be $ENV(X509_USER_PROXY:/dev/null).

Once these are manually fixed then the job is submitted correctly.
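
A check like the following could catch both typos before a submit file is used. This is only a minimal sketch: the function name `find_suspect_env_macros` and the sample text are hypothetical, and the validity rule (a plain variable name, optionally followed by `:` and a non-empty default) is an assumption drawn from the two corrections described above, not from HTCondor's own parser.

```python
import re

# Matches "$ENV(" ... ")" and captures everything between the parentheses.
ENV_MACRO = re.compile(r"\$ENV\(([^)]*)\)")

# Assumed rule for a well-formed body: NAME or NAME:default, where the
# default is non-empty. This mirrors the fixes in the report above; it is
# not HTCondor's actual grammar.
VALID_BODY = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(:.+)?")


def find_suspect_env_macros(text):
    """Return (line_number, macro) pairs for $ENV() macros that look malformed."""
    suspects = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in ENV_MACRO.finditer(line):
            if not VALID_BODY.fullmatch(match.group(1)):
                suspects.append((lineno, match.group(0)))
    return suspects


# Hypothetical three-line excerpt reproducing the two reported typos.
SAMPLE = "\n".join([
    'environment = "X509_USER_PROXY=$ENV(X509_USER_PROXY_BASENAME:)"',
    'transfer_Input_files = $ENV(IDTOKENS_FILE),$ENV(X509_USER_PROXY":/dev/null)',
    'Arguments = $ENV(GLIDEIN_ARGUMENTS)',
])

for lineno, macro in find_suspect_env_macros(SAMPLE):
    print(f"line {lineno}: suspicious macro {macro}")
```

On the sample, the sketch flags the trailing colon and the stray double quote, and leaves `$ENV(IDTOKENS_FILE)` and `$ENV(GLIDEIN_ARGUMENTS)` alone. The corrected form `$ENV(X509_USER_PROXY:/dev/null)` would also pass.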

Please investigate.

[root@fermifactory01 entry_CMSHTPC_T3_US_NERSC_Cori_SL7]# more job.condor

File: job.condor

# +SciTokensFile = "$ENV(SCITOKENS_FILE)"
environment = "JOB_TOKENS='/var/lib/gwms-factory/server-credentials/entry_CMSHTPC_T3_US_NERSC_Cori_SL7/tokens.tgz' X509_USER_PROXY=$ENV(X509_USER_PROXY_BASENAME:)"
Universe = grid
Grid_Resource = batch slurm $ENV(GRID_RESOURCE_OPTIONS) --rgahp-glite ~/bosco_cori_haswell_sl7_htc9/glite $ENV(GLIDEIN_REMOTE_USERNAME)@cori.nersc.gov
Executable = glidein_startup.sh
copy_to_spool = True
Arguments = $ENV(GLIDEIN_ARGUMENTS)
transfer_Input_files = $ENV(IDTOKENS_FILE),/var/lib/gwms-factory/server-credentials/entry_CMSHTPC_T3_US_NERSC_Cori_SL7/tokens.tgz,$ENV(X509_USER_PROXY":/dev/null)
encrypt_Input_files = $ENV(IDTOKENS_FILE),/var/lib/gwms-factory/server-credentials/entry_CMSHTPC_T3_US_NERSC_Cori_SL7/tokens.tgz,$ENV(X509_USER_PROXY":/dev/null)
Transfer_Executable = True
transfer_Output_files =
WhenToTransferOutput = ON_EXIT
stream_output = False
stream_error = False
+NodeNumber = 1
+remote_cerequirements = strcat("GlideinEntryName == \"", GlideinEntryName, "\"")
+GlideinFactory = "$ENV(FACTORY_NAME)"
+GlideinName = "$ENV(GLIDEIN_NAME)"
+GlideinEntryName = "$ENV(GLIDEIN_ENTRY_NAME)"
+GlideinEntrySubmitFile = "$ENV(GLIDEIN_ENTRY_SUBMIT_FILE)"
+GlideinClient = "$ENV(GLIDEIN_CLIENT)"
+GlideinFrontendName = "$ENV(GLIDEIN_FRONTEND_NAME)"
+GlideinCredentialIdentifier = "$ENV(GLIDEIN_CREDENTIAL_ID)"
+GlideinSecurityClass = "$ENV(GLIDEIN_SEC_CLASS)"
+GlideinWebBase = "$ENV(GLIDEIN_WEB_URL)"
+GlideinLogNr = "$ENV(GLIDEIN_LOGNR)"
+GlideinWorkDir = "$ENV(GLIDEIN_STARTUP_DIR)"
+GlideinSlotsLayout = "$ENV(GLIDEIN_SLOTS_LAYOUT)"
+GlideinMaxWalltime = $ENV(GLIDEIN_MAX_WALLTIME)
+fename = "$ENV(GLIDEIN_USER)"
+GlideinProxyURL = "http://frontiercache.nersc.gov:3128/"
periodic_remove = (isUndefined(GlideinSkipIdleRemoval)==True || GlideinSkipIdleRemoval==False) && (JobStatus==1 && isInteger($ENV(GLIDEIN_IDLE_LIFETIME)) && $ENV(GLIDEIN_IDLE_LIFETIME)>0 && (time() - QDate)>$ENV(GLIDEIN_IDLE_LIFETIME)) || (JobStatus==2 && ((time() - EnteredCurrentStatus) > (GlideinMaxWalltime + 12*60*60)))
Notification = Never
+Owner = undefined
Log = /var/log/gwms-factory/client/user_$ENV(GLIDEIN_USER)/glidein_gfactory_instance_fermifactory01/entry_CMSHTPC_T3_US_NERSC_Cori_SL7/condor_activity_$ENV(GLIDEIN_LOGNR)_$ENV(GLIDEIN_CLIENT).log
Output = /var/log/gwms-factory/client/user_$ENV(GLIDEIN_USER)/glidein_gfactory_instance_fermifactory01/entry_CMSHTPC_T3_US_NERSC_Cori_SL7/job.$(Cluster).$(Process).out
Error = /var/log/gwms-factory/client/user_$ENV(GLIDEIN_USER)/glidein_gfactory_instance_fermifactory01/entry_CMSHTPC_T3_US_NERSC_Cori_SL7/job.$(Cluster).$(Process).err
Queue $ENV(GLIDEIN_COUNT)

StevenCTimm commented 2 years ago

The glidein submitted ran and jobs matched to it, but it went held at the end of the job for reasons I have not yet figured out. I don't think they are tied to the bug fix but would like to hold the ticket open until I figure that out.

ddbox commented 2 years ago

Steve, have you resolved the hold problem? Do you still consider this an open issue?

StevenCTimm commented 2 years ago

Yes, that problem is resolved; this ticket can be closed.