DUNE / dist-comp

Action items for DUNE distributed computing, and common scripts that are used.

US_Clemson Clemson-Palmetto--need OSG ticket to add back the GLIDEIN_DUNESite #97

Open StevenCTimm opened 10 months ago

StevenCTimm commented 10 months ago

We haven't been attempting to send anything to Clemson-Palmetto. It appears that the GLIDEIN_Site name changed when they changed the entry to a Hosted-CE last year.

StevenCTimm commented 10 months ago

I have checked the GLIDEIN_Site names in our submit string. We are requesting the site name "Clemson", but the actual GLIDEIN_Site name is now Clemson-Palmetto, so we are never requesting anything at Clemson-Palmetto. This needs to be adjusted in justIN/AWT and in the site submit string of the jobs submitted via jobsub.
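A minimal sketch of the mismatch described above, assuming the requested site names and the advertised GLIDEIN_Site values are available as plain lists. The function name `unmatched_sites` and the sample data are hypothetical and are not taken from jobsub, justIN, or the factory configuration.

```python
# Hypothetical helper: shows why a request for "Clemson" never matches
# an entry that now advertises GLIDEIN_Site = "Clemson-Palmetto".
def unmatched_sites(requested, advertised):
    """Return requested site names that no entry advertises as GLIDEIN_Site."""
    advertised_set = set(advertised)
    return [name for name in requested if name not in advertised_set]


# Assumed contents of a submit string and of the factory entries.
requested = ["Clemson"]
advertised = ["Clemson-Palmetto"]

print(unmatched_sites(requested, advertised))  # ['Clemson']
```

Matching is by exact string comparison here, which is the assumption that makes the renamed site unreachable until the submit string is updated.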

Andrew-McNab-UK commented 10 months ago

The problem is that they have disabled the XML entries and created an entry in the YAML, but not added a GLIDEIN_DUNESite for the YAML entry. justIN ignores entries without DUNE site names when it is parsing the factory configs. If you look at the list of all sites, justIN says the last time the site was seen in the OSG config was March.
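As a rough illustration of that filtering, assuming each factory entry has already been parsed into a dictionary of attributes: the entry names, the attribute values, and the function `entries_with_dune_site` are all hypothetical and do not reflect justIN's actual code.

```python
# Hypothetical sketch: entries lacking a GLIDEIN_DUNESite attribute are
# skipped, so the site never shows up as recently seen.
def entries_with_dune_site(entries):
    """Keep only entries that carry a non-empty GLIDEIN_DUNESite attribute."""
    return {
        name: attrs
        for name, attrs in entries.items()
        if attrs.get("GLIDEIN_DUNESite")
    }


# Assumed, simplified view of parsed entries (names are illustrative).
entries = {
    "old-xml-entry": {"GLIDEIN_Site": "Clemson", "GLIDEIN_DUNESite": "US_Clemson"},
    "new-yaml-entry": {"GLIDEIN_Site": "Clemson-Palmetto"},
}

kept = entries_with_dune_site(entries)
print(sorted(kept))  # ['old-xml-entry']
```

With the old entry disabled and the new one missing the attribute, nothing for this site would survive such a filter.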

StevenCTimm commented 10 months ago

The above is true, but it doesn't explain why POMS-submitted jobs weren't going there. POMS has to adjust their submit string as well. We can open a ticket to ask them to add the GLIDEIN_DUNESite again.

StevenCTimm commented 5 months ago

Glideins there are now also getting held with a nonexistent-queue error, so we have to file another ticket.

StevenCTimm commented 4 months ago

Some glideins actually got through into Clemson (not from justIN but from jobsub), where a number of jobs went held because a couple of nodes didn't have Singularity namespaces set. Nevertheless, progress. We have to file a ticket to ask them to add GLIDEIN_DUNESite.