DUNE / dist-comp

Action items for DUNE distributed computing, and common scripts that are used.

dunegpfrontend01 can't see jobs on Justin-prod-sched01.dune.hep.ac.uk or osgsub02.sdcc.bnl.gov #151

Closed StevenCTimm closed 3 months ago

StevenCTimm commented 3 months ago

Since its IDTOKEN was renewed on 5 March, dunegpfrontend01 has been raising exceptions saying that it cannot successfully query the schedd on either Justin-prod-sched01.dune.hep.ac.uk or osgsub02.sdcc.bnl.gov.

INC000001169564 has been filed in ServiceNow at Fermilab to ask the DUNE global pool team to investigate further.

StevenCTimm commented 3 months ago

This adversely affects the AWT testing features of justIN and also further DUNE production. In the interim, it is possible to get jobs from this schedd to match by submitting large numbers of short jobs from one of the Fermilab schedds. Those jobs leave glideins behind, and the AWT or justIN jobs can then match to them.
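For anyone who needs the interim workaround, here is a minimal sketch of generating the "large numbers of short jobs" as a plain HTCondor submit description. It assumes direct `condor_submit` access to a Fermilab schedd, which this thread does not confirm; the submission tool and any site selection actually used for DUNE may differ. The function name, the `/bin/sleep` payload, the job count, and the output file names are all illustrative choices, not something taken from the production setup.

```python
from textwrap import dedent


def short_job_submit_description(n_jobs: int, sleep_seconds: int = 60) -> str:
    """Return an HTCondor submit description for n_jobs short sleep jobs.

    Hypothetical helper: the job count and sleep length are assumptions,
    chosen only to illustrate "many short jobs" that leave glideins behind.
    """
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    if sleep_seconds < 1:
        raise ValueError("sleep_seconds must be at least 1")

    # Standard submit commands only; no site targeting is attempted here.
    return dedent(f"""\
        universe   = vanilla
        executable = /bin/sleep
        arguments  = {sleep_seconds}
        output     = short_$(Cluster)_$(Process).out
        error      = short_$(Cluster)_$(Process).err
        log        = short_$(Cluster).log
        queue {n_jobs}
        """)


if __name__ == "__main__":
    # Print the description so it can be redirected into a submit file.
    print(short_job_submit_description(500, 60), end="")
```

The output can be saved to a file and passed to `condor_submit` on the schedd in question. Keeping the jobs short is the point: they finish quickly, and the glideins they triggered remain available for the AWT or justIN jobs to match.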

StevenCTimm commented 3 months ago

Glideins are now being submitted again. There was a problem with the permissions on the token on our end that has now been fixed.