DUNE / dist-comp

Action items for DUNE distributed computing, and common scripts that are used.

Fix GLIDEIN_DUNESite entries for CMST1 and FermiGrid #35

Closed StevenCTimm closed 1 year ago

StevenCTimm commented 1 year ago

GLIDEIN_DUNESite for the CMS Tier 1 should be US_FNAL (currently undefined). GLIDEIN_DUNESite for FermiGrid should be US_FNAL_FermiGrid (currently US_FNAL).

Andrew-McNab-UK commented 1 year ago

I think it is a bad idea to have a site called just US_FNAL when there is another site called US_FNAL-xyz. People will naturally assume when they see US_FNAL that it covers the whole site. For example, you can create GGUS tickets for the Tier1 ("US_FNAL") but they're not the correct way of reporting problems about the FermiGrid site, yes?

kherner commented 1 year ago

See https://support.opensciencegrid.org/a/tickets/71942

Andrew-McNab-UK commented 1 year ago

Thanks: OSG have updated their config in GitHub and this has been picked up by justIN and is now being used for the site tests.

However, where FermiGrid jobs are still run directly rather than through the CE, GLIDEIN_DUNESite is still US_FermiGrid. Presumably this is set "by hand" somewhere in the FermiGrid HTCondor config? The justIN generic jobs have an override for that and use US_FNAL-FermiGrid instead, so justIN is OK. For the DUNE global pool, I think all the jobs will run in glideins, so this mismatch will go away?
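
For readers unfamiliar with this kind of override, here is a minimal sketch of the idea in Python. It is not justIN's actual code: the names `SITE_NAME_OVERRIDES` and `normalise_dune_site` are hypothetical, and the only mapping taken from this thread is US_FermiGrid to US_FNAL-FermiGrid.

```python
# Hypothetical override table. Only the US_FermiGrid entry comes from this
# thread; any other entries would be assumptions.
SITE_NAME_OVERRIDES = {
    "US_FermiGrid": "US_FNAL-FermiGrid",
}


def normalise_dune_site(glidein_dune_site):
    """Return the site name to use for a reported GLIDEIN_DUNESite value.

    Names without an override are passed through unchanged; an empty or
    missing value is reported as None so the caller can handle it.
    """
    if not glidein_dune_site:
        return None
    return SITE_NAME_OVERRIDES.get(glidein_dune_site, glidein_dune_site)


if __name__ == "__main__":
    for reported in ("US_FermiGrid", "US_FNAL-FermiGrid", ""):
        print(repr(reported), "->", repr(normalise_dune_site(reported)))
```

An override like this only hides the mismatch on the consumer side; the value set on the FermiGrid side still needs changing at the source.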

StevenCTimm commented 1 year ago

We can fix the mocked-up GLIDEIN_DUNESite value in jobs that run directly. Let me know what it should be.

Steve



Andrew-McNab-UK commented 1 year ago

Thanks: it should be US_FNAL-FermiGrid

StevenCTimm commented 1 year ago

RITM1633650 filed with my group (High Throughput Computing) to get this done.

StevenCTimm commented 1 year ago

This has now been done, closing.