ssl-hep / ServiceX

ServiceX - a data delivery service pilot for IRIS-HEP DOMA
BSD 3-Clause "New" or "Revised" License

Accessing ATLAS data from EOS/Tier2 through UChicago AF #507

Open MoAly98 opened 2 years ago

MoAly98 commented 2 years ago

Hello!

I have some files on EOS that are accessible through custom permissions within ATLAS. With help from @oshadura I have managed to access the file with uproot.open(); however, I still cannot access it with uproot ServiceX. My setup is:

from func_adl_servicex import ServiceXSourceUpROOT
eos_file    = "root://eosatlas.cern.ch//eos/atlas/atlascerngroupdisk/phys-higgs/HSG8/tH_v34_minintuples_v0/mc16a_nom/410470_AFII_user.vvecchio.27456668._000001.output.root"
ds = ServiceXSourceUpROOT([eos_file], treename="nominal_Loose")
data = (ds
    .Select("lambda e: {'lep_pt': e.leptons_pt,'lep_eta': e.leptons_eta,}")
    .AsAwkwardArray().value()
)

I end up with

Row: 26; Column: 4
Failed to transform input file root://eosatlas.cern.ch//eos/atlas/atlascerngroupdisk/phys-higgs/HSG8/tH_v34_minintuples_v0/mc16a_nom/410470_AFII_user.vvecchio.27456668._000001.output.root: file not found ([ERROR] Server responded with an error: [3010] Unable to give access - user access restricted - unauthorized identity used ; Permission denied ) 

I am not explicitly running any distributed code, so I believe this should have worked unless there is a permission issue with the proxy being used to access the file.

Are there any further checks I should do to understand/resolve this?
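One check that needs no extra access is to read the bracketed XRootD status code in the error text rather than the leading "file not found" wording. The sketch below pulls that code out of the message quoted above. The small lookup table is an assumption taken from the XRootD protocol error numbering (3010 as kXR_NotAuthorized, 3011 as kXR_NotFound), and the helper names are hypothetical, not part of ServiceX or uproot.

```python
import re

# Assumption: a small subset of XRootD protocol error codes (kXR_* values).
XROOTD_ERRORS = {
    3010: "kXR_NotAuthorized",
    3011: "kXR_NotFound",
}


def xrootd_error_code(message):
    """Return the bracketed four-digit XRootD code in a message, or None."""
    match = re.search(r"\[(3\d{3})\]", message)
    return int(match.group(1)) if match else None


def classify(message):
    """Return (code, symbolic name) for the XRootD code found in a message."""
    code = xrootd_error_code(message)
    return code, XROOTD_ERRORS.get(code, "unknown")


# The error text reported in this issue (file URL omitted for brevity).
message = (
    "file not found ([ERROR] Server responded with an error: [3010] "
    "Unable to give access - user access restricted - "
    "unauthorized identity used ; Permission denied )"
)

code, name = classify(message)
print(code, name)
```

If the code is an authorization code rather than a not-found code, the file was probably located and the identity presented by the transformer was rejected, which points at the proxy rather than at the path.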

gordonwatts commented 1 year ago

Ok - I think I understand the problem. Let me try to rephrase and let me know if I've understood correctly:

Possible Workarounds

What could be done right now without changes to ServiceX?

  1. Make the files accessible to anyone who is a member of the ATLAS collaboration. This should make them accessible to ServiceX.

Modifications to ServiceX

@BenGalewsky can probably point to a specific story.

BenGalewsky commented 1 year ago

One variation on (1): ServiceX uses a captive service account to access ATLAS resources. We could add the owner of that account as a member of the private ATLAS group. Every user of that ServiceX instance would then have access to the files.

Maybe another pattern would be to deploy a private ServiceX at the AF that uses the account of a member of the private group as the service account.

But yes, passing tokens all the way through serviceX is a major (and ultimately necessary) change. See #321 for a skeletal story.

MoAly98 commented 1 year ago

Hi @BenGalewsky and @gordonwatts -- Thanks a lot for your replies :) You've got the story right, thanks a lot for summarising!

I'm sure you know how painful it would be to try to convince conveners to give the entire collaboration full access to a group disk in ATLAS, but I can ask if this is possible. I think it could potentially be easier to ask for access for the service account, so I can suggest both solutions to the group conveners. Am I right to assume the account is associated with Ilija? Can you provide me with an account name that would need access?

bbockelm commented 1 year ago

Hi!

I think we can reasonably request access for a service account on a one-by-one basis. It might be difficult for a personal account, however.

For token-based access, I've put in a request for the ATLAS EOS folks to have this enabled.

Brian

vokac commented 1 year ago

This atlascerngroupdisk is a "non-Grid" (non-Rucio) storage area for local groups, and permissions are usually set to allow reading for all ATLAS users and writing for a specific group. I can read the file mentioned above with just a normal ATLAS user account and X.509 proxy (I'm not a member of the atlas-eos-access-phys-higgs e-group).

It is possible to check the permissions with

[vokac@lxplus.cern.ch]~% eos ls -l /eos/atlas/atlascerngroupdisk/ | grep phys-higgs
drwxr-x--+   1 root     zp       177290217647715 Sep  8 19:17 phys-higgs
[vokac@lxplus.cern.ch]~% eos acl -l /eos/atlas/atlascerngroupdisk/phys-higgs        
egroup:atlas-eos-access-phys-higgs:rwx

(For the full picture it is also necessary to understand identity mapping, e.g. the ATLAS EOS grid-mapfile, but that's basically the same for all ATLAS users.)
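To make the listing easier to read, here is a minimal sketch that splits the mode string and the ACL entry shown above into their parts. It assumes the trailing `+` in the mode string flags an additional ACL, and the function names are hypothetical. Which of owner, group, or other applies to a given user still depends on the identity mapping, which this sketch does not model.

```python
def parse_mode(mode):
    """Split an ls-style mode string such as 'drwxr-x--+' into its parts."""
    return {
        "is_dir": mode[0] == "d",
        "owner": mode[1:4],
        "group": mode[4:7],
        "other": mode[7:10],
        # Assumption: a trailing '+' flags an extra ACL on the entry.
        "has_acl": mode.endswith("+"),
    }


def parse_acl(entry):
    """Split an ACL entry of the form 'kind:name:perms' into a tuple."""
    kind, name, perms = entry.split(":", 2)
    return kind, name, perms


def can_read(bits):
    """True if a three-character permission triple grants read access."""
    return "r" in bits


# Values copied from the eos output above.
mode = parse_mode("drwxr-x--+")
acl = parse_acl("egroup:atlas-eos-access-phys-higgs:rwx")

print("group can read:", can_read(mode["group"]))
print("other can read:", can_read(mode["other"]))
print("ACL entry:", acl)
```

For this directory the group triple grants read access while the other triple does not, and the e-group ACL grants read, write, and execute on top of that.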

Because atlascerngroupdisk is not space managed by Rucio, its directories and files can have arbitrary permissions. Technically we could therefore define e.g. an IAM policy that allows any ATLAS user to get a token with read privileges for /eos/atlas/atlascerngroupdisk/phys-higgs (the storage.read:/atlascerngroupdisk/phys-higgs/ scope), but for writing it would be necessary to synchronize all atlas-eos-access-* groups into the ATLAS IAM or do some fancy path-based token identity mapping on the EOS ATLAS side.

Also, this simple token model would require quite a lot of knowledge on the user side: e.g. a user would be able to get a token with storage.read:/atlascerngroupdisk/phys-higgs/, but not with storage.read:/ or storage.read:/atlascerngroupdisk/more-restricted-group-access/. This may be acceptable for R&D projects, but for production ServiceX may participate in token exchange flows that could hide the complexity of the token content.
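The path-scoped behaviour described above can be illustrated with a short prefix check. This is only a sketch of the idea, assuming a scope of the form `storage.<action>:<path>` covers that path and everything beneath it; it is not how IAM or EOS implement the check. The function name `scope_allows` and the file name `file.root` are hypothetical.

```python
import posixpath


def _components(path):
    """Normalise a path and return its non-empty components."""
    return [part for part in posixpath.normpath(path).split("/") if part]


def scope_allows(scope, action, path):
    """Check whether a scope such as 'storage.read:/a/b/' covers a path."""
    name, sep, scope_path = scope.partition(":")
    if not sep or name != f"storage.{action}":
        return False
    granted = _components(scope_path)
    wanted = _components(path)
    # Compare whole components so '/phys-higgs' does not match '/phys-higgs-x'.
    return wanted[:len(granted)] == granted


scope = "storage.read:/atlascerngroupdisk/phys-higgs/"

print(scope_allows(scope, "read", "/atlascerngroupdisk/phys-higgs/HSG8/file.root"))
print(scope_allows(scope, "read", "/atlascerngroupdisk/more-restricted-group-access/file.root"))
```

Comparing whole path components rather than raw string prefixes avoids a scope for one directory accidentally covering a sibling whose name starts with the same characters.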

vokac commented 1 year ago

Anyway, I think the EOS-5460 issue needs to be resolved first for xroot access with tokens, and meanwhile I would like to discuss with the EOS team the token configuration for storage areas managed by Rucio. We should still start with the EOS testbed, because this instance has not yet been configured to pass the WLCG compliance tests. To be honest, I also have not yet tested IAM scope policies. @bbockelm, has CMS already configured / started using IAM with scope policies for storage.*:$PATH, so that we don't have to worry about unexpected / undocumented security features that come from this IAM configuration?

I mean, it may take some time before we are ready to use tokens for EOS ATLAS, but I would like to have something ready in Q1 2023.