WLCG-AuthZ-WG / common-jwt-profile

A repo for the WLCG Common JWT profile document

Scope of storage capabilities is ambiguous for SEs with tape storage #26

Open slithy opened 1 year ago

slithy commented 1 year ago

Scope of storage.read and storage.stage

As discussed at WLCG DOMA BDT Meeting (April 2023), CHEP (May 2023) and ATLAS S&C week (June 2023):

In the common-jwt-profile document, the scope of the storage.read and storage.stage capabilities should be amended:

Related questions

slithy commented 1 year ago

Pull request to address the first point above: https://github.com/WLCG-AuthZ-WG/common-jwt-profile/pull/27

abh3 commented 1 year ago

Some additional ambiguity. Say you have a system that may trigger a stage when a client attempts a read. However, the client does not have "stage" as a claim. What happens next is ambiguous. If you want a transparent system then read implies stage. However, if you want to prevent clients from staging files simply because they want to read them then you really want them to have a stage claim.
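To make the two behaviours above concrete, here is a minimal sketch. It is not taken from any storage implementation: the `read_implies_stage` switch, the `handle_read` function and the `Outcome` names are hypothetical, and the token's capabilities for the requested path are modelled as a plain set of strings. It assumes, as the profile is understood in this thread, that storage.stage also grants read.

```python
from enum import Enum


class Outcome(Enum):
    SERVE = "serve from disk"
    STAGE_THEN_SERVE = "recall from tape, then serve"
    DENY = "deny"


def handle_read(capabilities: set, on_disk: bool, read_implies_stage: bool) -> Outcome:
    """Decide what a read request does (hypothetical policy sketch).

    capabilities       -- capability names the token grants for this path,
                          e.g. {"storage.read"} or {"storage.stage"}
    on_disk            -- True if an online (disk) replica already exists
    read_implies_stage -- the policy choice under discussion: may a plain
                          read trigger a recall from tape?
    """
    # Assumption: storage.stage is treated as also granting read.
    can_read = bool(capabilities & {"storage.read", "storage.stage"})
    if not can_read:
        return Outcome.DENY
    if on_disk:
        return Outcome.SERVE
    # The file is only on tape: this is where the ambiguity bites.
    if "storage.stage" in capabilities or read_implies_stage:
        return Outcome.STAGE_THEN_SERVE
    return Outcome.DENY
```

With `read_implies_stage=True` the system is transparent: a token carrying only storage.read triggers a recall. With `read_implies_stage=False` the same request is denied unless the token also carries storage.stage.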

The same applies to stage followed by abort and evict. It would seem reasonable that if the client staged the file, the client should also be able to abort the stage as well as evict the file. However, that is not clear when you consider the transparency point raised above.

Pin and unpin should certainly be separate from "stage", as they represent additional resource usage. However, as above, if a client has pin privileges, does unpin apply only to files the client pinned, or to all files?

I am in favor that a site can implement restrictions that are more severe than ones in a token and the site's policy should override the token's claims. Not doing so essentially says a site has surrendered complete control to the token issuer. I doubt many sites would accept that.
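One way to read this position is that the site's policy acts as a further restriction on top of the token: a request succeeds only if both the token and the site allow it. The sketch below illustrates that reading only. The `SitePolicy` class, the operation names and the example paths are hypothetical and do not come from the profile or from any storage system.

```python
from dataclasses import dataclass, field


def _under(path: str, prefix: str) -> bool:
    """True if path equals prefix or lies below it (whole path components)."""
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


@dataclass
class SitePolicy:
    # Hypothetical site-local deny rules: operation -> protected path prefixes.
    denied: dict = field(default_factory=dict)

    def allows(self, operation: str, path: str) -> bool:
        return not any(_under(path, p) for p in self.denied.get(operation, []))


def authorise(token_allows: bool, policy: SitePolicy, operation: str, path: str) -> bool:
    """The site can only narrow what the token grants, never widen it."""
    return token_allows and policy.allows(operation, path)


# Example: the site refuses deletion below /vo/raw, whatever the token says.
policy = SitePolicy(denied={"delete": ["/vo/raw"]})
print(authorise(True, policy, "delete", "/vo/raw/run1/file"))   # False
print(authorise(True, policy, "delete", "/vo/scratch/file"))    # True
```

Because the two checks are combined with a logical AND, the site never grants anything the token does not, which keeps the token issuer's restrictions intact as well.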

As for bulk requests, I've seen them implemented in three ways: the two that you mention, and a third in which the request fails on the first failure encountered, even when subsequent requests would succeed (i.e. partial failure). The reasoning is that recovery is much easier in the third scenario.
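The third approach can be sketched as follows. This is an illustration under assumptions, not a description of any existing bulk API: `bulk_fail_fast`, `BulkResult` and the per-file callback are hypothetical names, and an authorisation failure is modelled as a `PermissionError`. The other two approaches are not shown because they are not spelled out in this comment.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class BulkResult:
    completed: List[str]        # paths processed successfully, in order
    failed_at: Optional[int]    # index of the first failing path, or None
    error: Optional[str]        # message from the first failure, or None


def bulk_fail_fast(paths: Sequence[str], stage_one: Callable[[str], None]) -> BulkResult:
    """Process paths in order and stop at the first failure (partial failure)."""
    completed: List[str] = []
    for index, path in enumerate(paths):
        try:
            stage_one(path)
        except PermissionError as exc:
            # Nothing after this index has been attempted.
            return BulkResult(completed, failed_at=index, error=str(exc))
        completed.append(path)
    return BulkResult(completed, failed_at=None, error=None)
```

Recovery is simple because the client knows exactly where processing stopped: everything before `failed_at` succeeded, and nothing from `failed_at` onwards was attempted, so a retry can resubmit `paths[result.failed_at:]` once the problem is fixed.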

paulmillar commented 1 year ago

> I also understand that dCache has implemented stage and read as separate capabilities (correct me if I am wrong).

dCache currently has only partial support for storage.stage. It treats storage.stage as a synonym for storage.read (as per the spec), but storage.stage does not itself authorise staging of the file. Instead, the existing stage authorisation processes are enforced.
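The behaviour described here can be summarised in a few lines. This is not dCache code: the function names and the `legacy_stage_allowed` flag, which stands in for whatever the existing stage authorisation decides, are hypothetical.

```python
def may_read(capabilities: frozenset) -> bool:
    """storage.stage is accepted wherever storage.read would be."""
    return "storage.read" in capabilities or "storage.stage" in capabilities


def may_stage(capabilities: frozenset, legacy_stage_allowed: bool) -> bool:
    """The token is not consulted for staging in this sketch.

    Only the pre-existing, non-token stage authorisation counts, so a
    storage.stage capability neither grants nor blocks a recall.
    """
    return legacy_stage_allowed
```

Under this model a token carrying only storage.stage can read a file that is already online, but whether it can trigger a recall from tape depends entirely on the site's existing stage authorisation.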

paulmillar commented 1 year ago

> At the DOMA meeting there was some discussion about whether claims in a token can override ACLs in the namespace. In particular, if a directory is set to prohibit deletion, should a token be able to override this? SE managers seem uneasy with this idea.

I find this comment rather strange.

As I understand it, the point of explicit AuthZ is to delegate AuthZ decisions (for some subtree within the namespace) to the VO. If the token says the bearer is authorised to delete a particular file then the storage system should honour that statement and delete the file when so requested.
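As a rough illustration of what "the token says the bearer is authorised" means in practice, the sketch below checks a request against the space-separated `scope` claim, where each storage scope has the form `capability:/path` and covers that path and everything beneath it. The helper names are hypothetical. Mapping deletion to storage.modify is an assumption about how the profile is read, and scope entries without a path are simply ignored here.

```python
import posixpath


def parse_storage_scopes(scope_claim: str) -> list:
    """Return (capability, path) pairs for storage.* entries that carry a path."""
    pairs = []
    for item in scope_claim.split():
        name, sep, path = item.partition(":")
        if name.startswith("storage.") and sep and path.startswith("/"):
            pairs.append((name, path))
    return pairs


def token_authorises(scope_claim: str, capability: str, request_path: str) -> bool:
    """True if some scope grants `capability` on request_path or a parent of it."""
    # Normalise first so that "/data/run2/../raw" cannot slip under "/data/run2".
    target = posixpath.normpath(request_path)
    for name, scope_path in parse_storage_scopes(scope_claim):
        base = scope_path.rstrip("/")
        if name == capability and (target == base or target.startswith(base + "/")):
            return True
    return False


claim = "openid storage.read:/ storage.modify:/data/run2"
# Assumption: a delete request is checked against storage.modify.
print(token_authorises(claim, "storage.modify", "/data/run2/file.root"))   # True
print(token_authorises(claim, "storage.modify", "/data/run20/file.root"))  # False
```

Matching is done on whole path components, so a scope for `/data/run2` does not accidentally cover `/data/run20`.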

Having the storage system overriding the VO's AuthZ decision dilutes the benefits from adopting explicit AuthZ.

beer4duke commented 6 months ago

> Having the storage system overriding the VO's AuthZ decision dilutes the benefits from adopting explicit AuthZ.

Experiments define some SLAs directly with each storage endpoint, for example: never allow RAW data deletion at T0.

As the storage endpoints are ultimately responsible for the integrity of hosted data, having the VO's AuthZ decision override storage endpoint SLAs removes the storage endpoint's responsibility for all its data.

But in the case of a deletion incident, we all know that the storage endpoint will be blamed as usual and will have to spend expensive operations time restoring as much as it can.

I would not call this a dilution of the decision, but a mutually beneficial safety net.

paulmillar commented 6 months ago

I think this is an important point, and something that should be clarified and stated explicitly.

A specific example scenario would be:

If a site is under some kind of commitment (SLA/MoU) to never delete certain data and a request comes in to delete said data, with a token (from the VO) that authorises that operation, what is the correct behaviour of the storage?

I think this generalises naturally to a broader question: are sites under any kind of MoU or SLA that could be in conflict with that site supporting tokens with explicit AuthZ statements?