dCache / dcache

dCache - a system for storing and retrieving huge amounts of data, distributed among a large number of heterogenous server nodes, under a single virtual filesystem tree with a variety of standard access methods
https://dcache.org

gPlazma: failed to create plugin "oidc": Duplicate key , when gplazma.oidc.provider has the same URL with different prefix #7540

Closed elenamplanas closed 2 months ago

elenamplanas commented 3 months ago

We have an instance with multiple VOs, running dCache 9.2.11

We've configured token authentication, defining the providers for each VO without problems, but now that we want to configure the ATLAS T2 provider we have hit a problem.

The gplazma.oidc.provider for the ATLAS T2 (IFAE) has the same URL as the one for ATLAS, but specifies a different prefix.

The parameters defined for T1 are:

gplazma.oidc.provider!atlas=https://atlas-auth.cern.ch/ -profile=wlcg -prefix=/pnfs/pic.es/data/atlas -authz-id="uid:42001 gid:50045 username:atprd001"
gplazma.oidc.provider!atlas-old=https://atlas-auth.web.cern.ch/ -profile=wlcg -prefix=/pnfs/pic.es/data/atlas -authz-id="uid:42001 gid:50045 username:atprd001"

And the new ones for the T2 are:

gplazma.oidc.provider!ifae-old=https://atlas-auth.web.cern.ch/ -profile=wlcg -prefix=/pnfs/pic.es/IFAEAtlasTier2 -authz-id="uid:42001 gid:50045 username:atprd001"
gplazma.oidc.provider!ifae=https://atlas-auth.cern.ch/ -profile=wlcg -prefix=/pnfs/pic.es/IFAEAtlasTier2 -authz-id="uid:42001 gid:50045 username:atprd001"

The restart of gPlazma fails with the following messages:

25 Mar 2024 10:27:31 (gPlazma) [] Audience ("aud") checking is suppressed for OP cms-old.  This makes dCache compatible with behaviour before version 8.2.0, but it also violates RFC "MUST" requirements and may have security implications.
25 Mar 2024 10:27:31 (gPlazma) [] Audience ("aud") checking is suppressed for OP cms.  This makes dCache compatible with behaviour before version 8.2.0, but it also violates RFC "MUST" requirements and may have security implications.
25 Mar 2024 10:27:31 (gPlazma) [] failed to create plugin "oidc": Duplicate key https://atlas-auth.web.cern.ch/ (attempted merging values ifae-old[https://atlas-auth.web.cern.ch/] and atlas-old[https://atlas-auth.web.cern.ch/])
25 Mar 2024 10:27:33 (gPlazma) [door:Xrootd-door05@xrootd-door05Domain:AAYUeMiAL7g Xrootd-door05 Login] Login attempt failed: internal gPlazma error: failed to create plugin "oidc": Duplicate key https://atlas-auth.web.cern.ch/ (attempted merging values ifae-old[https://atlas-auth.web.cern.ch/] and atlas-old[https://atlas-auth.web.cern.ch/])
25 Mar 2024 10:27:34 (gPlazma) [door:Xrootd-door01@xrootd-door01Domain:AAYUeMiWkjA Xrootd-door01 Login] Login attempt failed: internal gPlazma error: failed to create plugin "oidc": Duplicate key https://atlas-auth.web.cern.ch/ (attempted merging values ifae-old[https://atlas-auth.web.cern.ch/] and atlas-old[https://atlas-auth.web.cern.ch/])
25 Mar 2024 10:27:35 (gPlazma) [Frontend-dccore15 Login] Login attempt failed: internal gPlazma error: failed to create plugin "oidc": Duplicate key https://atlas-auth.web.cern.ch/ (attempted merging values ifae-old[https://atlas-auth.web.cern.ch/] and atlas-old[https://atlas-auth.web.cern.ch/])
25 Mar 2024 10:27:36 (gPlazma) [door:Xrootd-door05@xrootd-door05Domain:AAYUeMiv3sA Xrootd-door05 Login] Login attempt failed: internal gPlazma error: failed to create plugin "oidc": Duplicate key https://atlas-auth.web.cern.ch/ (attempted merging values ifae-old[https://atlas-auth.web.cern.ch/] and atlas-old[https://atlas-auth.web.cern.ch/])
25 Mar 2024 10:27:37 (gPlazma) [WebDAV-ATLAST2-door05 Login] Login attempt failed: internal gPlazma error: failed to create plugin "oidc": Duplicate key https://atlas-auth.web.cern.ch/ (attempted merging values ifae-old[https://atlas-auth.web.cern.ch/] and atlas-old[https://atlas-auth.web.cern.ch/])
25 Mar 2024 10:27:38 (gPlazma) [door:Xrootd-door05@xrootd-door05Domain:AAYUeMjTrrg Xrootd-door05 Login] Login attempt failed: internal gPlazma error: failed to create plugin "oidc": Duplicate key https://atlas-auth.web.cern.ch/ (attempted merging values ifae-old[https://atlas-auth.web.cern.ch/] and atlas-old[https://atlas-auth.web.cern.ch/])
25 Mar 2024 10:27:38 (gPlazma) [door:Xrootd-door05@xrootd-door05Domain:AAYUeMjMsag Xrootd-door05 Login] Login attempt failed: internal gPlazma error: failed to create plugin "oidc": Duplicate key https://atlas-auth.web.cern.ch/ (attempted merging values ifae-old[https://atlas-auth.web.cern.ch/] and atlas-old[https://atlas-auth.web.cern.ch/])
25 Mar 2024 10:27:40 (gPlazma) [SRM-door03 Login] Login attempt failed: internal gPlazma error: failed to create plugin "oidc": Duplicate key https://atlas-auth.web.cern.ch/ (attempted merging values ifae-old[https://atlas-auth.web.cern.ch/] and atlas-old[https://atlas-auth.web.cern.ch/])
25 Mar 2024 10:27:46 (gPlazma) [door:Xrootd-door01@xrootd-door01Domain:AAYUeMlJkmg Xrootd-door01 Login] Login attempt failed: internal gPlazma error: failed to create plugin "oidc": Duplicate key https://atlas-auth.web.cern.ch/ (attempted merging values ifae-old[https://atlas-auth.web.cern.ch/] and atlas-old[https://atlas-auth.web.cern.ch/])
25 Mar 2024 10:27:46 (gPlazma) [WebDAV-ATLAST1-door02 Login] Login attempt failed: internal gPlazma error: failed to create plugin "oidc": Duplicate key https://atlas-auth.web.cern.ch/ (attempted merging values ifae-old[https://atlas-auth.web.cern.ch/] and atlas-old[https://atlas-auth.web.cern.ch/])
25 Mar 2024 10:27:46 (gPlazma) [door:GFTP-door03-AAYUeMlV8mA@gridftp-door03Domain GFTP-door03-AAYUeMlV8mA Login] Login attempt failed: internal gPlazma error: failed to create plugin "oidc": Duplicate key https://atlas-auth.web.cern.ch/ (attempted merging values ifae-old[https://atlas-auth.web.cern.ch/] and atlas-old[https://atlas-auth.web.cern.ch/])
25 Mar 2024 10:27:47 (gPlazma) [WebDAV-ATLAST1-door02 Login] Login attempt failed: internal gPlazma error: failed to create plugin "oidc": Duplicate key https://atlas-auth.web.cern.ch/ (attempted merging values ifae-old[https://atlas-auth.web.cern.ch/] and atlas-old[https://atlas-auth.web.cern.ch/])
25 Mar 2024 10:27:48 (gPlazma) [door:Xrootd-door05@xrootd-door05Domain:AAYUeMlj7IA Xrootd-door05 Login] Login attempt failed: internal gPlazma error: failed to create plugin "oidc": Duplicate key https://atlas-auth.web.cern.ch/ (attempted merging values ifae-old[https://atlas-auth.web.cern.ch/] and atlas-old[https://atlas-auth.web.cern.ch/])
25 Mar 2024 10:27:48 (gPlazma) [door:Xrootd-door06@xrootd-cmst1-local-door06Domain:AAYUeMlrBOg Xrootd-door06 Login] Login attempt failed: internal gPlazma error: failed to create plugin "oidc": Duplicate key https://atlas-auth.web.cern.ch/ (attempted merging values ifae-old[https://atlas-auth.web.cern.ch/] and atlas-old[https://atlas-auth.web.cern.ch/])
25 Mar 2024 10:27:50 (gPlazma) [door:Xrootd-door05@xrootd-door05Domain:AAYUeMmDpZg Xrootd-door05 Login] Login attempt failed: internal gPlazma error: failed to create plugin "oidc": Duplicate key https://atlas-auth.web.cern.ch/ (attempted merging values ifae-old[https://atlas-auth.web.cern.ch/] and atlas-old[https://atlas-auth.web.cern.ch/])
25 Mar 2024 10:27:50 (gPlazma) [door:Xrootd-door01@xrootd-door01Domain:AAYUeMmGMfA Xrootd-door01 Login] Login attempt failed: internal gPlazma error: failed to create plugin "oidc": Duplicate key https://atlas-auth.web.cern.ch/ (attempted merging values ifae-old[https://atlas-auth.web.cern.ch/] and atlas-old[https://atlas-auth.web.cern.ch/])
[...]

Is there a way to define different "providers" with the same URL?

Thanks! Elena

elenamplanas commented 3 months ago

There is an open GGUS ticket requesting token configuration for the Spanish ATLAS T2:

https://ggus.eu/index.php?mode=ticket_info&ticket_id=165953

paulmillar commented 3 months ago

Hi Elena,

An aside

First, each of the two examples you quote seems to define two providers. Here's an example that defines the two providers atlas and atlas-old:

gplazma.oidc.provider!atlas=https://atlas-auth.cern.ch/ -profile=wlcg [...]
gplazma.oidc.provider!atlas-old=https://atlas-auth.web.cern.ch/ -profile=wlcg [...]

These two definitions seem to be identical, apart from the choice of provider name (atlas vs atlas-old). I'm not 100% sure how this is working: perhaps one of these is commented out, but the comments didn't make it into the issue?

If not, then this must only be working because the two definitions are identical (or nearly identical). dCache must be selecting one of these two definitions. You're getting a "randomly" chosen one, which would actually be deterministic, but the rules are likely complicated and could change when updating your Java or dCache version.

In any case, I'd say this is a rather bad idea.

Restating the problem

I think you're interested in having two providers with the same URL. Here's an example of how I believe you would like to configure dCache:

gplazma.oidc.provider!atlas=https://atlas-auth.cern.ch/ -profile=wlcg -prefix=/pnfs/pic.es/data/atlas [...]
gplazma.oidc.provider!ifae=https://atlas-auth.cern.ch/ -profile=wlcg -prefix=/pnfs/pic.es/IFAEAtlasTier2 [...]

Unfortunately, this cannot currently work with dCache, as dCache's prefix system works on a per-VO basis and there is only one VO here (ATLAS).

Let me explain.

Background

In the WLCG JWT Profile specification, there is the assumption/requirement that an issuer supports exactly one VO. So a token issued by https://atlas-auth.cern.ch/ is guaranteed to come from a single VO: ATLAS.

Therefore, a dCache provider (as a single gplazma.oidc.provider!XXX configuration property) has exactly one VO it supports.

From a token perspective, the iss (== issuer) claim identifies which issuer created the token. For this ATLAS server, the iss claim would be https://atlas-auth.cern.ch/, matching the issuer in the dCache provider definition.

The scope claim is another standard field that the token must have. It's where the explicit authorisation statements are stored. These look like storage.read:/dir1, where storage.read means the statement authorises reading of data and /dir1 is the path the AuthZ statement targets. So, storage.read:/dir1 is a statement from the issuer (==ATLAS) saying the user is allowed to read files in the /dir1 directory.
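
To make the structure of these statements concrete, here is a minimal Python sketch that splits a scope claim into (operation, path) pairs. This is only an illustration, not dCache's actual parser: the names AuthzStatement and parse_storage_scopes are made up for this example, and skipping storage scopes that lack an explicit path is a choice made by the sketch.

```python
from typing import NamedTuple


class AuthzStatement(NamedTuple):
    operation: str  # e.g. "storage.read"
    path: str       # e.g. "/dir1"


def parse_storage_scopes(scope: str) -> list[AuthzStatement]:
    """Extract storage.* statements from a space-separated scope claim."""
    statements = []
    for item in scope.split():
        if not item.startswith("storage."):
            continue  # other scopes (e.g. "openid") carry no path-based AuthZ
        operation, sep, path = item.partition(":")
        if not sep or not path:
            continue  # sketch's choice: ignore storage scopes without a path
        statements.append(AuthzStatement(operation, path))
    return statements


for statement in parse_storage_scopes("openid storage.read:/dir1 storage.modify:/"):
    print(statement.operation, statement.path)
```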

Enabling the wlcg profile allows the issuer to have complete control of what files may be read, written or deleted. The dCache namespace permissions (POSIX permissions, ACLs) are ignored when processing requests with an explicit AuthZ statement: if dCache receives a request to delete all the data along with a token that says the bearer is allowed to delete all data, then dCache will honour that request (no matter what is in the namespace permissions) and will delete all data.

Naturally, this is very powerful.

Particularly for multi-VO instances, dCache maps the authorisation path to a dCache path.

A token's scope claim might include storage.modify:/ (allowing deleting of all data), but dCache will limit this to a particular subtree; for example, tokens issued by ATLAS might be limited to a directory /pnfs/example.org/atlas. A client with an ATLAS token (with storage.modify:/ in the scope) could still delete all ATLAS data, but they would not be able to access (let alone, delete) any CMS data, because CMS data is stored in the /pnfs/example.org/cms directory.

Currently, dCache has a rather simple model for mapping VOs (as defined by the issuer URL) to the dCache namespace: it resolves the AuthZ path as if it was running within a chroot directory, as defined by the prefix parameter.

So, a token with scope claim storage.modify:/dir1 with gPlazma configured with prefix /pnfs/example.org/atlas would correspond to an authorisation to make arbitrary modifications in the dCache directory /pnfs/example.org/atlas/dir1.
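
As a rough illustration of this chroot-like resolution, the following Python sketch maps an AuthZ path onto a dCache path under a given prefix. It is not taken from the dCache code base; the function name resolve_authz_path is hypothetical, and the handling of ".." components is simply what a chroot-style resolution would be expected to do.

```python
import posixpath


def resolve_authz_path(prefix: str, authz_path: str) -> str:
    """Resolve an AuthZ path as if `prefix` were the root directory."""
    # Normalise against "/" first, so ".." can never climb above the root.
    normalised = posixpath.normpath("/" + authz_path)
    relative = normalised.lstrip("/")
    # An AuthZ path of "/" targets the prefix directory itself.
    return posixpath.join(prefix, relative) if relative else prefix


print(resolve_authz_path("/pnfs/example.org/atlas", "/dir1"))
```

With this model, a path such as /../cms still resolves to a location underneath the ATLAS prefix, which is what keeps an ATLAS token away from CMS data.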

Now, back to your actual problem: adding Tier-2 support.

The fundamental problem is that ATLAS Tier 1 storage and ATLAS Tier 2 storage are really the same VO and the same issuer. Therefore gPlazma must have exactly one gplazma.oidc.provider definition (configuration property) and exactly one prefix argument.
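
The "Duplicate key" error in the log is consistent with the providers being indexed by issuer URL. The Python sketch below shows the general idea only; it is an assumption about the shape of the logic, not the plugin's actual code, and index_by_issuer is a hypothetical name.

```python
def index_by_issuer(providers: dict[str, str]) -> dict[str, str]:
    """Map issuer URL -> provider name, refusing two providers per issuer."""
    by_issuer: dict[str, str] = {}
    for name, issuer in providers.items():
        if issuer in by_issuer:
            raise ValueError(
                f"Duplicate key {issuer} "
                f"(provider {name} clashes with {by_issuer[issuer]})"
            )
        by_issuer[issuer] = name
    return by_issuer


# Two provider names pointing at the same issuer URL cannot both be indexed.
try:
    index_by_issuer({
        "atlas": "https://atlas-auth.cern.ch/",
        "ifae": "https://atlas-auth.cern.ch/",
    })
except ValueError as error:
    print(error)
```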

There are a few possibilities (some depend on details that don't seem to be in the ticket).

Solution 1: rearrange the namespace

If the AuthZ paths (the paths in the scope claim items like storage.*:PATH) identify Tier-1 and Tier-2 requests then it might be possible to align dCache's namespace to match. My guess is that ATLAS doesn't do this, but it might be worth checking.

"Aligning" the namespace might be possible by renaming ("moving") directories, or by creating symbolic links, or possibly some other approach.

The downside of this approach is that it requires ATLAS to have independent AuthZ paths (the /dir1 in storage.modify:/dir1) for Tier-1 and Tier-2 storage endpoints. I suspect they don't do this currently, and would resist changing this.

Moreover, it would require modifying where files are located, which would have an impact for entries stored in catalogues. Such modifications would need to be "backwards compatible".

Solution 2: run two gPlazmas

I'm assuming that you provide ATLAS with two distinct sets of doors: an ATLAS Tier-1 set of doors and an ATLAS Tier-2 set of doors. If so, you could run two gPlazma instances: an ATLAS Tier-1 gPlazma and an ATLAS Tier-2 gPlazma. These two gPlazma instances would be largely the same, but the prefix configuration for the ATLAS provider would be different; e.g.,

In the ATLAS Tier-1 gPlazma, you would have:

gplazma.oidc.provider!ATLAS = https://atlas-auth.cern.ch/ -profile=wlcg -prefix=/pnfs/pic.es/data/atlas [...]

In the ATLAS Tier-2 gPlazma, you would have:

gplazma.oidc.provider!ATLAS = https://atlas-auth.cern.ch/ -profile=wlcg -prefix=/pnfs/pic.es/IFAEAtlasTier2 [...]

The ATLAS Tier-1 doors would be configured to use the ATLAS Tier-1 gPlazma, and the ATLAS Tier-2 doors would be configured to use the ATLAS Tier-2 gPlazma.

In practice, I guess you would run a new gPlazma instance for ATLAS Tier-2 (updating the ATLAS Tier-2 doors to use it) and leave all other doors using the current gPlazma service.

An important aspect is that the gplazma.oidc.audience-targets configuration of the two gPlazma instances should be distinct. This is to prevent a client from using a token issued for use against the Tier-2 endpoint against a Tier-1 endpoint (or vice versa).

The downside to this approach is (obviously) the increased complexity from having to run multiple gPlazma instances.
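
To illustrate why distinct audience targets keep the two instances apart, here is a small Python sketch of an audience check. It is a simplification and not dCache's implementation: the function name and the example audience values are invented, and rejecting tokens without any aud claim is an assumption of the sketch. The aud claim may be either a single string or a list of strings.

```python
from typing import Iterable, Optional, Union


def audience_accepted(
    token_aud: Optional[Union[str, Iterable[str]]],
    audience_targets: set[str],
) -> bool:
    """Accept a token only if one of its audiences is a configured target."""
    if token_aud is None:
        return False  # assumption: tokens without "aud" are rejected
    audiences = [token_aud] if isinstance(token_aud, str) else list(token_aud)
    return any(aud in audience_targets for aud in audiences)


# Hypothetical audience values for the two gPlazma instances.
tier1_targets = {"https://tier1.example.org"}
tier2_targets = {"https://tier2.example.org"}

print(audience_accepted("https://tier2.example.org", tier2_targets))
print(audience_accepted("https://tier2.example.org", tier1_targets))
```

If the two sets of targets overlap, a token carrying a shared audience value would be accepted by both instances, and the separation is lost.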

Solution 3: make dCache more flexible

Currently dCache has a very simple model for mapping AuthZ paths (e.g., storage.modify:/dir1 --> /pnfs/pic.es/data/atlas/dir1). It always adds the same prefix (e.g., /pnfs/pic.es/data/atlas) for a VO.

In principle, dCache could be more sophisticated.

One (very simple) option would be to accept multiple prefix paths. A token would be authorised for each prefix.

This would be simple to implement, but would actually be rather bad, as it wouldn't separate Tier-1 and Tier-2 tokens: a token issued that should allow modification on Tier-2 storage would also allow modification on Tier-1 storage.

Another approach would be for the oidc plugin to (somehow) know whether a request is "a Tier-1 request" or "a Tier-2 request". It would then be able to apply different mappings.

One way of doing this would be to use the audience (aud) claim to control the prefix path. The token should have an aud claim (with a single value). The aud claim could be different for tokens issued for Tier-1 requests and tokens issued for Tier-2 requests.

However, supporting this additional sophistication would require some development effort. This option isn't available right now.
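
Purely as a sketch of what such an audience-driven mapping might look like (this feature does not exist in dCache today, and every name and audience value below is hypothetical), the idea in Python would be:

```python
def prefix_for_audience(token_aud: str, prefix_by_audience: dict[str, str]) -> str:
    """Pick the namespace prefix from the token's single aud value."""
    try:
        return prefix_by_audience[token_aud]
    except KeyError:
        # An unknown audience gets no prefix at all, rather than a default one.
        raise PermissionError(f"no prefix configured for audience {token_aud}") from None


# Hypothetical audience values; the prefixes are the ones from this ticket.
prefixes = {
    "https://tier1.example.org": "/pnfs/pic.es/data/atlas",
    "https://tier2.example.org": "/pnfs/pic.es/IFAEAtlasTier2",
}

print(prefix_for_audience("https://tier2.example.org", prefixes))
```

Unlike accepting multiple prefixes for every token, this keeps a Tier-2 token confined to the Tier-2 subtree, but it only works if the issuer uses different aud values for the two tiers.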

Summary

What I think would work "right now" would be to run a separate HTTP/WebDAV door for ATLAS Tier-2 traffic (if you don't already do this), and run another Tier-2 gPlazma instance (with a distinct prefix and supporting a different aud claim) and configure the ATLAS Tier-2 HTTP/WebDAV door to use this new gPlazma instance.

Meanwhile, we can investigate updating gPlazma/OIDC to make it more flexible, to support this use-case without requiring sites to run a separate gPlazma instance.

HTH, Paul.

elenamplanas commented 2 months ago

Thanks for your response.

We've created a new gPlazma for the Tier-2 doors, which were already separated from Tier-1.

Cheers, Elena

paulmillar commented 2 months ago

Hi @elenamplanas,

Good to know you have separate Tier-2 doors, and the quick work-around should work.

Do you happen to know if ATLAS use a different audience claim (aud) in tokens for the Tier-1 and Tier-2?

If ATLAS does that (or they could be persuaded to do that) then there's a relatively easy enhancement (to the oidc plugin) that would allow you to go back to having a single gPlazma instance.

Cheers, Paul.

elenamplanas commented 2 months ago

Hi,

For the moment they share some audience claims.

Cheers, Elena