DIRAC Configuration structure to interact with Identity Providers

aldbr commented 1 year ago

Following our discussion in https://github.com/DIRACGrid/DIRAC/pull/6882#discussion_r1134289950: the current configuration related to Identity Providers is not really clear and needs to be revised a bit.

Here is an example of the current configuration:

Registry
{
  VO
  {
    biomed
    {
      IdProvider = CheckIn_Biomed
    }
    dteam
    {
      IdProvider = WLCG_IAM_Dteam
    }  
  }
  Groups
  {
    biomed_user
    {
      # Implicitly get the IdProvider group based on the DIRAC group name 
    }
    biomed_pilot
    {
      # Implicitly get the IdProvider group based on the DIRAC group name 
    }  
    dteam_user
    {
      # Can be explicit with IdPRole too
      IdPRole = wlcg.groups:/dteam/user
    }
  }
}
Resources
{
  IdProviders
  {
    CheckIn_Biomed
    {
      issuer = ...
      client_id = ...
      client_secret = ...
    }
    WLCG_IAM_Dteam
    {
      issuer = ...
      client_id = ...
      client_secret = ...
    }
  }
}

Problems:

If IdPRole is left empty, the Identity Provider group is inferred by assuming that the DIRAC group name has the form <group>_<role>. Nothing enforces that naming, and we do not want to enforce it because it is too constraining.
The IdProviders section is misleading: its subsections are actually OAuth2 clients bound to an Identity Provider.
There is no way to define an OAuth2 client per group (only per VO).

Here is the suggested configuration (to be further discussed and/or implemented):

Registry
{
  VO
  {
    biomed
    {
      IdProvider = CheckIn
      IdPClient = Biomed
    }
    dteam
    {
      IdProvider = WLCG_IAM
      IdPClient = Dteam
    }  
  }
  Groups
  {
    biomed_user
    {
      # The group uses Biomed IdPClient as defined in the biomed VO section 
      # this would be explicitly mapped by OAuth2IdProvider.getUserGroups(in parseIdPMapping),
      # no coded rules or regexps are used
      IdPMapping = eduperson_entitlement:urn:mace:egi.eu:group:registry:biomed:role=member#aai.egi.eu
      # No special other scopes for this group (so far)
    }
    biomed_pilot
    {
      IdPMapping = eduperson_entitlement:urn:mace:egi.eu:group:registry:biomed:role=pilot#aai.egi.eu
      IdPScope = compute.read compute.write ...
      # By default groups are using the IdProvider defined for the VO. In some cases a different client of the same
      # identity provider can be used
      IdPClient = BiomedPilot
    }
    dteam_user
    {
       IdPMapping = wlcg.groups:/dteam/user
    }
  }
}
Resources
{
  IdProviders
  {
    CheckIn
    {
      issuer = ...
      Grant = device, refresh_token, authorization_code
      ...
      Clients
      {
        Biomed
        {
          ClientID = ...
          ClientSecret = ...
        }
        BiomedPilot
        {
          ClientID = ...
          ClientSecret = ...
          # overriding grant definition of the IdProvider
          Grant = client_credentials
        }
      }
    }
    WLCG_IAM
    {
      issuer = ...
      Clients
      {
        Dteam
        {
          ...
        }
      }
    }
  }
}

What's new with this configuration:

The /Resources/IdProviders section is expanded: each subsection is composed of general attributes and Clients subsections. This structure better fits with OAuth2 components.
Possibility to define an OAuth2 client per VO and redefine it in a group (IdProvider should be the same in group and VO though).
The IdPRole parameter becomes IdPMapping: the field is mandatory if a group-based scope has to be used, meaning there is no possibility to guess the Identity Provider group from the DIRAC group name anymore (no constraint on the group name, the token profile is explicitly defined).
The IdPScope parameter: an attribute-based scope such as compute.read compute.write, etc. If both IdPMapping and IdPScope are present, they are merged to get a token.
Supported grant types are defined: this makes it possible to know whether a given client supports a certain flow (which could be useful, for instance, to know whether we use a client access token or a user access token to submit jobs).
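
To make the lookup rules above concrete, here is a minimal Python sketch of how a group could be resolved against the suggested structure. It is only an illustration under stated assumptions: the `REGISTRY` and `RESOURCES` dictionaries are hypothetical in-memory copies of the CS sections, the `VO` option inside each group and the `resolve_group` helper are not part of the proposal, and list-valued options are assumed to be already parsed.

```python
# Hypothetical in-memory copy of the suggested /Registry section.
# The "VO" option in each group is an assumption made for this sketch.
REGISTRY = {
    "VO": {
        "biomed": {"IdProvider": "CheckIn", "IdPClient": "Biomed"},
    },
    "Groups": {
        "biomed_user": {
            "VO": "biomed",
            "IdPMapping": "eduperson_entitlement:urn:mace:egi.eu:group:registry:biomed:role=member#aai.egi.eu",
        },
        "biomed_pilot": {
            "VO": "biomed",
            "IdPMapping": "eduperson_entitlement:urn:mace:egi.eu:group:registry:biomed:role=pilot#aai.egi.eu",
            "IdPScope": ["compute.read", "compute.write"],
            "IdPClient": "BiomedPilot",
        },
    },
}

# Hypothetical in-memory copy of the suggested /Resources/IdProviders section.
RESOURCES = {
    "IdProviders": {
        "CheckIn": {
            "Grant": ["device", "refresh_token", "authorization_code"],
            "Clients": {
                "Biomed": {},
                "BiomedPilot": {"Grant": ["client_credentials"]},
            },
        },
    },
}


def resolve_group(group):
    """Return the provider, client, grants and scope string for a DIRAC group."""
    group_cfg = REGISTRY["Groups"][group]
    vo_cfg = REGISTRY["VO"][group_cfg["VO"]]

    # The IdProvider always comes from the VO; the client may be redefined per group.
    provider = vo_cfg["IdProvider"]
    client = group_cfg.get("IdPClient", vo_cfg["IdPClient"])

    provider_cfg = RESOURCES["IdProviders"][provider]
    client_cfg = provider_cfg["Clients"][client]

    # A client-level Grant overrides the provider-level one.
    grant = client_cfg.get("Grant", provider_cfg.get("Grant", []))

    # IdPMapping and IdPScope are merged to build the scope of the token request.
    scopes = []
    if "IdPMapping" in group_cfg:
        scopes.append(group_cfg["IdPMapping"])
    scopes.extend(group_cfg.get("IdPScope", []))

    return {
        "provider": provider,
        "client": client,
        "grant": grant,
        "scope": " ".join(scopes),
    }
```

With this sketch, `resolve_group("biomed_user")` falls back to the `Biomed` client of the VO, while `resolve_group("biomed_pilot")` picks `BiomedPilot` together with its overridden grant.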

My last comment in this discussion was about the Grant parameter:

Should we really define them in the CS? If the grant types of a client change over time, a DIRAC administrator has to update the Grant parameter. We could also get them dynamically by interacting with the Identity Provider: in this way, it would always be up to date.
If I add Grant = client_credentials in a Client used to submit pilots, does it mean that I want to use this flow to submit pilots? Or should we have another option to control whether the client_credentials flow should be employed in this context?
Other than that, do I need to know which grant types are supported by a given client?
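
As a rough illustration of the "get them dynamically" option, the sketch below reads `grant_types_supported` from the OpenID Connect discovery document of an issuer. The helper names are hypothetical, and it assumes the Identity Provider publishes this document. Note that this metadata describes what the provider supports as a whole, not what a particular registered client is allowed to use, so it would only partially answer the question.

```python
import json
from urllib.request import urlopen


def discovery_url(issuer):
    """Build the OpenID Connect discovery URL for an issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def supported_grants(metadata):
    """Extract the grant types from an already decoded discovery document.

    OpenID Connect Discovery states that, when the field is omitted,
    the default is authorization_code and implicit.
    """
    return metadata.get("grant_types_supported", ["authorization_code", "implicit"])


def fetch_supported_grants(issuer, timeout=10):
    """Download the discovery document and return the provider-level grant types."""
    with urlopen(discovery_url(issuer), timeout=timeout) as response:
        return supported_grants(json.load(response))
```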

Once we agree on a common configuration structure, I can create a PR implementing it.

aldbr commented 1 year ago

More comments after discussing with @chaen:

Possibility to define an OAuth2 client per VO and redefine it in a group (IdProvider should be the same in group and VO though).

Do we actually need this flexibility? One powerful client for a VO could be enough. Why would a VO need to have multiple clients?

If we don't need this flexibility, then do we need to have a Resources structure such as: IdProviders/Clients? I would say yes in the case of CheckIn, since multiple VOs would share a common OIDC IdProvider, but the only common attribute would be the issuer. If we say no, a simple /Resources/IdProviderClient section could be enough.

The IdPScope parameter: an attribute-based scope such as compute.read compute.write, etc. If both IdPMapping and IdPScope are present, they are merged to get a token.

With IdPMapping and IdPScope, we separate the group-based capabilities from the scope-based capabilities. Do we actually need to separate them since they will be concatenated to be used?

I would say yes because group-based capabilities are expected to generate new claims in the payload (a wlcg.groups scope would generate a wlcg.groups claim containing all the groups the user belongs to).

Let's take an example:

A user comes with an access token with group-based capabilities (wlcg.groups:/dteam/user). We need to check the wlcg.groups claim and compare it with IdPMapping.

A user comes with an access token with scope-based capabilities (compute.read). We need to check the scope claim and compare it with IdPScope.
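
The two checks could look like the following sketch. It assumes the token has already been validated and decoded into a `claims` dictionary, that IdPMapping has the form `<claim name>:<expected value>`, and that "compare" means every scope listed in IdPScope must be present in the token. Both function names are hypothetical.

```python
def matches_group_mapping(claims, idp_mapping):
    """Check a group-based capability, e.g. 'wlcg.groups:/dteam/user'."""
    # Split on the first colon only: the expected value may itself contain colons.
    claim_name, _, expected = idp_mapping.partition(":")
    values = claims.get(claim_name, [])
    if isinstance(values, str):
        values = [values]
    return expected in values


def matches_scopes(claims, idp_scope):
    """Check scope-based capabilities against the space-separated 'scope' claim."""
    granted = set(claims.get("scope", "").split())
    return set(idp_scope) <= granted


# Example payload of a decoded access token (illustrative values only).
claims = {
    "wlcg.groups": ["/dteam", "/dteam/user"],
    "scope": "compute.read compute.write",
}
group_ok = matches_group_mapping(claims, "wlcg.groups:/dteam/user")
scope_ok = matches_scopes(claims, ["compute.read"])
```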

IdPScope could theoretically contain a long string of parametric scopes (mostly related to storage: storage.read:/dteam/admin storage.read:/dteam/user storage.write:...). What about making IdPScope a list to have something more readable?

IdPScope = compute.read
IdPScope += compute.write
IdPScope += storage.read:/dteam/user
IdPScope += storage.read:/dteam/admin

Another suggestion could be:

IdPScope {
  compute.read = None
  compute.write = None
  storage.read = /dteam/admin
  storage.read += /dteam/user
}

I have the feeling that the first proposition would be easier to manipulate within the code than the second one. Any opinion?
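
To compare the two forms, here is a small sketch of what the code would have to do in each case to obtain the scope string. It assumes the CS values have already been parsed into Python lists, and that `None` in the second form is represented by an empty list. The function names are hypothetical.

```python
def scope_from_list(values):
    """First form: IdPScope is a flat list of complete scopes."""
    return " ".join(values)


def scope_from_section(section):
    """Second form: IdPScope is a section mapping a scope name to its parameters."""
    scopes = []
    for name, params in section.items():
        if not params:
            # Non-parametric scope such as compute.read
            scopes.append(name)
        else:
            # Parametric scope such as storage.read:/dteam/user
            scopes.extend(f"{name}:{param}" for param in params)
    return " ".join(scopes)


flat = scope_from_list(
    ["compute.read", "compute.write", "storage.read:/dteam/user", "storage.read:/dteam/admin"]
)
nested = scope_from_section(
    {
        "compute.read": [],
        "compute.write": [],
        "storage.read": ["/dteam/admin", "/dteam/user"],
    }
)
```

Both calls yield the same set of scopes. The first form needs only a join, while the second needs an extra rule to rebuild the parametric scopes.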

atsareg commented 1 year ago

As for the Grant option of the Client, I think it should be interpreted as "which grant to use" and not as the list of grants that are allowed in general. It could be renamed UseGrant to be more explicit.

Concerning multiple clients per IdProvider: even if it is technically possible to use just one powerful client with all the grants, privileges and scopes of all the communities, I do not think it will be acceptable in a multi-VO environment. For example, I would be reluctant to let the client that gets pilot tokens with client_credentials also be used to refresh user tokens. Also, communities might define custom scopes that are clearly not to be shared across multiple communities.

atsareg commented 1 year ago

As for the group-based and scope-based capabilities, I think we have to distinguish the two in our configuration, as proposed. There are two distinct cases. For authentication of users accessing DIRAC services we will use groups/entitlements only for mapping onto DIRAC groups and mixing them with scopes can result in confusion, so it is better to have a dedicated option for this mapping. When getting user tokens, extra scopes can be added, e.g. for pilots or for tokens to access tokens; here mixing is quite natural.

The first proposal, defining IdPScope as a list of values, is fine. In the code, scopes are passed as lists/sets anyway and then concatenated before being added to the OAuth flows.

chaen commented 1 year ago

Concerning multiple clients per IdProvider: even if it is technically possible to use just one powerful client with all the grants, privileges and scopes of all the communities, I do not think it will be acceptable in a multi-VO environment. For example, I would be reluctant to let the client that gets pilot tokens with client_credentials also be used to refresh user tokens. Also, communities might define custom scopes that are clearly not to be shared across multiple communities.

Why is it a problem?

As for the group-based and scope-based capabilities, I think we have to distinguish the two in our configuration, as proposed. There are two distinct cases. For authentication of users accessing DIRAC services we will use groups/entitlements only for mapping onto DIRAC groups and mixing them with scopes can result in confusion, so it is better to have a dedicated option for this mapping. When getting user tokens, extra scopes can be added, e.g. for pilots or for tokens to access tokens; here mixing is quite natural.

I am not sure I understand what you are saying, in particular this sentence: "For authentication of users accessing DIRAC services we will use groups/entitlements only for mapping onto DIRAC groups and mixing them with scopes can result in confusion". Are you saying that the DIRAC services should rely on the group contained in the token?

Anyway, I feel like we are again discussing technical solutions and organizational aspects at length before really knowing the requirements. Can you please fill in https://github.com/DIRACGrid/DIRAC/issues/6894 extensively? This, together with a similar list from LHCb (and others interested), should be the starting point. Could you have it ready before the end of March, so that we have time to digest it before the meeting?

aldbr commented 1 year ago

More discussions of the configuration structure to interact with Identity Providers here: https://github.com/DIRACGrid/DIRAC/pull/6950#issuecomment-1549154351

aldbr commented 12 months ago

Should we close this issue? I think it does not make sense anymore.

fstagni commented 11 months ago

Probably...

DIRACGrid / DIRAC

DIRAC Configuration structure to interact with Identity Providers #6917