ssl-hep / ServiceX

ServiceX - a data delivery service pilot for IRIS-HEP DOMA
BSD 3-Clause "New" or "Revised" License

Token-based Authentication to ServiceX #492

Open BenGalewsky opened 1 year ago

BenGalewsky commented 1 year ago

As a user of the Coffea-Casa environment, I want to authenticate to ServiceX automatically so that I can use the service with no additional setup.

Frontend Changes

In the current version of Coffea-Casa, users receive a token in their environment that is issued from the CMS issuer.

  1. The ServiceX client should search for the presence of a bearer token utilizing the WLCG bearer token discovery protocol.

    1. Particularly, Coffea-Casa uses the BEARER_TOKEN_FILE environment variable to point at the relevant token.
    2. Optional: Have a configuration file entry that allows the user to override the filename.
    3. The token should not go into a configuration file.
  2. If a token is discovered in step (1), then the token should be provided in the HTTP Authorization header of each request to ServiceX.

    1. The Authorization header in the HTTP request will be of the form: Authorization: Bearer $BEARER_TOKEN_VALUE
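
A minimal sketch of the client-side steps above, using only the Python standard library. The function names `discover_bearer_token` and `auth_headers` and the `override_path` parameter are hypothetical, not part of the existing ServiceX client. The lookup order (the `BEARER_TOKEN` variable, then `BEARER_TOKEN_FILE`, then `bt_u<uid>` under `$XDG_RUNTIME_DIR` and `/tmp`) follows the WLCG discovery convention. Placing the configured override ahead of `BEARER_TOKEN_FILE`, and falling through to the next location when a file is unreadable, are assumptions of this sketch.

```python
import os
from pathlib import Path
from typing import Dict, List, Mapping, Optional


def discover_bearer_token(
    environ: Optional[Mapping[str, str]] = None,
    override_path: Optional[str] = None,
) -> Optional[str]:
    """Return the first bearer token found, or None if there is none."""
    env = os.environ if environ is None else environ

    # 1. Token value given directly in the environment.
    token = env.get("BEARER_TOKEN", "").strip()
    if token:
        return token

    # 2. Candidate token files, in priority order.
    candidates: List[Path] = []
    if override_path:  # hypothetical config-file entry holding a filename only
        candidates.append(Path(override_path))
    if env.get("BEARER_TOKEN_FILE"):
        candidates.append(Path(env["BEARER_TOKEN_FILE"]))
    if hasattr(os, "geteuid"):  # the per-user default paths are POSIX-only
        uid = os.geteuid()
        if env.get("XDG_RUNTIME_DIR"):
            candidates.append(Path(env["XDG_RUNTIME_DIR"]) / f"bt_u{uid}")
        candidates.append(Path("/tmp") / f"bt_u{uid}")

    for path in candidates:
        try:
            token = path.read_text().strip()
        except OSError:
            continue  # missing or unreadable file: try the next location
        if token:
            return token
    return None


def auth_headers(token: Optional[str]) -> Dict[str, str]:
    """Build the Authorization header; empty when no token was found."""
    return {"Authorization": f"Bearer {token}"} if token else {}
```

Returning an empty header dict when nothing is found keeps the token optional, so the existing auth mechanisms can still apply.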

Web Service Changes

On the server side, if a JWT is present in the Authorization header, then:

  1. Use a JWT library (https://pyjwt.readthedocs.io/en/stable/ is a reasonable example but not the only option) to deserialize the token.
    1. Read out the issuer claim and ensure the issuer is one accepted by the ServiceX instance. I’d recommend only allowing a single issuer per instance.
    2. If multiple JWT logic paths exist, use this issuer name to identify that the logic below should be executed.
  2. Verify the token itself. Download (and cache) the public keys from the issuer using OpenID metadata discovery (you can either use this library directly or simply copy the logic; using the library means there’s less copy/pasting of logic).
    1. Ensure that a given scope is present in the token permitting the use of ServiceX. You are permitted to make up your own scope name; I suggest servicex (all lower case; checks are case sensitive).
  3. From the token, the “sub” should be used as the identity. If the identity does not already exist, a new user should automatically be inserted into the backend database.
    1. If multiple issuers are supported, the user created should actually be the tuple of (issuer, sub). This way, if two issuers have the same username one can distinguish the two.
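
The claim handling in the list above could look roughly like the following sketch. It deliberately uses only the standard library and does NOT verify the token signature: `peek_unverified_claims` is only suitable for reading the issuer to choose a logic path, and the scope and identity checks must run on claims that a JWT library has already verified. All names here (`ACCEPTED_ISSUER`, `peek_unverified_claims`, `has_scope`, `get_or_create_user`) and the in-memory `users` dict standing in for the backend database are hypothetical. The sketch also assumes the `scope` claim is a space-separated string.

```python
import base64
import json
from typing import Dict, Mapping, Tuple

ACCEPTED_ISSUER = "https://issuer.example.org"  # hypothetical; one issuer per instance
REQUIRED_SCOPE = "servicex"  # scope name suggested in the issue; case sensitive


def peek_unverified_claims(token: str) -> dict:
    """Decode the JWT payload WITHOUT checking the signature.

    Use the result only to route on 'iss'; never trust it for access control.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a compact-serialized JWT")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def issuer_is_accepted(claims: Mapping[str, object]) -> bool:
    """True when the token was issued by the single accepted issuer."""
    return claims.get("iss") == ACCEPTED_ISSUER


def has_scope(claims: Mapping[str, object], required: str = REQUIRED_SCOPE) -> bool:
    """Case-sensitive check for one scope in a space-separated scope claim."""
    return required in str(claims.get("scope", "")).split()


def get_or_create_user(
    users: Dict[Tuple[str, str], dict], claims: Mapping[str, str]
) -> dict:
    """Look up the user keyed by (issuer, sub), inserting it if absent."""
    key = (claims["iss"], claims["sub"])
    return users.setdefault(key, {"issuer": key[0], "sub": key[1]})
```

Keying users on the `(issuer, sub)` tuple even in a single-issuer deployment would avoid a data migration if a second issuer is added later.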

Assumptions

  1. The existing auth mechanisms are retained and available as feature flags.
  2. The ServiceX frontend will treat the existing token as optional.

BenGalewsky commented 1 year ago

It looks like we can't get a secret signing key for CERN JWTs. The preferred solution is to use this library to fetch the issuer's public key: https://github.com/scitokens/scitokens/blob/master/src/scitokens/utils/keycache.py#L224-L313
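
For orientation, the public-key lookup can be sketched with the standard library alone. This is not the scitokens `keycache` implementation linked above, only an illustration of the same idea: read the issuer's OpenID metadata at `/.well-known/openid-configuration`, follow its `jwks_uri`, and cache the resulting key set. The class name `IssuerKeyCache`, the helper `fetch_json`, and the one-hour TTL are assumptions of this sketch.

```python
import json
import time
import urllib.request
from typing import Callable, Dict, Tuple


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """Fetch a URL and parse the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


class IssuerKeyCache:
    """In-memory cache of each issuer's JSON Web Key Set (JWKS)."""

    def __init__(
        self,
        fetch: Callable[[str], dict] = fetch_json,
        ttl_seconds: float = 3600.0,  # assumed refresh interval
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._cache: Dict[str, Tuple[float, dict]] = {}

    def get_jwks(self, issuer: str) -> dict:
        """Return the issuer's key set, downloading it if missing or stale."""
        now = time.monotonic()
        cached = self._cache.get(issuer)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        # OpenID metadata discovery: the metadata document names the JWKS URL.
        metadata = self._fetch(issuer.rstrip("/") + "/.well-known/openid-configuration")
        jwks = self._fetch(metadata["jwks_uri"])
        self._cache[issuer] = (now, jwks)
        return jwks

    def get_key(self, issuer: str, kid: str) -> dict:
        """Return the JWK whose 'kid' matches the one in the token header."""
        for key in self.get_jwks(issuer).get("keys", []):
            if key.get("kid") == kid:
                return key
        raise KeyError(f"no key with kid={kid!r} published by {issuer}")
```

The returned JWK would then be handed to whichever JWT library performs the actual signature check. Only issuers already on the instance's accepted list should be passed in, so that a token cannot direct the server to fetch keys from an arbitrary URL.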