Closed dwinston closed 10 months ago
depends on #310
once authn+authz is simplified, ensure that use of ORCiD is supported (and preferred) for authentication.
I made a sequence diagram to understand the current state
Here's a new sequence diagram with the ORCID flow added in.
I'm blocked on the following:
Not sure how the ORCID flow should interact w/ the User collection--assuming we don't want anyone with an ORCID to have access to all API endpoints. How should we handle creating Users in MDB associated with ORCIDs?
Not sure get_current_user()
can support multiple OAuth schemes. We may need to create a new function get_current_user_orcid()
. If so, do we want get_current_user_orcid()
to return a User from MDB?
I was able to get ORCID's example auth code flow working using the following steps:
var return_page = "http%3A%2F%2F127.0.0.1%3A8081%2Forcid_popup.html"
url = "https://orcid.org/oauth/authorize?client_id=APP-NTSSHLDLXTCDNF4M&response_type=code&scope=/authenticate&redirect_uri=" + return_page
var authWindow;
//This opens a popup that hits the ORCID OAuth endpoint
function getORCID(){
if ($("#orcid").val()){
return;
}
authWindow = window.open(url, 'authWindow');
}
<div class="container">
<h1 class="display-3">My website!</h1>
<p>Please authenticate your ORCID ID so we can automatically register your name and associate your professional activities with your ORCID record</p>
<p><a class="btn btn-primary btn-lg" href="#" role="button" onclick="getORCID()" id="auth_button">Authenticate your ORCID</a></p>
<form>
<label for="orcid">ORCID iD</label><input id="orcid" readonly class="form-control"></input>
<label for="given_name">Given Name</label><input id="given_name" class="form-control"></input>
<label for="family_name">Family Name</label><input id="family_name" class="form-control"></input>
</form>
Upon clicking "Authorize Access" the redirect URL will have a one-time-use code (eg http://127.0.0.1:8081/orcid_popup.html?code=X4fb9a)
Use that code to run the following curl
command:
curl -i -L -H "Accept: application/json" --data "client_id=APP-NTSSHLDLXTCDNF4M&client_secret=bd626861-e85c-4f1f-b59a-80f4cd22d22d&grant_type=authorization_code&code=X4fb9a&redirect_uri=http://127.0.0.1:8081/orcid_popup.html" "https://orcid.org/oauth/token"
curl
returns an access_token and orcidI'm blocked on the following:
Not sure how the ORCID flow should interact w/ the User collection--assuming we don't want anyone with an ORCID to have access to all API endpoints. How should we handle creating Users in MDB associated with ORCIDs?
For now, make them disabled
: https://github.com/microbiomedata/nmdc-runtime/blob/4561d854e31291fab40cd7377c52eb62986e0683/nmdc_runtime/api/models/user.py#L24
and set username as the orcid ID
Not sure
get_current_user()
can support multiple OAuth schemes. We may need to create a new functionget_current_user_orcid()
. If so, do we wantget_current_user_orcid()
to return a User from MDB?
idea for now is to add a third case, for supplying an orcid jwt, to the existing two cases (password
and client_credentials
) in login_for_access_token
and OAuth2PasswordOrClientCredentialsRequestForm
. So a new grant type, renaming the latter "OAuth2PasswordOrClientCredentialsOrOrcidJWTRequestForm" (I know, a monstrosity, but just to get this to a working state).
actually, marking a user as disabled
currently won't let them even see their own profile, as get_current_active_user
is a guard for all authentication-required routes.
what I suggest we do here is replace all usages of get_current_active_user
with a new get_permitted_user
function that includes the logic of e.g.
if not permitted(user.username, "/metadata/changesheets:submit"):
raise HTTPException(
status_code=status.HTTP_403_FORBIDDEN,
detail=(
f"Only users {users_allowed('/metadata/changesheets:submit')} "
"are allowed to apply changesheets at this time."
),
)
and then add logic to ensure_default_api_perms
to ensure users without orcids as their usernames are allowed to do the things they can do now just by being an authenticated user, and don't extend this capability to orcid-usernamed users. That should get things functional, and we can refactor from there.
FYI I predict we will end up removing the client_credentials
grant-type consideration for oauth flow when we "simply auth" as a separate issue.
idea for now is to add a third case, for supplying an orcid jwt, to the existing two cases (password and client_credentials) in login_for_access_token and OAuth2PasswordOrClientCredentialsRequestForm. So a new grant type, renaming the latter "OAuth2PasswordOrClientCredentialsOrOrcidJWTRequestForm" (I know, a monstrosity, but just to get this to a working state).
I believe FastAPI's native OAuth2AuthorizationCodeBearer class is designed to handle both steps in the auth code flow (getting the code from the authorizationURL AND getting the token from the tokenUrl).
However OAuth2 does allow for an Authorization Code Grant where the user would supply their auth code NOT their JWT, to the API (in our case, enter their auth code into the Request Form).
To add this new grant type we would:
Create a new API endpoint that returns an authorization_code
from ORCID (see above)
Update login_for_access_token
in users.py
elif form_data.grant_type == "authorization_code":
# get token first, then authenticate user
access_token , orcid = get_access_token_orcid(auth_code: auth_code)
# ... skipping logic to add sub and expires_delta
access_token = update_orcid_access_token(
data={"sub": f"user:{user.username}"}, expires_delta=access_token_expires
)
user = authenticate_user_orcid(mdb, orcid)
if not user:
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Incorrect username or password",
headers={"WWW-Authenticate": "Bearer"},
)
auth.js
def authenticate_user_orcid(mdb, orcid: str):
username = orcid
user = get_user(mdb, username)
if not user:
return False
return user
async def get_access_token_orcid(auth_code: auth_code):
# equivalent of curl -i -L -H "Accept: application/json" --data "client_id=APP-NTSSHLDLXTCDNF4M&client_secret=bd626861-e85c-4f1f-b59a-80f4cd22d22d&grant_type=authorization_code&code=X4fb9a&redirect_uri=http://127.0.0.1:8081/orcid_popup.html" "https://orcid.org/oauth/token"
return (access_token, orcid)
authorization_code
to OAuth2PasswordOrClientCredentialsOrOrcidAuthCodeRequestForm
def __init__(
self,
basic_creds: Optional[HTTPBasicCredentials] = Depends(basic_credentials),
grant_type: str = Form(None, regex="^password$|^client_credentials$|^authorization_code$"),
authorization_code: Optional[str] = Form(None),
# ...
if grant_type == "authorization_code" and authorization_code is None:
raise HTTPException(
status_code=HTTP_400_BAD_REQUEST,
detail="grant_type password requires username and password",
)
# ...
self.authorization_code = authorization_code
authCode
flow to OAuth2PasswordOrClientCredentialsBearer
class? This flow would still use our /token
endpoint to call login_for_access_token
NOT ORCID's "/token" endpoint. NOTE: I'm not sure how this works. Pls see blocker below. flows = OAuthFlowsModel(
password={"tokenUrl": tokenUrl, "scopes": scopes},
clientCredentials={"tokenUrl": tokenUrl},
authCode={"tokenUrl": tokenUrl}
)
I'm not sure if updating login_for_access_token
would allow us to use a single oauth2_scheme
for users who sign in with either ORCID or username/password
# auth.py
oauth2_scheme = OAuth2PasswordOrClientCredentialsBearer(tokenUrl="token")
# user.py
async def get_current_user(
token: str = Depends(oauth2_scheme)
# ...
reference discussion with fastapi maintainer's input: https://github.com/tiangolo/fastapi/discussions/9137#discussioncomment-5157382
some hacking around: https://github.com/microbiomedata/nmdc-runtime/compare/333-playground?expand=1
I was just thinking we could overload the current client_credentials flow to recognize an orcid jwt supplied as client_id and empty client_secret. So no new auth flow needed.
Just catching up here but I'd like to offer some thoughts
Username | ORCID (if present) | ACLs | Client Key (if present) | is_admin |
---|---|---|---|---|
myusername | 000-323-32323 | {coll1: "readwrite", coll2: "readonly" } | None | False |
nersc-wf | None | {coll2: "readwrite"} | dd3f38fy34fbu | False |
(This is a rough sketch - we can work out the details)
Also - we can separate out the grants / auth code stuff etc. for now, if there is a simple way to get the ORCID token out of band.
I think we should look at using the OpenID connect tokens for this, since it allows supports additional parameters like max-age and authentication date. https://github.com/ORCID/ORCID-Source/blob/main/orcid-web/ORCID_AUTH_WITH_OPENID_CONNECT.md
This was discussed at infra sync and is active. Will move to next sprint.
This ticket consists of two tasks:
openid_oauth2_scheme
here using the authorizationCode flow: https://github.com/microbiomedata/nmdc-runtime/blob/ebe41c9874ccf0326b925462afd6f8f672b6671e/nmdc_runtime/api/core/auth.py#L116 (alt: we could also use the implicit openid flow, as described in the orcid dev tools)DOCS: fastapi.security.OAuthFlows.authorizationCode