DataBiosphere / data-store

AWS and GCP data storage system for genomic data.
https://dss.dev.ucsc-cgp-redwood.org

Security Expansions #167

Open amarjandu opened 4 years ago

amarjandu commented 4 years ago

Excuse any typos; this MacBook keyboard is having phantom key issues. For questions, please reach out to @amarjandu or @chmreid.

Flexible Auth

The flexible auth service allows explicit management and configuration of the AuthN and AuthZ services used during dss-api calls. The dss/util/security.py file contains the verification logic for the JSON Web Token (JWT) that is used to authenticate users. The config/environment values used here are provided by the data-store-auth repo during deployment, but they can also be configured manually to work with other AuthN services.
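As a rough illustration of the kind of claim checks involved, here is a minimal sketch that validates the issuer, audience, and expiry of a token whose signature has already been verified. The function name check_claims and its parameters are hypothetical and are not taken from dss/util/security.py; the trusted issuers and audience stand in for the config/environment values mentioned above.

```python
import time
from typing import Any, Mapping, Optional, Sequence


def check_claims(claims: Mapping[str, Any],
                 trusted_issuers: Sequence[str],
                 audience: str,
                 now: Optional[float] = None) -> None:
    """Raise ValueError if signature-verified JWT claims are not acceptable.

    Hypothetical helper: signature verification is assumed to have
    happened before this function is called.
    """
    now = time.time() if now is None else now

    # The issuer must be one of the configured AuthN services.
    if claims.get("iss") not in trusted_issuers:
        raise ValueError("untrusted issuer")

    # "aud" may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if audience not in audiences:
        raise ValueError("wrong audience")

    # "exp" is a Unix timestamp; reject missing or past values.
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise ValueError("token expired")
```

Passing now explicitly keeps the check deterministic in tests; in normal use it defaults to the current time.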

For additional AuthZ checks on API endpoints, the method wrapper dss/util/security.assert_security allows the operator to specify additional endpoint security checks using the classes derived from dss/util/auth/Authorize.py.
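The wrapper pattern described above could look roughly like the sketch below. Everything here is illustrative: assert_security_sketch, the Authorize base class shape, GroupAuthorize, the "groups" claim name, and the get_bundle endpoint are all assumptions and do not reflect the actual signatures in the DSS code.

```python
import functools
from typing import Any, Callable, Iterable, Mapping


class Authorize:
    """Hypothetical base class; not the real dss/util/auth/Authorize.py."""

    def check(self, claims: Mapping[str, Any], **kwargs: Any) -> None:
        raise NotImplementedError


class GroupAuthorize(Authorize):
    """Hypothetical derived check: the token must carry an allowed group."""

    def __init__(self, allowed_groups: Iterable[str]) -> None:
        self.allowed_groups = set(allowed_groups)

    def check(self, claims: Mapping[str, Any], **kwargs: Any) -> None:
        # "groups" is an assumed claim name used only for this sketch.
        if not self.allowed_groups & set(claims.get("groups", [])):
            raise PermissionError("caller is not in an allowed group")


def assert_security_sketch(*checks: Authorize) -> Callable:
    """Illustrative wrapper: run every check before calling the endpoint."""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(claims: Mapping[str, Any], *args: Any, **kwargs: Any) -> Any:
            for check in checks:
                check.check(claims, **kwargs)
            return func(claims, *args, **kwargs)
        return wrapper
    return decorator


@assert_security_sketch(GroupAuthorize(["dbio"]))
def get_bundle(claims: Mapping[str, Any], uuid: str) -> dict:
    """Hypothetical endpoint handler used only to show the wrapper."""
    return {"uuid": uuid}
```

Keeping each check in its own class means an operator can stack several checks on one endpoint without changing the endpoint body.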

Current Status

CLI

Improvements were made to how the swagger interacts with the data-store-cli repo. In the past, fields in dbio/util had to be updated manually whenever they changed. The swagger page has been updated to provide a securityDefinition.0authSecurity field, which the CLI tool references when performing AuthN/AuthZ operations.

DSA

See the auth infra setup at data-store-auth (DSA). This repo is meant to streamline the process of setting up an AuthN/AuthZ data service, and to take some of the guesswork out of how the auth service interacts with the data-store.

This provides the tenant with rules to populate the JWT with the correct AuthN groups that can be used by the DSS. If a service account is being used, this information is populated within the CLI tool itself.

FLAC

Additional work was done around FLACs. In the past, Fusillade was used to provide additional access control services. In an effort to make the DSS easier to deploy, the FLAC lookups were internalized to a DynamoDB table, which can provide additional lookup information for a given UUID. Tooling for the FLAC table is available under scripts/dss-ops.py flac, so that an operator can perform CRUD operations.
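To make the lookup idea concrete, here is a minimal sketch that uses an in-memory dict as a stand-in for the DynamoDB table. The table layout (UUID mapped to a set of allowed groups), the function name flac_allows, and the rule that a missing entry means no extra restriction are all assumptions for illustration; the real table schema and semantics may differ.

```python
from typing import Dict, Iterable, Optional, Set

# Stand-in for the DynamoDB table; the key and attribute layout are assumptions.
FLAC_TABLE: Dict[str, Set[str]] = {
    "uuid-restricted": {"group-a", "group-b"},
}


def flac_allows(uuid: str,
                caller_groups: Iterable[str],
                table: Optional[Dict[str, Set[str]]] = None) -> bool:
    """Return True if the caller may access the object with this UUID."""
    table = FLAC_TABLE if table is None else table
    allowed = table.get(uuid)
    if allowed is None:
        # Assumption: an object without a FLAC entry has no extra restriction.
        return True
    # Access is granted when the caller shares at least one allowed group.
    return bool(allowed & set(caller_groups))
```

For example, flac_allows("uuid-restricted", ["group-c"]) is False under these assumptions, while any caller passes for a UUID that has no entry.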

Improvements

Proxy Service

When the DSA was being set up, the OIDC proxy for interactions between Data-Store -> OIDC/Auth0 was removed. At the time, authentication for the DSS could be completed without a proxy service. However, there are reasons to keep this proxy service. Currently, the easiest way to get a JWT to use with the API is through the data-store-cli tooling. When using the REST API directly, there is no callback URI hosted by the DSS to display the JWT back to the user, nor is there anything built into the auth0-application that provides this functionality. For this reason, the dbio dss login --remote functionality does not work: a callback URL needs to be set up that can echo a request sent to it. This also means that the only way to correctly get a signed JWT from the DSS is by using the CLI tooling.
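A callback endpoint that echoes the request sent to it can be very small. The following is a sketch using only the Python standard library; the handler name, the JSON response shape, and the local bind address are assumptions, and this is not something the DSS currently hosts. Note that it can only echo what reaches the server (path and query parameters), not a URL fragment.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


class EchoCallbackHandler(BaseHTTPRequestHandler):
    """Hypothetical callback handler that echoes the request back as JSON."""

    def do_GET(self) -> None:
        parsed = urlparse(self.path)
        body = json.dumps({
            "path": parsed.path,
            "params": parse_qs(parsed.query),
        }).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args) -> None:
        # Suppress default logging so query parameters, which may hold
        # secrets, do not end up in the server logs.
        pass


def make_server(port: int = 0) -> HTTPServer:
    """Bind to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), EchoCallbackHandler)


if __name__ == "__main__":
    server = make_server(8080)
    server.serve_forever()
```

A deployed version would need TLS and a registered callback URL in the auth tenant; both are left out of this sketch.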

JWT + Service Accounts + FLACS

One interaction we overlooked was the use of service accounts in the DSS and getting them to interact with the FLAC table. Unfortunately, there is no clean way to get service accounts registered as users in the Auth0-AuthZ backend. Historically, the DSS has used Google Service Accounts to authenticate machines (testing, ingestion services, analysis services, etc.). Within the CLI, the service-account group is added to the auth0-claim in the JWT to inform the DSS that a service account is being used. Google Service Accounts appear to be an artifact of an even older Google-based authentication system that was used in the data-store. It might be possible to use an Auth0 machine-to-machine application to get around this, but additional research should be done.

There is another implication, however: a service account JWT could be modified to include an additional group that is not actually attached to the account. For this reason, if Auth0 is kept as the AuthZ backend, there should be an additional check against the userinfo auth endpoint to verify users. (Service accounts are still not fully addressed even with this addition.)
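The additional verification could be sketched as a comparison between the groups claimed in the JWT and the groups reported by the userinfo endpoint. In the sketch below the userinfo response is passed in as a plain dict, so no network call is shown; the function names and the "groups" claim key are hypothetical.

```python
from typing import Any, Iterable, List, Mapping


def unverified_groups(claimed: Iterable[str],
                      authoritative: Iterable[str]) -> List[str]:
    """Return claimed groups that the authoritative source does not confirm."""
    return sorted(set(claimed) - set(authoritative))


def assert_groups_verified(claims: Mapping[str, Any],
                           userinfo: Mapping[str, Any],
                           group_claim: str = "groups") -> None:
    """Raise PermissionError if the JWT claims a group userinfo does not list.

    group_claim is an assumed claim name; the real claim key may differ.
    """
    extra = unverified_groups(claims.get(group_claim, []),
                              userinfo.get(group_claim, []))
    if extra:
        raise PermissionError("unverified groups in token: " + ", ".join(extra))
```

As noted above, this does not cover service accounts, since they have no corresponding user for the userinfo lookup to return.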

Fusillade

Fusillade provides both an OIDC proxy and a more hardened/flexible ACL system that can be used for DSS-AUTH. At the time there was little documentation on setting up a proxy system with the Auth Tenant, but with the DSA the flow of the internals is better documented. Configuring the Fus-OIDC-Proxy should now be more straightforward, and perhaps some of the DSA tenant configuration files should be merged into the FUS repo to make setup easier for an operator.

At this time, the FUS 2.0 API's interaction with dbio auth // hca auth is a little broken; the FUS-API would need fixes to make CRUD operations on resources within FUS cleaner for an operator.

FUS 2.0 also allows operators to create whatever users they want, so service accounts are not restricted and do not have to take alternative security evaluation paths.