ga4gh / ga4gh-schemas

Models and APIs for Genomic data. RETIRED 2018-01-24
http://ga4gh.org
Apache License 2.0
214 stars 114 forks

Auth #148

Closed skeenan closed 9 years ago

skeenan commented 9 years ago

Proposed topic for ASHG: So far we have been ignoring Auth. But we all know that in the real world a large fraction of. Are there parts of an auth model that should be standardised, or not, and how would we do that?

mlin commented 9 years ago

+1 for Auth discussion in San Diego. From where we stand, authentication and access control schemes seem to be prerequisites for truly useful applications of the GA4GH API. (Note authentication and access control as related but distinct topics.) That stated, a light pass just on authentication (e.g., clients pass a certain environment variable, if present, into an Authorization HTTP request header) might allow some forward progress without trying to solve all problems immediately.
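As a rough illustration of the "light pass" on authentication described above, a client could copy a credential from its environment into the `Authorization` header when one is set. This is only a sketch under stated assumptions: the variable name `GA4GH_AUTHORIZATION`, the `build_request` helper, and the example URL are hypothetical and not part of any GA4GH specification.

```python
import os
import urllib.request

# Hypothetical variable name; the thread does not specify one.
AUTH_ENV_VAR = "GA4GH_AUTHORIZATION"


def build_request(url, environ=None):
    """Build a request for url, adding an Authorization header if the variable is set."""
    environ = os.environ if environ is None else environ
    request = urllib.request.Request(url)
    credential = environ.get(AUTH_ENV_VAR)
    if credential:
        # Pass the value through verbatim (for example "Bearer <token>");
        # the client does not interpret the scheme.
        request.add_header("Authorization", credential)
    return request


if __name__ == "__main__":
    # Example URL is illustrative only; no network call is made here.
    req = build_request("https://example.org/reads/search")
    print(req.has_header("Authorization"))
```

Because the client forwards the value untouched, the server side stays free to choose its own scheme, which keeps this step independent of any access control model.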

richarddurbin commented 9 years ago

+1 for auth/access control

Getting a real working auth/access control into our early pilot code might be a big win. Current BAM/VCF htslib code has none of this integrated. An interesting use case for relatively low bandwidth API is for clinical genomics review of potential disease pathogenic variants, where the alignment data (read store/BAM) might be in a central store, and the clinical decision support review might want to pull up information about one or two very localised regions per patient. I think it would promote buy-in for a GA4GH API-based application to have a standard solution with GA4GH ethics group backing which supports good access control to relevant subsets of the data.
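To make the clinical review use case concrete, here is a minimal sketch of region-scoped access control: a query is clipped to the parts of the genome a user has been granted. The `Region` type, the `allowed_subregions` function, and the coordinate convention (0-based, half-open) are assumptions for illustration, not anything defined by the GA4GH schemas. The sketch also assumes a user's grants do not overlap one another.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    """A hypothetical genomic interval: 0-based start, exclusive end."""
    reference_name: str
    start: int
    end: int


def allowed_subregions(query: Region, grants: List[Region]) -> List[Region]:
    """Return the parts of query that fall inside the user's granted regions."""
    allowed = []
    for grant in grants:
        if grant.reference_name != query.reference_name:
            continue
        # Intersect the query with this grant.
        start = max(query.start, grant.start)
        end = min(query.end, grant.end)
        if start < end:
            allowed.append(Region(query.reference_name, start, end))
    return sorted(allowed)


if __name__ == "__main__":
    grants = [Region("chr17", 100, 200), Region("chr13", 0, 50)]
    print(allowed_subregions(Region("chr17", 150, 300), grants))
```

A server could then run the read search only over the returned intervals, so a reviewer sees the one or two localised regions they are entitled to and nothing else.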

On 23 Sep 2014, at 10:03, Mike Lin notifications@github.com wrote:

> +1 for Auth discussion in San Diego. From where we stand, authentication and access control schemes seem to be prerequisites for truly useful applications of the GA4GH API. (Note authentication and access control as related but distinct topics.) That stated, a light pass just on authentication (e.g., clients pass a certain environment variable, if present, into an Authorization HTTP request header) might allow some forward progress without trying to solve all problems immediately.


tetron commented 9 years ago

Authentication should probably be considered an implementation issue, not a schema one. There are many existing authentication frameworks (such as OpenID/OAuth on the web, and LDAP/ActiveDirectory in the enterprise) and different organizations will have different security requirements, so picking a single approach may actually be harmful to interoperability.

On the other hand, supporting fine-grained access control at the individual object level (as opposed to simply granting the ability to read the entire data store) needs to be based on the actual semantics of the data. For example, if a user can read a GAReadGroup, does that imply that they can read all the reads in that group? In POSIX, the ability to read a directory does not confer the ability to read all the files in that directory, since the permissions on each file are set independently. This often impedes sharing between users. By contrast, the approach taken by Arvados provides transitive permissions: if X can read directory Y and directory Y contains item Z, then X can read Z. In Richard Durbin's example, one may want to grant permission to read only certain regions, where the definition of a region for access control purposes is closely tied to the schema semantics. There is potential interaction with the reproducibility issue discussed in #142, since access control policies will (by design) cause different users to get back different responses, even if the underlying data has not changed.
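The difference between independent and transitive permissions can be sketched in a few lines. This is an illustrative model only, not Arvados's actual implementation or API: the `grants` set, the `parent_of` mapping, and both function names are hypothetical.

```python
def can_read_independent(user, item, grants):
    """Per-object model: a grant on a container says nothing about its contents."""
    return (user, item) in grants


def can_read_transitive(user, item, grants, parent_of):
    """Transitive model: a grant on any enclosing container also covers the item."""
    seen = set()
    current = item
    # Walk up the containment chain; `seen` guards against accidental cycles.
    while current is not None and current not in seen:
        if (user, current) in grants:
            return True
        seen.add(current)
        current = parent_of.get(current)
    return False


if __name__ == "__main__":
    # Hypothetical containment: a read inside a read group inside a dataset.
    parent_of = {"read-1": "readgroup-A", "readgroup-A": "dataset-X"}
    grants = {("alice", "readgroup-A")}
    print(can_read_independent("alice", "read-1", grants))            # False
    print(can_read_transitive("alice", "read-1", grants, parent_of))  # True
```

Under the transitive model, the GAReadGroup question above has a definite answer: a grant on the group covers every read in it, while a grant on the group does not extend upward to the enclosing dataset.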

max-biodatomics commented 9 years ago

> There are many existing authentication frameworks (such as OpenID/OAuth on the web, and LDAP/ActiveDirectory in the enterprise) and different organizations will have different security requirements, so picking a single approach may actually be harmful to interoperability.

Still, there are integration platforms that can connect to OpenID, LDAP, and others, and expose a single API externally.

cassiedoll commented 9 years ago

@delagoya is going to come up with a first proposal to resolve this issue.

jeromekelleher commented 9 years ago

After the discussion on the reference variation call yesterday, I've opened ga4gh/server#31 to discuss the practical requirements for OAuth support in the reference implementation. I would ask anyone interested (particularly and especially those on the Security working group) to please comment on ga4gh/server#31, so that we can gather the requirements for this implementation into a central location.

delagoya commented 9 years ago

Closing in favor of the OAuth discussions and implementation in ga4gh/server#31. Authentication will be resource-specific and not a part of the API at this point. If someone disagrees, please reopen the issue with a specific example as a new issue/PR.