Closed benjeffery closed 5 years ago
A strawman implementation:
Per table:
- authColumn: config directive that specifies the name of a column that will be used for auth, defaulting to empty, which means no auth.
- authPublicValue: config directive. If the value of authColumn for a given row is authPublicValue then that row is public. Defaults to the string 'public'.
Per dataset:
- authMapping: config directive. The keys are SSO group names or the string 'public', the values are lists of strings or the string 'all'. Defaults to empty object.
When a query request comes in, the user's SSO groups are looked up in authMapping. This is reduced to a set of unique strings, allowedRows, unless any of the entries is the string 'all', in which case allowedRows becomes the string 'all'. The query returns all rows where authColumn is in allowedRows or authColumn is 'public', or all rows if allowedRows is 'all'.
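A minimal Python sketch of the lookup and filtering described above may help make the proposal concrete. The function names, the in-memory row representation, and the sample data are assumptions for illustration only, not part of Panoptes; the handling of a 'public' key in authMapping is left out because the proposal does not spell it out.

```python
from typing import Dict, Iterable, List, Set, Union

ALL = "all"  # sentinel from the strawman: the user may see every row

AuthMapping = Dict[str, Union[List[str], str]]
AllowedRows = Union[Set[str], str]


def resolve_allowed_rows(user_groups: Iterable[str], auth_mapping: AuthMapping) -> AllowedRows:
    """Reduce the user's SSO groups to a set of authColumn values, or ALL."""
    allowed: Set[str] = set()
    for group in user_groups:
        entry = auth_mapping.get(group, [])
        values = [entry] if isinstance(entry, str) else list(entry)
        if ALL in values:
            return ALL  # any 'all' entry collapses the whole result
        allowed.update(values)
    return allowed


def filter_rows(rows: Iterable[dict], auth_column: str, allowed_rows: AllowedRows,
                auth_public_value: str = "public") -> List[dict]:
    """Return the rows visible to the user; an empty auth_column means no auth."""
    if not auth_column or allowed_rows == ALL:
        return list(rows)
    return [row for row in rows
            if row[auth_column] == auth_public_value or row[auth_column] in allowed_rows]


# Hypothetical data and group names, for illustration only.
rows = [
    {"sample": "A", "study_id": "study1"},
    {"sample": "B", "study_id": "study2"},
    {"sample": "C", "study_id": "public"},
]
mapping = {"group_a": ["study1"], "group_admin": "all"}

member = filter_rows(rows, "study_id", resolve_allowed_rows(["group_a"], mapping))
anonymous = filter_rows(rows, "study_id", resolve_allowed_rows([], mapping))
admin = filter_rows(rows, "study_id", resolve_allowed_rows(["group_admin"], mapping))
```

In this sketch a user with no matching groups ends up with an empty set, so only rows carrying the public value are returned.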
I gather no authColumn and hence no auth mean that all the data for that table will be publicly viewable (or otherwise not under any authorisation, although it might be behind authentication).
I also gather that authPublicValue is the mechanism by which an implementer can specify which rows in a particular table (I gather the directives are per-table directives) are not under authorisation (are public, or universally visible/accessible). In this way, we can override (or nullify) other auth restrictions.
So this is per-row granularity.
It would be interesting to know why this is more efficient or more practical than having separate tables with separate auth settings, rather than having mixed auth in the same table.
I gather that the scheme for the authMapping directive, on the dataset level, allows us to compose multiple SSO auth groups into one Panoptes auth group as a convenience / higher abstraction, instead of listing the SSO auth groups in the data, for example, which would be harder to modify, etc.
It's not clear to me yet 1) what an authMapping of the default empty object means, 2) what else is in the object apart from the SSO groups as keys, 3) what the all keyword means (I gather it means "all SSO groups", but the implications of that are not clear unless the meaning of the SSO groups is clear, e.g. i) are they mutually exclusive or do they overlap, ii) does it make sense to say "this row of data can be seen by a user with all the SSO group permissions" or "this row of data can be seen by a user with any of the SSO group permissions", and how that might contrast with the all keyword).
> I gather no authColumn and hence no auth mean that all the data for that table will be publicly viewable (or otherwise not under any authorisation, although it might be behind authentication).
Correct on both counts
> It would be interesting to know why this is more efficient or more practical than having separate tables with separate auth settings, rather than having mixed auth in the same table.
This is so that a user can download "all rows that I have access to" in one query.
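As a rough illustration of the "one query" point, the row filter could be expressed as a single parameterised WHERE clause. This is only a sketch: build_auth_clause is a hypothetical helper, the samples table is made up, and an in-memory SQLite database stands in for whatever backend is actually used.

```python
import sqlite3


def build_auth_clause(auth_column, allowed_rows, auth_public_value="public"):
    """Return (sql_fragment, params) restricting a query to visible rows."""
    if allowed_rows == "all":
        return "1 = 1", []  # no restriction at all
    values = sorted(set(allowed_rows) | {auth_public_value})
    placeholders = ", ".join("?" for _ in values)
    # The column name is assumed to come from trusted config, not user input;
    # the values themselves are bound as parameters.
    return f'"{auth_column}" IN ({placeholders})', values


# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (sample TEXT, study_id TEXT)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [("A", "study1"), ("B", "study2"), ("C", "public")],
)

clause, params = build_auth_clause("study_id", {"study1"})
visible = [row[0] for row in conn.execute(
    f"SELECT sample FROM samples WHERE {clause} ORDER BY sample", params)]

clause_all, params_all = build_auth_clause("study_id", "all")
everything = [row[0] for row in conn.execute(
    f"SELECT sample FROM samples WHERE {clause_all} ORDER BY sample", params_all)]
```

With mixed auth in one table, the user gets everything they are entitled to from a single SELECT rather than one query per separately protected table.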
It would be helpful to my understanding (perhaps ours) to unpack that last paragraph a bit. (I'm finding it difficult to read and follow.)
- When a query request comes in, the user's SSO groups are looked up in authMapping. I read that as: We have a query to process, so we need to know which SSO groups the user belongs to, and we need to compare those with the auth config of the dataset in question, to work out which data to include in the query (more importantly: which data to exclude).
- This is reduced to a set of unique strings allowedRows, unless any of the entries is the string 'all', in which case allowedRows becomes the string 'all'. ... still unpacking
- The query returns all rows where authColumn is in allowedRows, or authColumn is 'public' or all rows if allowedRows is 'all'. ... still unpacking
> I gather that the scheme for the authMapping directive, on the dataset level, allows us to compose multiple SSO auth groups into one Panoptes auth group as a convenience / higher abstraction, instead of listing the SSO auth groups in the data, for example, which would be harder to modify, etc.
Exactly - my expectation here is that we'll end up using the study_id column; this will let us map SSO groups to study ids.
> It would be helpful to my understanding (perhaps ours) to unpack that last paragraph a bit. (I'm finding it difficult to read and follow.)
> - When a query request comes in, the user's SSO groups are looked up in authMapping. I read that as: We have a query to process, so we need to know which SSO groups the user belongs to, and we need to compare those with the auth config of the dataset in question, to work out which data to include in the query (more importantly: which data to exclude).
> - This is reduced to a set of unique strings allowedRows, unless any of the entries is the string 'all', in which case allowedRows becomes the string 'all'. ... still unpacking
> - The query returns all rows where authColumn is in allowedRows, or authColumn is 'public' or all rows if allowedRows is 'all'. ... still unpacking
Now that I have slept and read it I find it really unclear! Here's an example:
authMapping:
SSO_badgers: study1, study3, study4
SSO_birds: study5, study1
SSO_lions: all
So a user who is just a bird gets study5, study1 and public rows. A user who has both badger and bird gets studies 1, 3, 4, 5 and public. Lions get all.
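The example above can be checked with a few lines of Python. This is a sketch under assumptions: visible_values is a hypothetical helper, the mapping is transcribed from the example as a dict, and the public value is taken to be the default string 'public'.

```python
# The example mapping, written as a Python dict.
AUTH_MAPPING = {
    "SSO_badgers": ["study1", "study3", "study4"],
    "SSO_birds": ["study5", "study1"],
    "SSO_lions": "all",
}


def visible_values(user_groups, auth_mapping=AUTH_MAPPING, public_value="public"):
    """Return the authColumn values a user may see, or 'all'."""
    # Collect the mapping entry for every group the user belongs to.
    entries = [auth_mapping[group] for group in user_groups if group in auth_mapping]
    if "all" in entries:
        return "all"
    # Union of all matching lists, plus the public value every user gets.
    return {public_value}.union(*entries)


bird = visible_values(["SSO_birds"])
badger_and_bird = visible_values(["SSO_badgers", "SSO_birds"])
lion = visible_values(["SSO_lions"])
```

Here study1 appears in both the badger and bird lists but only once in the combined result, which is why the proposal speaks of a set of unique strings.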
Thanks.
It's still not clear to me what those values in the lists of strings in the authMapping object actually are: study1, study2, etc. Are they values in one / any authColumn in one / any table?
Or what allowedRows is, and where it fits into this.
> Thanks. It's still not clear to me what those values in the lists of strings in the authMapping object actually are: study1, study2, etc. Are they values in one / any authColumn in one / any table?
Values in any authColumn in any table. Could have it per-table, but I can't think of a clear use case there.
> Or what allowedRows is, and where it fits into this.
Just a local variable that contains the unique strings to look for in authColumn. I agree that calling it something probably confused the explanation!
Thanks. I think I see it all now.
So if you were more verbose, you might call it something like listOfAuthColumnValuesPresentOnRowsToReturnToThisUser (not suggesting that!)
And to simplify the cases where that list is equivalent to all / any authColumnValue (all / any rows), then we would expect the list to contain one item, the keyword all, which is in contrast to an empty object, which might imply that no rows will be returned to the user.
And {'all', 'study1'} might be an anomaly.
The last nuance I'm pondering is when an authColumn contains the value public. I suppose one way of looking at the value 'public' (actually the value of the config const authPublicValue, which might confusingly be set to all, just to test things!) is that it renders the allowedRows content effectively irrelevant, for that particular row.
To be precise it is listOfAuthColumnValuesPresentOnRowsToReturnToThisSSOGroup
['all', 'study1'] would return rows where authColumn had the value all and study1
For the last point you can imagine that authPublicValue gets added to allowedRows for every query.
I thought you covered the anomaly of {'all', 'study1'} by simply equating it to {'all'} if/when there is one or more 'all'.
Your first clarification about returning to this SSO group has confused me a bit... The user might belong to more than one SSO group, and we only care about returning things to the user. Are you saying that there is a list of allowedRows per SSO group?
You add every entry in authMapping that has a key that matches any SSO group the user belongs to. The user can have many.
I see your point about adding authPublicValue to allowedRows. I'm not sure if you mean that's the actual mechanism, or if that's effectively (or equivalent to) what would happen, but I reckon I see the point at least.
I need to expand and clarify your last comment a bit to make sense of it and check my / our understanding.
authMapping is an object that maps SSO groups to lists of authColumn values
Yep! Sounds good.
For a table where this is configured, a user who is not logged in can only see (plot/download) rows that are marked as public. A logged-in user can see all rows that are marked with a group they have access to, along with those that are public.