Open saulshanabrook opened 5 years ago
Thanks Saul for starting this. Other points:
Cheers, Brian
On Fri, Aug 23, 2019 at 9:45 AM Saul Shanabrook notifications@github.com wrote:
For the JupyterLab commenting work we need a way to identify who is commenting, to show their name and photo ( jupyterlab/jupyterlab-commenting#22 https://github.com/jupyterlab/jupyterlab-commenting/issues/22, jupyterlab/jupyterlab-commenting#35 https://github.com/jupyterlab/jupyterlab-commenting/issues/35).
We were thinking about using JupyterHub to let us know who is active, but we don't want commenting to depend directly on that.
So I propose that we create a repo jupyterlab-identity to expose a global identity API in JupyterLab.
Design Notes
We have also been working on a metadata service for JupyterLab https://github.com/jupyterlab/jupyterlab-metadata-service, so we thought we could have the identity API only take care of giving us a unique ID for who you are, then look up information about you, like your name and photo, with the metadata provider. Here is a sample class we could expose that does this, using properties of the Schema.org Person type https://schema.org/Person:
import {LinkedDataRegistry} from '@jupyterlab/jupyterlab-metadata';

class Identity {
  constructor(private linkedDataRegistry: LinkedDataRegistry) {}

  /**
   * The current user ID.
   */
  public id: URL | null = null;

  /**
   * Get metadata about a person, retrieved from the metadata registry.
   */
  async getPerson(id: URL): Promise<{name?: string, image?: URL}> {
    const person = this.linkedDataRegistry.get(id);
    const name = person['http://schema.org/name'];
    const image = person['http://schema.org/image'];
    return {
      name: name || undefined,
      image: image ? new URL(image) : undefined,
    };
  }
}
To create a plugin with JupyterHub, we could have it set the id to something like jupyterhub:///saul when it starts up, and register metadata about the different IDs in the metadata store, if we can fetch information about them.
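A rough sketch of what such a JupyterHub-backed plugin might do at startup. Everything here is hypothetical: InMemoryRegistry stands in for the metadata registry, activateHubIdentity and the hubUser shape are made up for illustration, and IDs are plain strings rather than URL objects to keep the example self-contained.

```typescript
// Hypothetical sketch: none of these names come from a published package.

/** Subset of Schema.org Person properties, keyed by full property URL. */
type PersonRecord = { [property: string]: string | undefined };

/** Stand-in for the metadata registry: maps an ID to linked-data properties. */
class InMemoryRegistry {
  private records: { [id: string]: PersonRecord } = {};

  set(id: string, record: PersonRecord): void {
    this.records[id] = record;
  }

  get(id: string): PersonRecord | undefined {
    // Guard against inherited keys such as "toString".
    return Object.prototype.hasOwnProperty.call(this.records, id)
      ? this.records[id]
      : undefined;
  }
}

class Identity {
  /** The current user ID, or null until some provider sets it. */
  id: string | null = null;

  constructor(private registry: InMemoryRegistry) {}

  /** Look up name and image for an ID; unknown IDs yield empty fields. */
  getPerson(id: string): { name?: string; image?: string } {
    const person: PersonRecord = this.registry.get(id) ?? {};
    return {
      name: person['http://schema.org/name'],
      image: person['http://schema.org/image'],
    };
  }
}

/** What a Hub-backed plugin might run on activation (assumed user shape). */
function activateHubIdentity(
  identity: Identity,
  registry: InMemoryRegistry,
  hubUser: { name: string; displayName?: string; avatarUrl?: string }
): void {
  const id = `jupyterhub:///${encodeURIComponent(hubUser.name)}`;
  registry.set(id, {
    'http://schema.org/name': hubUser.displayName ?? hubUser.name,
    'http://schema.org/image': hubUser.avatarUrl,
  });
  identity.id = id;
}

const registry = new InMemoryRegistry();
const identity = new Identity(registry);
activateHubIdentity(identity, registry, {
  name: 'saul',
  displayName: 'Saul Shanabrook',
});
```

The commenting extension would then only read identity.id and call getPerson, without knowing that JupyterHub was the source.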
cc @ellisonbg https://github.com/ellisonbg @Zsailer https://github.com/Zsailer @ktaletsk https://github.com/ktaletsk @hoo761 https://github.com/hoo761
-- Brian E. Granger
Principal Technical Program Manager, AWS AI Platform (brgrange@amazon.com) On Leave - Professor of Physics and Data Science, Cal Poly @ellisonbg on GitHub
Very cool, thanks for taking this on!
This will definitely be a useful addition to JupyterLab. One thought I have is that you might want to look at identity standards in addition to or instead of schema.org for inspiration. schema.org is great for data modeling purposes, but perhaps not necessarily for identity and authentication purposes (for example, it's unlikely that it would ever make sense to include "netWorth", but it might make sense to include an OpenID subscriber ID in this API).
Two prevailing identity schemas are OpenID Connect (OIDC) ID Token and SAML Response Assertion.
Hey @mckev-amazon thank you for the suggestions!
We took a look at OpenID Connect, but I think this is actually a slightly different scope. I could see an OpenID Connect extension in JupyterLab that depends on this plugin and sets the identity of the user based on how they authenticate. So the hope here is to intentionally punt on how JupyterLab gets to know about who is using it. Instead, we just focus on being able to share this identity, once it has been established by some other plugin, with other parts of the application that want to know who is using JupyterLab.
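To make that split concrete, here is a minimal sketch of an auth-agnostic identity holder. SharedIdentity, setId and onChange are hypothetical names, not an existing JupyterLab API: one plugin (Hub, OIDC, anything else) calls setId, and consumers such as commenting only read id or subscribe to changes.

```typescript
// Hypothetical sketch: a shared identity holder that knows nothing about auth.
type IdentityListener = (id: string | null) => void;

class SharedIdentity {
  private current: string | null = null;
  private listeners: IdentityListener[] = [];

  /** Current user ID, or null if no provider has established one yet. */
  get id(): string | null {
    return this.current;
  }

  /** Called by whichever plugin establishes the identity. */
  setId(id: string | null): void {
    if (id === this.current) {
      return; // nothing changed, so do not notify
    }
    this.current = id;
    this.listeners.forEach(listener => listener(id));
  }

  /** Called by consumers; returns a function that unsubscribes. */
  onChange(listener: IdentityListener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter(l => l !== listener);
    };
  }
}

const shared = new SharedIdentity();
const seen: Array<string | null> = [];
const unsubscribe = shared.onChange(id => seen.push(id));

shared.setId('jupyterhub:///saul'); // provider side: identity established
shared.setId('jupyterhub:///saul'); // same value again: listeners not called
unsubscribe();
shared.setId(null); // after unsubscribing, this consumer is not notified
```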
Hey @saulshanabrook !
Yes, completely agreed, I would definitely try to keep this API as neutral as possible from the type of authentication or identity that is being used. Simply suggesting that we look at these standards to determine what the lowest common denominator should be for this API, given that it's a generic one. After all, at some point the internal JupyterLab API might need to be traced back to one of the external identities. The open standards might also give some hints on what to include which might not be immediately apparent (Full Name, email, and avatar all make sense to me from a UI standpoint, for example). No worries if that investigation doesn't turn up much :-)
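As a rough illustration of that lowest common denominator, the sketch below maps a few OIDC standard claims onto a small person shape. The claim names (iss, sub, name, picture, email) are OIDC standard claims; PersonInfo, personFromOidc and the ID format are assumptions made up for this example.

```typescript
// Hypothetical sketch: only the claim names are taken from OIDC.
interface OidcClaims {
  iss: string; // issuer identifier
  sub: string; // subject identifier, unique within the issuer
  name?: string;
  picture?: string; // URL of the profile picture
  email?: string;
}

/** Assumed minimal shape a UI would need: full name, avatar, email. */
interface PersonInfo {
  id: string;
  name?: string;
  image?: string;
  email?: string;
}

function personFromOidc(claims: OidcClaims): PersonInfo {
  // `sub` is only unique per issuer, so combine both to get a global ID.
  const issuer = claims.iss.replace(/\/+$/, '');
  const id = `${issuer}#${encodeURIComponent(claims.sub)}`;
  return {
    id,
    name: claims.name,
    image: claims.picture,
    email: claims.email,
  };
}

const oidcPerson = personFromOidc({
  iss: 'https://idp.example.org/',
  sub: '248289761001',
  name: 'Jane Doe',
});
```

Keeping the issuer in the ID is one way the internal identity could later be traced back to the external one.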
That's a good point! Yeah thanks for these references. 👍
After all, at some point the internal JupyterLab API might need to be traced back to one of the external identities.
If you have any use cases/ideas around this, I would love to hear them. I am not that familiar with these identity schemas.
Hi @saulshanabrook , this is great!
This would be useful for jupyterlab/pull-requests and jupyterlab-git to represent the identity of the person from the perspective of the VCS provider (GitHub, Bitbucket, GitLab, ...).
A valid goal to further adoption would be for the Person interface to be extensible, so that additional fields can be added to the base types. This could also be accomplished with generics <T extends Person>.
Something like:
class Person {
    private String id;
    private String name;
}
class CompanyPerson extends Person {
    private String jobTitle;
    private String division;
}
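In TypeScript, which is what the proposed Identity class is written in, the same idea might look like the sketch below. PersonDirectory is a hypothetical name, and the fields are only illustrative.

```typescript
// Hypothetical sketch of an extensible Person type using generics.
interface Person {
  id: string;
  name?: string;
  image?: string;
}

/** A deployment-specific extension of the base type. */
interface CompanyPerson extends Person {
  jobTitle?: string;
  division?: string;
}

/** Stores people of any type that at least satisfies Person. */
class PersonDirectory<T extends Person = Person> {
  private people: { [id: string]: T } = {};

  register(person: T): void {
    this.people[person.id] = person;
  }

  get(id: string): T | undefined {
    // Guard against inherited keys such as "toString".
    return Object.prototype.hasOwnProperty.call(this.people, id)
      ? this.people[id]
      : undefined;
  }
}

const directory = new PersonDirectory<CompanyPerson>();
directory.register({
  id: 'jupyterhub:///saul',
  name: 'Saul',
  division: 'Research',
});
const found = directory.get('jupyterhub:///saul');
```

Code that only knows about Person can still consume a PersonDirectory<CompanyPerson> entry, since the extra fields are additive.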
From the UI perspective, the JupyterLab status bar could be utilized to display the currently determined identity of the user. Prior art here is VSCode's status bar which displays the GitHub user https://github.com/microsoft/vscode-pull-request-github/blob/master/.readme/demo.gif
The reason we are thinking about adopting the schema.org person object is that it is flexible enough to handle all the complexities of people, so already has all of those things:
(note, this also inherits from Thing, which has a base set of attributes)
To address @mckev-amazon's comments on OIDC and SAML: our expectation is that people deploying Jupyter will use a range of different auth/identity providers, and certainly many of them will be using OIDC/SAML in a manner that would work for identity. At the same time, we know of other deployments that are using more novel systems for identity. For example, LSST is using GitHub orgs/teams as their directory service and identity provider (and even maps GitHub teams to local POSIX ACLs). Also, there is a range of different deployment targets, from JupyterHub to standalone single servers. I think the approach we are thinking of here will let all of those use cases and providers get identity information into lab in a flexible way. That being said, I think it is reasonable for JupyterHub to standardize on OIDC or SAML from a protocol perspective to get this information, but those would be implementations of the more abstract interface.
What about IAM API instead of Identity API?
With Identity and Access Management we cover more, and Access (a.k.a. authorization based on e.g. roles) will come into the picture very soon. Access is highly coupled to Identity, so I believe it makes sense to look at them at the same time and in the same place.
The difficulty with this is that Jupyter deployments are so diverse on the auth side of things that I think standardizing will be impossible. Also, JupyterLab doesn't need to know the details of how someone is authorized; it only needs to know who they are once they arrive. I do agree that access and identity are coupled, but the goal here is to surface only the piece (identity) that JupyterLab needs to know about. In practice, it will be auth systems that pass the identity information to JupyterLab.
@ellisonbg If needed, Authorization can be added after in the same repo or in a separate one. BTW a Pluggable user token creation/validation for jupyter_server could use this Identity API.