jupyterlab / frontends-team-compass

A repository for team interaction, syncing, and handling meeting notes across the JupyterLab ecosystem.
https://jupyterlab-team-compass.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Identity API #11

Open saulshanabrook opened 5 years ago

saulshanabrook commented 5 years ago

For the JupyterLab commenting work we need a way to identify who is commenting, to show their name and photo (https://github.com/jupyterlab/jupyterlab-commenting/issues/22, https://github.com/jupyterlab/jupyterlab-commenting/issues/35).

We were thinking about using JupyterHub to let us know who is active, but we don't want commenting to depend directly on that.

So I propose that we create a repo jupyterlab-identity to expose a global identity API in JupyterLab.

Design Notes

We have also been working on a metadata service for JupyterLab (https://github.com/jupyterlab/jupyterlab-metadata-service), so we thought the identity API could be responsible only for giving us a unique ID for who you are. Information about you, like your name and photo, would then be looked up through the metadata provider. Here is a sample class we could expose that does this, using properties of the Schema.org Person type (https://schema.org/Person):

import {LinkedDataRegistry} from '@jupyterlab/jupyterlab-metadata';

class Identity {
    constructor(private linkedDataRegistry: LinkedDataRegistry) {}

    /**
     * The current user ID, or null if none has been set.
     */
    public id: URL | null = null;

    /**
     * Get metadata about a person, retrieved from the metadata registry.
     * Properties missing from the registry entry are returned as undefined.
     */
    async getPerson(id: URL): Promise<{name?: string, image?: URL}> {
        const person = this.linkedDataRegistry.get(id);
        const name = person['http://schema.org/name'];
        const image = person['http://schema.org/image'];
        return {
            name: name || undefined,
            image: image ? new URL(image) : undefined,
        };
    }
}

To create a plugin for JupyterHub, we could have it set the id to something like jupyterhub:///saul when it starts up, and register metadata about the different IDs in the metadata store, if we can fetch information about them.
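As a rough illustration of the JupyterHub plugin idea, the sketch below sets the ID at startup and registers whatever is known about the user. Everything in it is an assumption made for illustration: `InMemoryRegistry` stands in for the real `LinkedDataRegistry`, whose API is not specified in this thread, and `HubUser`, `hubUserId`, and `activateHubIdentity` are hypothetical names.

```typescript
// Hypothetical stand-in for LinkedDataRegistry: maps an ID to linked-data properties.
type LinkedData = Record<string, string>;

class InMemoryRegistry {
  private entries: Record<string, LinkedData> = {};

  set(id: URL, data: LinkedData): void {
    this.entries[id.href] = data;
  }

  get(id: URL): LinkedData {
    // Unknown IDs yield an empty record rather than undefined.
    return this.entries[id.href] ?? {};
  }
}

class Identity {
  public id: URL | null = null;
  constructor(public readonly registry: InMemoryRegistry) {}
}

// Hypothetical shape of what a hub might tell us about a user.
interface HubUser {
  username: string;
  displayName?: string;
  avatarUrl?: string;
}

// Build an ID of the form jupyterhub:///<username>.
function hubUserId(username: string): URL {
  return new URL(`jupyterhub:///${encodeURIComponent(username)}`);
}

// Register the metadata we have, then publish the ID as the current user.
function activateHubIdentity(identity: Identity, user: HubUser): void {
  const id = hubUserId(user.username);
  const data: LinkedData = {};
  if (user.displayName) data['http://schema.org/name'] = user.displayName;
  if (user.avatarUrl) data['http://schema.org/image'] = user.avatarUrl;
  identity.registry.set(id, data);
  identity.id = id;
}
```

Registering the metadata before setting the ID means a consumer that reacts to the ID can already look up the name and image.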

cc @ellisonbg @Zsailer @ktaletsk @hoo761

ellisonbg commented 5 years ago

Thanks Saul for starting this. Other points:

Cheers, Brian

On Fri, Aug 23, 2019 at 9:45 AM Saul Shanabrook notifications@github.com wrote:

[quoted text: the original post above]

--
Brian E. Granger
Principal Technical Program Manager, AWS AI Platform (brgrange@amazon.com)
On Leave - Professor of Physics and Data Science, Cal Poly
@ellisonbg on GitHub

mckev-amazon commented 5 years ago

Very cool, thanks for taking this on!

This will definitely be a useful addition to JupyterLab. One thought I have is that you might want to look at identity standards in addition to, or instead of, schema.org for inspiration. schema.org is great for data modeling purposes, but not necessarily for identity and authentication purposes (for example, it's unlikely that it would ever make sense to include "netWorth", but it might make sense to include an OpenID Connect subject identifier in this API).

Two prevailing identity schemas are the OpenID Connect (OIDC) ID Token and the SAML Response Assertion.

saulshanabrook commented 5 years ago

Hey @mckev-amazon thank you for the suggestions!

We took a look at OpenID Connect, but I think it has a slightly different scope. I could see an OpenID Connect extension in JupyterLab that depends on this plugin and sets the identity of the user based on how they authenticate. So the hope here is to intentionally punt on how JupyterLab gets to know who is using it. Instead, we just focus on sharing this identity, once it has been established by some other plugin, with other parts of the application that want to know who is using JupyterLab.
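One way to picture this split: an authentication plugin writes the identity once it is known, and any other extension reads it or subscribes to changes. The sketch below assumes a hypothetical `IdentityStore` class; none of these names come from an existing JupyterLab API.

```typescript
type IdentityListener = (id: URL | null) => void;

// Hypothetical shared store: one plugin sets the identity, others observe it.
class IdentityStore {
  private current: URL | null = null;
  private listeners: IdentityListener[] = [];

  /** The identity most recently set, or null if nobody has set one. */
  get id(): URL | null {
    return this.current;
  }

  /** Called by whichever plugin established the identity (OIDC, JupyterHub, ...). */
  setId(id: URL | null): void {
    this.current = id;
    this.listeners.forEach(listener => listener(id));
  }

  /** Subscribe to identity changes; returns a function that unsubscribes. */
  onChange(listener: IdentityListener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter(l => l !== listener);
    };
  }
}
```

A commenting extension, for example, would only call `onChange` and read `id`; it would never need to know which plugin called `setId`.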

mckev-amazon commented 5 years ago

Hey @saulshanabrook !

Yes, completely agreed. I would definitely try to keep this API as neutral as possible with respect to the type of authentication or identity being used. I am simply suggesting that we look at these standards to determine what the lowest common denominator should be for this API, given that it's a generic one. After all, at some point the internal JupyterLab identity might need to be traced back to one of the external identities. The open standards might also give some hints on what to include that might not be immediately apparent (full name, email, and avatar all make sense to me from a UI standpoint, for example). No worries if that investigation doesn't turn up much :-)
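To make the "lowest common denominator" idea concrete, the sketch below maps the standard OIDC claims `iss`, `sub`, `name`, `email`, and `picture` onto Schema.org properties. The `OidcClaims` type and the `personFromOidc` function are hypothetical names, and joining `iss` and `sub` into a single identifier string is an assumption made for illustration, not something decided in this thread.

```typescript
// Subset of the standard OIDC ID Token / UserInfo claims.
interface OidcClaims {
  iss: string; // issuer
  sub: string; // subject identifier, unique within the issuer
  name?: string;
  email?: string;
  picture?: string;
}

// Map OIDC claims to Schema.org property URLs (hypothetical helper).
function personFromOidc(claims: OidcClaims): Record<string, string> {
  const person: Record<string, string> = {
    // Assumption: issuer and subject together identify the user.
    'http://schema.org/identifier': `${claims.iss}|${claims.sub}`,
  };
  if (claims.name) person['http://schema.org/name'] = claims.name;
  if (claims.email) person['http://schema.org/email'] = claims.email;
  if (claims.picture) person['http://schema.org/image'] = claims.picture;
  return person;
}
```

Keeping the issuer alongside the subject is what would let an internal identity be traced back to the external one later.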

saulshanabrook commented 5 years ago

That's a good point! Yeah thanks for these references. 👍

> After all, at some point the internal JupyterLab API might need to be traced back to one of the external identities.

If you have any use cases/ideas around this, I would love to hear them. I am not that familiar with these identity schemas.

jaipreet-s commented 5 years ago

Hi @saulshanabrook, this is great! This would be useful for jupyterlab/pull-requests and jupyterlab-git to represent the identity of the person from the perspective of the VCS provider (GitHub, BitBucket, GitLab, ...).

To further adoption, it would be good for the Person interface to be extensible, so that additional fields can be added to the base type. This could also be accomplished with generics: <T extends Person>

Something like:


class Person {
  private String id;
  private String name;
}

class CompanyPerson extends Person {
  private String jobTitle;
  private String division;
}
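In TypeScript, the same idea combined with the generics approach might look like the sketch below. The `Person`, `CompanyPerson`, and `PersonDirectory` names are hypothetical and only illustrate the `<T extends Person>` constraint.

```typescript
// Base type: the minimum every deployment can provide.
interface Person {
  id: string;
  name?: string;
}

// A deployment-specific extension with extra fields.
interface CompanyPerson extends Person {
  jobTitle: string;
  division: string;
}

// Generic over any type that extends Person; defaults to the base type.
class PersonDirectory<T extends Person = Person> {
  private people: Record<string, T> = {};

  register(person: T): void {
    this.people[person.id] = person;
  }

  lookup(id: string): T | undefined {
    return this.people[id];
  }
}
```

Code written against `PersonDirectory<Person>` sees only the base fields, while a company deployment can use `PersonDirectory<CompanyPerson>` and keep its extra fields typed.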

From the UI perspective, the JupyterLab status bar could be used to display the currently determined identity of the user. Prior art here is VSCode's status bar, which displays the GitHub user: https://github.com/microsoft/vscode-pull-request-github/blob/master/.readme/demo.gif

ellisonbg commented 5 years ago

The reason we are thinking about adopting the schema.org person object is that it is flexible enough to handle all the complexities of people, so already has all of those things:

https://schema.org/Person

(note, this also inherits from Thing, which has a base set of attributes)
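For reference, a Person record using a few Schema.org properties could look like the sketch below. `identifier`, `name`, and `image` are inherited from Thing, while `email` and `jobTitle` are defined on Person. The `SchemaPerson` type, the sample values, and the `displayName` helper are made up for illustration.

```typescript
// A small, hand-picked subset of Schema.org Person (and inherited Thing) properties.
interface SchemaPerson {
  '@context'?: string;
  '@type': 'Person';
  identifier?: string; // from Thing
  name?: string;       // from Thing
  image?: string;      // from Thing
  email?: string;      // from Person
  jobTitle?: string;   // from Person
}

// Sample JSON-LD style record; all values are invented.
const examplePerson: SchemaPerson = {
  '@context': 'https://schema.org',
  '@type': 'Person',
  identifier: 'jupyterhub:///jane',
  name: 'Jane Doe',
  image: 'https://example.org/jane.png',
  jobTitle: 'Data Scientist',
};

// Hypothetical helper: choose something to show in the UI.
function displayName(person: SchemaPerson): string {
  return person.name ?? person.email ?? 'Anonymous';
}
```

Because every property is optional, a UI consuming such a record needs a fallback like the one in `displayName`.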

ellisonbg commented 5 years ago

To address @mckev-amazon's comments on OIDC and SAML: our expectation is that people deploying Jupyter will use a range of different auth/identity providers, and certainly many of them will be using OIDC/SAML in a manner that would work for identity. At the same time, we know of other deployments that are using more novel systems for identity. For example, LSST is using GitHub orgs/teams as their directory service and identity provider (and even maps GitHub teams to local POSIX ACLs). Also, there is a range of different deployment targets, from JupyterHub to standalone single servers. I think the approach we are considering here will support all of those use cases and let providers get identity information into JupyterLab in a flexible way. That being said, I think it is reasonable for JupyterHub to standardize on OIDC or SAML from a protocol perspective to get this information, but those would be implementations of the more abstract interface.

echarles commented 5 years ago

What about IAM API instead of Identity API?

With Identity and Access Management we cover more, and Access (a.k.a. Authorization, based on e.g. Roles) will come into the picture very soon. Access is highly coupled to Identity, so I believe it makes sense to look at them at the same time and in the same place.

ellisonbg commented 5 years ago

The difficulty with this is that Jupyter deployments are so diverse on the auth side of things that I think standardizing will be impossible. Also, JupyterLab doesn't need to know the details of how someone is authorized; it only needs to know who they are once they arrive. I do agree that access and identity are coupled, but the goal here is to surface only the piece (identity) that JupyterLab needs to know about. In practice, it will be auth systems that pass the identity information to JupyterLab.

echarles commented 5 years ago

@ellisonbg If needed, Authorization can be added later, in the same repo or in a separate one. BTW, pluggable user token creation/validation for jupyter_server could use this Identity API.