nebari-dev / nebari

🪴 Nebari - your open source data science platform
https://nebari.dev
BSD 3-Clause "New" or "Revised" License

[ENH] - Filter all conda environments with given user permissions #2193

Open aktech opened 9 months ago

aktech commented 9 months ago

Feature description

Currently we only filter the ones in the user's namespace.

PR https://github.com/nebari-dev/nebari/pull/2187 implements filtering by namespace only.

Value and/or benefit

It would be nice to be able to use the shared environments, and not just the ones in the shared namespace.

Anything else?

No response

nkaretnikov commented 9 months ago

@marcelovilla asked offline whether there's an API to know what environments a particular user has access to.

It can be done for the currently logged-in user via a GET request to http://localhost:8080/conda-store/api/v1/permission/.

Example output for the default DB state:

{
    "status": "ok",
    "data": {
        "authenticated": true,
        "primary_namespace": "test",
        "entity_permissions": {
            "default/*": [
                "environment::read",
                "namespace-role-mapping::read",
                "namespace::read"
            ],
            "filesystem/*": [
                "environment::read",
                "namespace-role-mapping::read",
                "namespace::read"
            ],
            "*/*": [
                "build::cancel",
                "build::delete",
                "environment::delete",
                "environment::read",
                "environment::solve",
                "environment::update",
                "environment:create",
                "namespace-role-mapping::create",
                "namespace-role-mapping::delete",
                "namespace-role-mapping::read",
                "namespace-role-mapping::update",
                "namespace::create",
                "namespace::delete",
                "namespace::read",
                "namespace::update",
                "setting::read",
                "setting::update"
            ]
        },
        "entity_roles": {
            "default/*": [
                "viewer"
            ],
            "filesystem/*": [
                "viewer"
            ],
            "*/*": [
                "admin"
            ]
        },
        "expiration": "2024-01-15T17:38:04+00:00"
    },
    "message": null
}
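
A minimal sketch of how that payload could be reduced to the namespace/environment patterns the current user can read. The helper name `readable_patterns` is hypothetical; only the `entity_permissions` layout and the `environment::read` permission string come from the output above.

```python
def readable_patterns(permission_payload: dict) -> list[str]:
    """Return the "namespace/name" patterns that grant environment::read."""
    entity_permissions = permission_payload["data"]["entity_permissions"]
    return sorted(
        pattern
        for pattern, permissions in entity_permissions.items()
        if "environment::read" in permissions
    )


# Trimmed copy of the response shown above.
payload = {
    "status": "ok",
    "data": {
        "entity_permissions": {
            "default/*": ["environment::read", "namespace::read"],
            "filesystem/*": ["environment::read", "namespace::read"],
            "*/*": ["environment::read", "setting::read"],
        }
    },
}

print(readable_patterns(payload))
```

The patterns are only collected here; how they are matched against concrete environments is a separate step.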

If it needs to be done for an arbitrary user (given a name), there's code that does this on the backend side, but it's not exposed via the HTTP API.

More details: there are two versions of role mappings, which you select at startup time via the config by setting role_mappings_version. Depending on which version is used, one of these two handlers is called during auth, but there's no direct way to call these via the HTTP API.

    _role_mappings_versions = {
        1: _database_role_bindings_v1,
        2: _database_role_bindings_v2,
    }

marcelovilla commented 9 months ago

@nkaretnikov thank you for your suggestions.

For this issue, we do need to get the environments for an arbitrary user given their name, as we'll use this information when spawning a JupyterHub server.

It is still unclear to me how the conda-store/api/v1/permission/ endpoint would help. Here is an example response I get when calling that endpoint:

{
  "status":"ok",
  "data":{
    "authenticated":true,
    "primary_namespace":"",
    "entity_permissions":{
      "default/*":[
        "environment::read",
        "namespace::read"
      ],
      "filesystem/*":[
        "environment::read",
        "namespace::read"
      ],
      "*/*":[
        "environment::read",
        "namespace::read"
      ]
    },
    "entity_roles":{
      "default/*":[
        "viewer"
      ],
      "filesystem/*":[
        "viewer"
      ],
      "*/*":[
        "viewer"
      ]
    },
    "expiration":"2024-01-16T15:50:46.510530"
  },
  "message":"None"
}

And here is a response when calling the conda-store/api/v1/environment/ endpoint:

{
  "status":"ok",
  "data":[
    {
      "id":5,
      "namespace":{
        "id":4,
        "name":"analyst",
        "metadata_":{

        },
        "role_mappings":[

        ]
      },
      "name":"web-development",
      "current_build_id":5,
      "current_build":"None",
      "description":""
    },
    {
      "id":3,
      "namespace":{
        "id":3,
        "name":"developer",
        "metadata_":{

        },
        "role_mappings":[

        ]
      },
      "name":"dask",
      "current_build_id":7,
      "current_build":"None",
      "description":""
    },
    {
      "id":7,
      "namespace":{
        "id":8,
        "name":"foobar",
        "metadata_":{

        },
        "role_mappings":[

        ]
      },
      "name":"foobar-private-env",
      "current_build_id":9,
      "current_build":"None",
      "description":""
    },
    {
      "id":4,
      "namespace":{
        "id":5,
        "name":"marcelo",
        "metadata_":{

        },
        "role_mappings":[

        ]
      },
      "name":"marcelo-private-env",
      "current_build_id":8,
      "current_build":"None",
      "description":""
    },
    {
      "id":1,
      "namespace":{
        "id":2,
        "name":"nebari-git",
        "metadata_":{

        },
        "role_mappings":[

        ]
      },
      "name":"dashboard",
      "current_build_id":1,
      "current_build":"None",
      "description":""
    },
    {
      "id":2,
      "namespace":{
        "id":2,
        "name":"nebari-git",
        "metadata_":{

        },
        "role_mappings":[

        ]
      },
      "name":"dask",
      "current_build_id":2,
      "current_build":"None",
      "description":""
    },
    {
      "id":6,
      "namespace":{
        "id":7,
        "name":"users",
        "metadata_":{

        },
        "role_mappings":[

        ]
      },
      "name":"polars",
      "current_build_id":6,
      "current_build":"None",
      "description":""
    }
  ],
  "message":"None",
  "page":1,
  "size":100,
  "count":7
}

I don't see any relation between those two responses.
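
One possible way to relate the two responses, sketched under the assumption that the keys of `entity_permissions` are glob-style `namespace/environment` patterns: an environment is visible when some pattern granting `environment::read` matches its `namespace/name`. The function name `visible_environments` is hypothetical, and `fnmatch` only approximates how conda-store itself evaluates these patterns.

```python
from fnmatch import fnmatchcase


def visible_environments(entity_permissions: dict, environments: list) -> list:
    """Return "namespace/name" for environments matched by a readable pattern."""
    patterns = [
        pattern
        for pattern, permissions in entity_permissions.items()
        if "environment::read" in permissions
    ]
    visible = []
    for env in environments:
        full_name = f"{env['namespace']['name']}/{env['name']}"
        if any(fnmatchcase(full_name, pattern) for pattern in patterns):
            visible.append(full_name)
    return visible


# Two entries trimmed from the environment listing above.
environments = [
    {"namespace": {"name": "analyst"}, "name": "web-development"},
    {"namespace": {"name": "nebari-git"}, "name": "dask"},
]

# With a "*/*" viewer binding, as in the permission response above, everything matches.
print(visible_environments({"*/*": ["environment::read"]}, environments))
# With a narrower binding, only that namespace's environments remain.
print(visible_environments({"analyst/*": ["environment::read"]}, environments))
```

Under this reading, the `*/*` viewer role in the permission response would explain why the environment listing returns every namespace.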


@dcmcand suggested using the Keycloak API to get the user's groups and then using those groups as namespaces to filter the corresponding environments retrieved from the conda-store/api/v1/environment/ endpoint. I think that should work, but the only issue is that environments such as nebari-git-dashboard or nebari-git-dask would be left out (unless we hardcoded them). We would also need to pass the Keycloak credentials via a Kubernetes config map or make them available in the z2jh config somehow.
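
The group-based filtering could look roughly like this. It is only a sketch: `environments_for_groups` and the `always_include` parameter are hypothetical names, the group list is assumed to have already been fetched from Keycloak as plain names, and `nebari-git` is included by default purely to illustrate the hardcoding mentioned above.

```python
def environments_for_groups(environments, groups, always_include=("nebari-git",)):
    """Keep environments whose namespace is one of the user's groups.

    Namespaces in `always_include` are kept regardless of group membership.
    """
    allowed = set(groups) | set(always_include)
    return [
        f"{env['namespace']['name']}-{env['name']}"
        for env in environments
        if env["namespace"]["name"] in allowed
    ]


# Entries trimmed from the conda-store/api/v1/environment/ response above.
environments = [
    {"namespace": {"name": "analyst"}, "name": "web-development"},
    {"namespace": {"name": "developer"}, "name": "dask"},
    {"namespace": {"name": "nebari-git"}, "name": "dashboard"},
]

print(environments_for_groups(environments, groups=["analyst"]))
```

Passing `always_include=()` would reproduce the behaviour where the nebari-git environments are left out.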

nkaretnikov commented 9 months ago

Status update: discussed this during the Nebari meeting. Marcelo will try to use the existing APIs. If it doesn't work, we'll need to expose new HTTP endpoints in conda-store.

kcpevey commented 2 months ago

Achieving this will require changes to the conda-store REST API and also a change to Nebari. On the Nebari side, we will need the ability to query Keycloak to find the role mappings that users have access to.

After discussion with the conda-store team: the user that accesses this new conda-store REST endpoint must have access to all namespaces and environments.

viniciusdc commented 2 months ago

After discussion with the conda-store team: the user that accesses this new conda-store REST endpoint must have access to all namespaces and environments.

To clarify, one proposed implementation involves extending our existing code section that lists environments: https://github.com/nebari-dev/nebari/blob/57f6de698fbf2c4a89d43656518c7d0714c8298e/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/files/jupyterhub/02-spawner.py#L25

Instead of relying on a service-token (which serves as root) to list all environments, we would retrieve the relevant Conda-store roles for the user (user_info) from Keycloak, as these roles implicitly include the namespaces the user has access to.

We would then use the original service-token to request a temporary access-token from the Conda-store admin API. https://github.com/nebari-dev/nebari/blob/57f6de698fbf2c4a89d43656518c7d0714c8298e/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/files/jupyterhub/02-spawner.py#L31-L38

Something like this:

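 import json

 import urllib3
 import yarl

 # external_url, token (the service token), user_info and parsed_role_bindings
 # are assumed to be provided by the surrounding spawner code.
 access_token_endpoint = "conda-store/api/v1/token/"
 environments_endpoint = "conda-store/api/v1/environment"  # new one

 http = urllib3.PoolManager()

 # Use the service token to request a temporary token scoped to the user's roles.
 response = http.request(
     "POST",
     str(yarl.URL(f"http://{external_url}/{access_token_endpoint}")),
     headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
     body=json.dumps({
         "primary_namespace": "*",
         "expiration": "2024-08-***",
         "role_bindings": parsed_role_bindings(user_info),
     }),
 )
 # Assumes the token is returned under data.token in the JSON response.
 access_token = json.loads(response.data)["data"]["token"]

 # List environments with the temporary token, so the result is already filtered.
 response = http.request(
     "GET",
     str(yarl.URL(f"http://{external_url}/{environments_endpoint}")),
     headers={"Authorization": f"Bearer {access_token}"},
 )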

The temporary token carries only the user's role bindings, so when we use it to request the environments, the list that comes back is already filtered to what the user can access.

flowchart TD
    A[User Conda-store Roles from Keycloak]
    B[Conda-Store Service-Token]
    D[Temporary Access-Token from Conda-store]
    A --> D
    B --> D
    D --> E[New listing API Endpoint]

kcpevey commented 1 month ago

The endpoint for listing all envs a user has access to has been merged into conda-store 🎉