oda-hub / oda_api

API client to access some of the MMODA resources: INTEGRAL, POLAR, ANTARES, LIGO/Virgo, SDSS

use renku secret storage #261

Open volodymyrss opened 2 weeks ago

volodymyrss commented 2 weeks ago

Renku now allows storing user secrets, see doc. It stores secrets in files in /secrets/... .

We'd need a way for oda-api to use them by default. Maybe it could read them right from the renku "default" location, and then we add an annotation to reflect this way of mounting the secret. We can discuss this; please share any ideas.

volodymyrss commented 2 weeks ago

I'm adding @dsavchenko and @burnout87 to the discussion, since this has to do with how the notebook is executed in other environments.

okolo commented 1 week ago

Assuming that the user uses the same secret storage mechanism throughout the entire notebook, one could introduce a secret storage annotation like this

# oda:secretStorage env_var .

or like this

# oda:secretStorage /secrets .

However, in this case one would have to adapt the notebook to a particular platform. We could introduce an annotation for the deployment platform and declare the secret storage for several platforms in one notebook.

Another option would be to always use a specific secret storage for each platform and encapsulate secret retrieval in an oda-api class that has access to the platform configuration.
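To make the annotation idea concrete, here is a minimal sketch of how such a line could be read from a notebook cell and used to pick the storage. Everything beyond the two annotation lines above is an assumption: the function names `parse_annotations` and `read_secret` are hypothetical (they are not part of oda-api), `env_var` is taken to mean "read an environment variable named after the secret", and any other value is taken to be a directory holding one file per secret.

```python
import os
import re
from pathlib import Path

# Matches lines such as "# oda:secretStorage env_var ." (predicate, value, trailing dot).
ANNOTATION_RE = re.compile(r"^#\s*oda:(\w+)\s+(\S+)\s*\.\s*$")


def parse_annotations(cell_source):
    """Collect `# oda:<name> <value> .` annotations from the source of one cell."""
    found = {}
    for line in cell_source.splitlines():
        match = ANNOTATION_RE.match(line.strip())
        if match:
            found[match.group(1)] = match.group(2)
    return found


def read_secret(name, storage, environ=None):
    """Read a secret from the storage named in the annotation (hypothetical semantics)."""
    environ = os.environ if environ is None else environ
    if storage == "env_var":
        # Assumption: the environment variable carries the same name as the secret.
        return environ.get(name)
    # Assumption: any other value is a directory with one file per secret.
    path = Path(storage) / name
    return path.read_text().strip() if path.is_file() else None


cell = "# oda:secretStorage /secrets .\nimport oda_api"
storage = parse_annotations(cell).get("secretStorage", "env_var")
token = read_secret("oda-token", storage)  # None if nothing is mounted there
```

With this shape, the notebook only names the storage once, and the lookup stays in one place if the convention changes.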

volodymyrss commented 1 week ago

The idea is that the notebook (with its repository) is portable, so it should not be written for a particular platform. The annotation describes how the user uses the secret in the code of the notebook (and dependent libraries). Instructions for deployment are derived to satisfy this constraint.

Ideally, we'd want platforms to set the secrets as needed, following the annotation. MMODA does that. With @dsavchenko, in the Galaxy adaptation we could do some of that. For the HPC connection we'd need to do that. Renku does not do anything like that at the moment: it does not recognize which secrets the project needs, and secrets are just set in a session (ping @rokroskar for insight since this is part of our discussion).

The last option, oda-api hiding it from the user, would work, but of course it introduces a further library dependency (even if we already use oda-api for some outputs, we would not want to go too deep in that direction). So I propose indeed to go this way at this moment: make oda-api find the right location and hide platform specificity from the user.
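As a rough illustration of "find the right location and hide platform specificity", a lookup could try a fixed list of places in order and report what it tried. This is only a sketch under stated assumptions: `find_secret` and `SecretNotFound` are hypothetical names, not the existing oda-api interface; the environment variable name is derived from the secret name by a convention invented here; and `/secrets` is used as the only default directory because it is the renku location mentioned above.

```python
import os
from pathlib import Path

# Assumption: directories searched by default; only /secrets comes from this thread.
DEFAULT_SECRET_DIRS = ("/secrets",)


class SecretNotFound(LookupError):
    """Raised when a secret is not available in any known location."""


def find_secret(name, secret_dirs=DEFAULT_SECRET_DIRS, environ=None):
    """Return the secret value, checking the environment first, then each directory."""
    environ = os.environ if environ is None else environ
    tried = []

    # Hypothetical convention: "oda-token" maps to the variable ODA_TOKEN.
    env_name = name.upper().replace("-", "_")
    tried.append(f"env:{env_name}")
    if environ.get(env_name):
        return environ[env_name].strip()

    for directory in secret_dirs:
        candidate = Path(directory) / name
        tried.append(str(candidate))
        if candidate.is_file():
            return candidate.read_text().strip()

    raise SecretNotFound(f"secret {name!r} not found; tried: {', '.join(tried)}")
```

The notebook would then call `find_secret("oda-token")` without mentioning any platform, and supporting a new platform would mean extending the search list rather than editing notebooks.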

rokroskar commented 1 week ago

I don't follow entirely - I'm not aware of any standard for such annotations, which would then mean that renku has to implement some specific support for oda annotations? Can you be a bit more precise about "Instructions for deployment are derived to satisfy this constraint." - do you have an example? It's not really feasible for renku to look into every notebook in a project to pull out cell annotations imho. If you're executing a specific notebook I can see how that's feasible but otherwise it's a bit of an intractable problem.

As we mentioned during the call - secrets are not limited to /secrets and can be made to go wherever the "standard" place is considered to be. We will make that functionality possible most likely in a few weeks.

volodymyrss commented 1 week ago

> I don't follow entirely - I'm not aware of any standard for such annotations, which would then mean that renku has to implement some specific support for oda annotations? Can you be a bit more precise about "Instructions for deployment are derived to satisfy this constraint." - do you have an example? It's not really feasible for renku to look into every notebook in a project to pull out cell annotations imho. If you're executing a specific notebook I can see how that's feasible but otherwise it's a bit of an intractable problem.

I do not expect renku to recognize an oda-specific way to annotate a project (including its notebooks) with these kinds of dependencies on external secrets, storages, etc. And as far as I know, there is indeed no general schema like this. Maybe it should be created. We created our own, based on an ontology. But I would not expect renku to immediately adopt it.

First I want to step back and make the use case clear. We want the platform, e.g. renku, to recognize that a project may need this sort of dependency to work. Right now, there is no such link, and the user has to manually specify the secret mount location after visually inspecting the project for any possible use of these secrets in a particular location. For me, this is inconvenient, and it especially complicates sharing and reuse of the project. Do we agree that this is a difficulty? And that we would like to address it?

If we agree on the principle (but let me know if not), we can be a bit more precise in thinking how to implement it. In this direction, I am asking for your insight on possible ways to address this difficulty in renku.

From my side, I could imagine different ways. For example, maybe information about the secrets to be mounted in sessions could be kept somewhere in the renku v2 project? Or is it already kept in the renku v2 session?

It seems like a renku v2 project is its own definition of code together with data sources. But since we want to use portable git repositories defining our tools/workflows, we need to reconcile what we have in the repository with what the renku project defines.

If we had an API access to renku v2, maybe we could make our bot harmonize our interpretation of the project dependencies as derived from the mounted repositories with renku project settings so that new sessions have the right secrets. As an idea.

> As we mentioned during the call - secrets are not limited to /secrets and can be made to go wherever the "standard" place is considered to be. We will make that functionality possible most likely in a few weeks.

That's good and useful, thanks. The point I am trying to make goes beyond just this, as I tried to explain above. Please let me know what you think!

dsavchenko commented 1 week ago

> So I propose indeed to go this way at this moment: make oda-api find the right location and hide platform specificity from the user.

BTW, we already deviated from this way, e.g. by introducing explicit annotation for oda token location here

volodymyrss commented 1 week ago

> > So I propose indeed to go this way at this moment: make oda-api find the right location and hide platform specificity from the user.
>
> BTW, we already deviated from this way, e.g. by introducing explicit annotation for oda token location here

It still aligns with the idea that annotations describe what the notebook uses. However, it is indeed unnecessary to specify the token location if oda-api hides all of the complexity. So if we go this way, we'd just need an annotation stating that the notebook uses the oda token, irrespective of the location, and the platform ensures that oda-api can access it. Maybe it would also constrain the oda-api version. That said, it is also possible to use oda-api to read the token from a particular location, so an annotation that puts the token in a particular location is still useful for compatibility.

rokroskar commented 1 week ago

> First I want to step back and make the use case clear. We want the platform, e.g. renku, to recognize that a project may need this sort of dependency to work. Right now, there is no such link, and the user has to manually specify the secret mount location after visually inspecting the project for any possible use of these secrets in a particular location. For me, this is inconvenient, and it especially complicates sharing and reuse of the project. Do we agree that this is a difficulty? And that we would like to address it?

Maybe I'm misinterpreting what you mean by "recognize". The way I read it initially, it implied that the platform should infer from somewhere (e.g. a code repository) what is needed automatically. But maybe you just mean that the renku v2 project itself could be configured/annotated in such a way as to remind the user at the right moment that they are missing some critical piece of information and that their session won't function without it?

> If we had an API access to renku v2, maybe we could make our bot harmonize our interpretation of the project dependencies as derived from the mounted repositories with renku project settings so that new sessions have the right secrets. As an idea.

You already have API access - you can find it here: https://renkulab.io/swagger/?urls.primaryName=data+service

It's authenticated with the JWT token from Keycloak. To use it from the Swagger page, click "Authorize --> PKCE" and use the openid scope.
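For use from a bot rather than the Swagger page, a request carrying the token could look roughly like the sketch below. The assumptions are significant and should be checked against the Swagger description: the base URL `https://renkulab.io/api/data` and the path `/projects` are placeholders chosen for illustration, sending the JWT as an `Authorization: Bearer` header is the usual convention rather than something confirmed in this thread, and `build_request` and `fetch_json` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: placeholder base URL; take the real one from the Swagger page.
BASE_URL = "https://renkulab.io/api/data"


def build_request(token, path, base_url=BASE_URL):
    """Prepare a GET request that presents the Keycloak JWT as a bearer token."""
    request = urllib.request.Request(base_url.rstrip("/") + "/" + path.lstrip("/"))
    request.add_header("Authorization", f"Bearer {token}")
    request.add_header("Accept", "application/json")
    return request


def fetch_json(token, path, base_url=BASE_URL, timeout=30):
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(build_request(token, path, base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (not executed here): fetch_json(my_jwt, "/projects")
```

Keeping request construction separate from the network call makes it possible to check the headers and URL without contacting the service.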

volodymyrss commented 1 week ago

> > First I want to step back and make the use case clear. We want the platform, e.g. renku, to recognize that a project may need this sort of dependency to work. Right now, there is no such link, and the user has to manually specify the secret mount location after visually inspecting the project for any possible use of these secrets in a particular location. For me, this is inconvenient, and it especially complicates sharing and reuse of the project. Do we agree that this is a difficulty? And that we would like to address it?
>
> Maybe I'm misinterpreting what you mean by "recognize". The way I read it initially, it implied that the platform should infer from somewhere (e.g. a code repository) what is needed automatically.

What I mean by "recognize" leaves some room for interpretation, depending on the implementation. What I want to say is that there is a problem: a critical piece of information can be missing, and although the platform could in principle detect this condition, it currently does not. So every user who adds a particular git repository that uses some secret has to make sure the secret is mounted in the right place for this repository. And users have no way to share this configuration.

> But maybe you just mean that the renku v2 project itself could be configured/annotated in such a way as to remind the user at the right moment that they are missing some critical piece of information and that their session won't function without it?

When you say "as to remind" - who would remind? Some renku functionality? If yes, renku would need to infer from somewhere (e.g. a code repository) what is needed automatically?

> > If we had an API access to renku v2, maybe we could make our bot harmonize our interpretation of the project dependencies as derived from the mounted repositories with renku project settings so that new sessions have the right secrets. As an idea.
>
> You already have API access - you can find it here: https://renkulab.io/swagger/?urls.primaryName=data+service
>
> It's authenticated with the JWT token from Keycloak. To use it from the Swagger page, click "Authorize --> PKCE" and use the openid scope.

rokroskar commented 1 week ago

Currently we don't have a concept of dependencies between entities (runtime environments, data sources, code repositories, secrets) but I don't really see why it would not be possible. So you could declare a repository and state that to use it, it also needs a data source of a certain kind and a secret/token to access some external service. Then other people that would bring that repository into their project would see the same requirement. This could also be done in the context of the session launcher, where it would require the user to connect certain entities before being able to launch a session.
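One way to picture such dependencies between entities is a repository that declares what it needs and a launcher check that lists what is still unconnected. The sketch below is purely illustrative and assumes nothing about how renku models its entities: `Requirement`, `Repository`, `Project`, `missing_requirements` and `can_launch` are hypothetical names, and the repository URL is a made-up example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Requirement:
    kind: str  # hypothetical kinds: "secret" or "data_source"
    name: str


@dataclass
class Repository:
    url: str
    requires: list = field(default_factory=list)  # Requirement objects


@dataclass
class Project:
    repositories: list = field(default_factory=list)
    connected: set = field(default_factory=set)  # (kind, name) pairs already set up

    def missing_requirements(self):
        """List each declared requirement that has no connected entity yet."""
        missing = []
        for repo in self.repositories:
            for req in repo.requires:
                if (req.kind, req.name) not in self.connected and req not in missing:
                    missing.append(req)
        return missing


def can_launch(project):
    """A session launcher could refuse to start while anything is missing."""
    return not project.missing_requirements()


repo = Repository(
    url="https://example.org/some/repository.git",  # made-up URL
    requires=[Requirement("secret", "oda-token")],
)
project = Project(repositories=[repo])
blocked = not can_launch(project)                # the declared secret is not connected yet
project.connected.add(("secret", "oda-token"))
ready = can_launch(project)                      # requirement satisfied
```

Because the requirement travels with the repository declaration, anyone who adds that repository to another project would see the same missing entry.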

volodymyrss commented 1 week ago

> Currently we don't have a concept of dependencies between entities (runtime environments, data sources, code repositories, secrets) but I don't really see why it would not be possible. So you could declare a repository and state that to use it, it also needs a data source of a certain kind and a secret/token to access some external service. Then other people that would bring that repository into their project would see the same requirement. This could also be done in the context of the session launcher, where it would require the user to connect certain entities before being able to launch a session.

This sounds good to me. It sounds like a specification for computational environments with dependencies, which does not exist yet. So if it's developed here, it could be reused elsewhere. At the very least, this will be useful when exporting a renku environment for execution on HPC clusters.