ideonate / cdsdashboards

JupyterHub extension for ContainDS Dashboards
https://cdsdashboards.readthedocs.io/

Parametrized dashboard #41

Open fcollonval opened 3 years ago

fcollonval commented 3 years ago

Is your feature request related to a problem? Please describe.

A dashboard can be seen as an enhanced static report, and, as with static reports, it would make sense to parametrize dashboards. For example, if a dashboard presents a summary of events over a certain time window, it would be nice if the time window could be provided as a query argument, so that the dashboard is preset with those query arguments.

Describe the solution you'd like

For notebook-based dashboards, a potential solution would be to use papermill. The workflow would be:

image

Describe alternatives you've considered

Background context

Configuration

danlester commented 3 years ago

Thanks for the suggestion. A few other users have spoken about something similar, maybe automated runs etc.

When you get a chance, please could you take a step back and describe a possible use case (i.e. from a business point of view)?

Ultimately, the parameterized URL must come from somewhere. Generally, you wouldn't expect the dashboard viewer to type the URL into the browser directly, so it would be good to think through the overall workflow and the reasoning behind it.

Can talk on a call if easier!

sid-marain commented 3 years ago

This is definitely a feature that would be of interest to us.

A (very pertinent) business use case for our purposes would be the following:

We would like customers to have dashboards available to summarize and analyze new collections of data once the data comprising the collection has been saved to a database. However, we have a lot of metrics that could be loaded into the dashboard, not all of which are needed for a particular use case.

When a user from customer Foocorp navigates to something like https://jhub.mydomain.com/foocorp/collections/<collection_id>/dashboards, a list of possible dashboards would be made available, as well as a form from which the user could select relevant metrics. The dashboards correspond to templates located at some path on the persistent storage of the JupyterHub instance (the locations might be specified in a database, etc.). The templates would be parameterized to take information such as the customer_id, the collection_id, and a list of user-selected metrics (the templates themselves use the parameters to query a database located elsewhere). Upon clicking a link, the aforementioned parameters, together with the corresponding template, would be passed to the server (using a GET or POST request). The rest of the flow would work as @fcollonval indicated above.

Happy to flesh this out in more detail if desired.

fcollonval commented 3 years ago

Thanks @sid-marain for beating me to describing a business use case.

In our use case, we generate URLs with query arguments from technical tools, to be inserted in reports or in a task management tool. When clicked, those URLs open customized reports or tools for non-technical people or people in other departments. So it would be great if we could do the same for the dashboards.
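As a minimal sketch of that kind of link generation, the snippet below builds a dashboard URL with query arguments and parses it back. The helper names, the collection id `42`, and the `start`, `end`, and `metric` parameters are hypothetical and chosen only for illustration; nothing here is part of cdsdashboards.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_dashboard_url(base_url, **params):
    """Return base_url with params appended as a query string.

    List values are expanded into repeated keys (metric=a&metric=b).
    """
    query = urlencode(params, doseq=True)
    return f"{base_url}?{query}" if query else base_url


def read_dashboard_params(url):
    """Parse the query string of url back into a dict of lists."""
    return parse_qs(urlsplit(url).query)


# Hypothetical dashboard location and parameters.
url = build_dashboard_url(
    "https://jhub.mydomain.com/foocorp/collections/42/dashboards",
    start="2021-01-01",
    end="2021-01-31",
    metric=["latency", "errors"],
)
params = read_dashboard_params(url)
```

A tool that generates reports could embed `url` as a link, and the receiving side would recover the same values with `read_dashboard_params`.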

danlester commented 3 years ago

Just some thoughts:

Of course this depends on your situation, but from a security point of view, it might make sense for there to be an API to create a copy of a dashboard but using specified parameters. (Rather than allowing the user to submit arbitrary params to the URL.)

From a workflow and hosting point of view, ideally there is a definitive point when the dashboard is 'ready', in which case it would be preferable for the dashboard to be started and already populated with the parameters, rather than already running and then having to rebuild itself based on the parameters in the URL. (Of course, in a 'high trust' scenario, you could also send users to a /build-dashboard/? URL which creates the new dashboard automatically and redirects to it when ready.)

I would also like to think through how these ideas could work within Kubeflow.
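One way to avoid accepting arbitrary parameters, sketched here under assumptions, is to check every submitted name against an allow-list and convert each value before it reaches a dashboard. The parameter names and the `validate_params` helper are hypothetical; this only illustrates the validation step, not an API for creating dashboard copies.

```python
from datetime import date

# Hypothetical allow-list: parameter name -> converter.
# int and date.fromisoformat raise ValueError on malformed input.
ALLOWED_PARAMS = {
    "customer_id": str,
    "collection_id": int,
    "start": date.fromisoformat,
}


def validate_params(raw):
    """Return converted parameters, rejecting unknown names and bad values."""
    unknown = set(raw) - set(ALLOWED_PARAMS)
    if unknown:
        raise ValueError(f"unexpected parameters: {sorted(unknown)}")
    return {name: ALLOWED_PARAMS[name](value) for name, value in raw.items()}


clean = validate_params({"collection_id": "7", "start": "2021-01-01"})
```

Only the values in `clean` would then be handed to whatever builds the parameterized dashboard.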

westurner commented 3 years ago

It's not parametrization (and so changes won't propagate to already-existing copies of templates), but https://github.com/jpmorganchase/jupyterlab_templates has a bit of a different workflow

https://papermill.readthedocs.io/en/latest/usage-parameterize.html#how-parameters-work :

How parameters work

The parameters cell is assumed to specify default values which may be overridden by values specified at execution time.

- papermill inserts a new cell tagged injected-parameters immediately after the parameters cell
- injected-parameters contains only the overridden parameters
- subsequent cells are treated as normal cells, even if also tagged parameters
- if no cell is tagged parameters, the injected-parameters cell is inserted at the top of the notebook
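To make the quoted rules concrete, here is a simplified illustration that applies them to a notebook represented as a plain dict. It is not papermill's implementation: the `inject_parameters` helper is hypothetical, and the cells omit fields that a real notebook file carries.

```python
import copy


def inject_parameters(notebook, overrides):
    """Return a copy of notebook with an injected-parameters cell added.

    Simplified illustration of the rules quoted above; not papermill's code.
    """
    nb = copy.deepcopy(notebook)
    # The injected cell contains only the overridden parameters.
    source = "\n".join(f"{name} = {value!r}" for name, value in overrides.items())
    injected = {
        "cell_type": "code",
        "metadata": {"tags": ["injected-parameters"]},
        "source": source,
    }
    # Find the first cell tagged "parameters"; later ones count as normal cells.
    index = next(
        (
            i
            for i, cell in enumerate(nb["cells"])
            if "parameters" in cell.get("metadata", {}).get("tags", [])
        ),
        None,
    )
    # Insert right after the parameters cell, or at the top if there is none.
    nb["cells"].insert(0 if index is None else index + 1, injected)
    return nb


notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Events report"},
        {"cell_type": "code", "metadata": {"tags": ["parameters"]}, "source": "window_days = 7"},
        {"cell_type": "code", "metadata": {}, "source": "print(window_days)"},
    ]
}
result = inject_parameters(notebook, {"window_days": 30})
```

Because the injected cell runs after the defaults, `window_days` ends up as 30 when the later cells execute.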

Would a new papermill engine be necessary?

https://papermill.readthedocs.io/en/latest/extending-entry-points.html#developing-a-new-engine :+1:

Developing a new engine

A papermill engine is a python object that can run, or execute, a notebook. The default implementation in papermill for example takes in a notebook object, and runs it locally on your machine.

By writing a custom engine, you could allow execution to be handled remotely, or you could apply post-processing to the executed notebook. In the next section, you will see a demonstration.

Dashboard controls are great, but parametrization really would be great.

danlester commented 3 years ago

Thank you @westurner for your input.

I still think all of these ideas are great, but potentially a sufficient diversion from the core of cdsdashboards in the long run that I would need to think about funding for it as a project, and/or develop this as more of a team effort. Ideally it would almost be an extension of the (cdsdashboards) extension so that it can be reasonably independent without having to affect the central userflow!

ricky-lim commented 3 years ago

I found this from a voila example for the query parameter, https://github.com/voila-dashboards/voila/blob/master/notebooks/query-strings.ipynb

I have not tested this yet, and I was wondering if this example could also work with cdsdashboards serving voila.

Cheers

fcollonval commented 3 years ago

Thanks for sharing, @ricky-lim. I did not know that this information is passed to voila as environment variables (see the source code for a full list).

That will be much simpler than having to create a widget just to query document.location in the front end. The question now is: does cdsdashboards pass the query args along to voila, or strip them?
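For reference, a minimal sketch of reading those values inside a notebook, assuming (as the linked example notebook does) that voila exposes the request's query string to the kernel in the `QUERY_STRING` environment variable. The helper names and the `window` parameter are hypothetical.

```python
import os
from urllib.parse import parse_qs


def read_query_parameters(environ=os.environ):
    """Parse QUERY_STRING from the environment into a dict of lists."""
    query_string = environ.get("QUERY_STRING", "")
    return parse_qs(query_string)


def first_value(parameters, name, default=None):
    """Return the first value given for name, or default if it is absent."""
    values = parameters.get(name)
    return values[0] if values else default


# In a dashboard cell: fall back to a 7-day window if none is given.
parameters = read_query_parameters()
window_days = int(first_value(parameters, "window", default="7"))
```

Passing `environ` explicitly makes the helper easy to try outside voila, for example `read_query_parameters({"QUERY_STRING": "window=30"})`.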

danlester commented 3 years ago

Very interesting, thanks for letting us know. I think it should be possible for cdsdashboards to pass this through if it doesn't already. Please let us know what happens if you've tried it!

ricky-lim commented 3 years ago

Just tested the example notebook and it does work :)

query-string

danlester commented 3 years ago

Anyone reading, please also note advancements to this discussion on Gitter.