Explore alternatives to having a bunch of plots/images in notebooks

andersy005 commented 4 years ago

As @dcherian pointed out in https://github.com/marbl-ecosys/HiRes-CESM-analysis/pull/21#issuecomment-682871521

Perhaps notebooks are not the solution here --- these notebooks have no text...

How about some kind of "image browser" that takes as input a folder and a filename schema, e.g. {{variable}}_{{depthlevel}}_{{time}}.png, and then offers selection by variable, depth level, and time?
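A minimal sketch of how such a filename schema could be turned into selectable fields, using only the standard library. The helper names (`schema_to_regex`, `index_images`, `choices`) and the example filenames are hypothetical, and the sketch assumes that field values never contain an underscore, which would not hold for variable names such as photoC_diat_zint without a different separator.

```python
import re


def schema_to_regex(schema: str) -> re.Pattern:
    """Build a regex with one named group per {{field}} in the schema."""
    # re.split with a capture group alternates literal text and field names.
    parts = re.split(r"\{\{(\w+)\}\}", schema)
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pattern += re.escape(part)  # literal text such as "_" or ".png"
        else:
            # Assumption: a field value is any run of non-underscore characters.
            pattern += f"(?P<{part}>[^_]+)"
    return re.compile(pattern)


def index_images(filenames, schema):
    """Return one record (dict of field values plus filename) per matching file."""
    regex = schema_to_regex(schema)
    records = []
    for name in filenames:
        match = regex.fullmatch(name)
        if match:  # files that do not follow the schema are skipped
            records.append({**match.groupdict(), "filename": name})
    return records


def choices(records, field):
    """Sorted unique values of one field, e.g. to populate a dropdown."""
    return sorted({record[field] for record in records})


# Hypothetical filenames; in practice these would come from listing the folder.
files = ["SST_0_0001-01.png", "SST_0_0001-02.png", "NO3_10_0001-01.png", "README.txt"]
records = index_images(files, "{{variable}}_{{depthlevel}}_{{time}}.png")
```

The resulting records could feed whichever front end is chosen (a local application, a Python-backed dashboard, or a JavaScript page), since they are plain dictionaries.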

How do we expect to "consume" these images?

There's probably some way to make napari.org do this if we wanted an application interface that runs locally

We could use some holoviews thing; but this requires a python process (AFAIK) and so would again be run locally (OK could also run on a webserver, but not on Github pages AFAIK)

There's probably a javascript thing that would run in the browser; this would work for a publicly accessible website.

Another advantage of the "image browser" idea is that it's quite easy to generate images in parallel using xr.map_blocks

As per offline conversations with @matt-long, there are a few things to consider here

User experience:
- Pre-staging the data needed for producing the plots into zarr stores in a bucket on S3 or Google storage or some FTP server on CGD's system, or
- Dumping static images into some kind of storage such as a local directory or remote storage
Interface:
- A static image viewer similar to what @dcherian suggested. This image viewer could be based on panel or
- A dynamic image viewer that uses pre-staged data. This would allow us to produce dynamic images with zoom in/out and other cool features.
Deployment: how to deploy the image viewer depends on the interface chosen. With an image viewer that serves static images, we could deploy the application on a CGD system. With an image viewer that serves dynamic images (using pre-staged data), we will need (1) someone to foot the bill and (2) access to a system that would allow us to run a dynamic web application with access to the pre-staged data.

In the meantime, I've started working on a prototype for an image viewer serving static images from a local directory. This app will be based on panel, and I am hoping to have a working prototype by the end of the week.

matt-long commented 4 years ago

I would add that under "User experience" we want to enable batch-processing/automation of the workflow generating diagnostic output.

We need a phased approach to tackling these, focusing first on ensuring we can look at what we need to see from the run.

klindsay28 commented 4 years ago

A feature that I'd like to have is the ability to examine 2 (arbitrary) figures simultaneously (e.g., side-by-side). This feature is not available in current approaches to providing data that I'm familiar with, but I think it would be tremendously helpful.

andersy005 commented 4 years ago

A feature that I'd like to have is the ability to examine 2 (arbitrary) figures simultaneously (e.g., side-by-side)

Could you expand on this? I am asking because I currently have the following: [Screenshot, 2020-09-02 at 10:23:32 AM]

As you can see I am only showing one variable, one time step, and log_10 or no log_10 versions of the image at a time. How should I go about selecting the second figure to show side-by-side?

klindsay28 commented 4 years ago

Can you introduce a duplicate of each control that would control the display of another figure?

Examples of how I would use this side-by-side figure functionality are 1) different fields (e.g., photoC_diat_zint, photoC_cocco_zint) at the same time level, and 2) the same field at different time levels (e.g., photoC_diat_zint for Jan-0002 and for Jan-0003).

Don't take 'side-by-side' literally. I just want to be able to see 2 figures simultaneously.

A bonus would be to expand this to >2 figures, with a control for how many figures to display. But just being able to depict 2 figures simultaneously would be a large improvement.

dcherian commented 4 years ago

@andersy005 can you compose that panel like standard holoviz? For extra points we could link the sliders :D

img1 = ImageView(file_path)
img2 = ImageView(file_path)

(img1 + img2).cols(2)

andersy005 commented 4 years ago

As an update, here're a few screenshots of the latest version of the static image viewer:

Figuring out the appropriate figure size depending on how many axes are being plotted is still a work in progress, though. Overall, I think it is coming along nicely.

[Six screenshots of the static image viewer, 2020-09-03, 8:20 PM to 8:31 PM]

andersy005 commented 4 years ago

I came up with some heuristics to adjust the number of rows and columns accordingly, and to remove empty subplots:
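For readers curious what such a layout rule can look like, here is one possible heuristic, sketched with the standard library only. It is not necessarily the rule used in the prototype; the function names and the `max_cols` default of 3 are assumptions made for illustration.

```python
import math
from typing import List, Tuple


def grid_shape(n_figures: int, max_cols: int = 3) -> Tuple[int, int]:
    """Return (rows, cols) for n_figures panels, using at most max_cols columns."""
    if n_figures < 1:
        raise ValueError("n_figures must be at least 1")
    cols = min(n_figures, max_cols)
    rows = math.ceil(n_figures / cols)
    return rows, cols


def empty_cells(n_figures: int, max_cols: int = 3) -> List[Tuple[int, int]]:
    """List the (row, col) grid positions left without a figure."""
    rows, cols = grid_shape(n_figures, max_cols)
    # Figures fill the grid row by row, so the unused cells trail at the end.
    return [divmod(index, cols) for index in range(n_figures, rows * cols)]
```

With matplotlib, the axes at the positions returned by `empty_cells` could then be dropped, for example by calling `ax.remove()` on each of them.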

[Three screenshots, 2020-09-03, 9:02 PM to 9:04 PM]

andersy005 commented 4 years ago

Added "Tabs" to the interface. The user can choose between looking at multiple variables at the same time or one variable at multiple time steps:

[Screenshot, 2020-09-04 at 5:35:38 AM]

dcherian commented 4 years ago

Anderson this looks awesome.

Before you go too much further, it might be good to see https://github.com/holoviz/hvplot/pull/505 and how it looks. It seems this would reduce the amount of work needed. But I could be wrong?

martindurant commented 4 years ago

Is there any interest in integrating this as one of the viewers inside of Intake's panel-based GUI, or another way of populating the viewer from catalogued data sources? Intake currently optionally uses xrviz to display xarray data, where you can pick various options, but only get one single display. xrviz is not being particularly maintained, unless I should happen to come across a batch of spare hours.

matt-long commented 4 years ago

Thanks for the comment @martindurant. I had not previously seen xrviz and it looks really cool. One challenge we have is that some plots take a long time to generate, so we are caching image files and displaying these. Ultimately, it would be preferable to pre-compute or subset the data behind the plot to enable more interactive visualization, but that could still be a lot of data.
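The caching pattern described here can be sketched roughly as follows. This is an illustration only, not the project's actual code: `cached_image` and the `render` callback are hypothetical names, and `render` stands in for whatever slow plotting routine writes the image to disk.

```python
from pathlib import Path
from typing import Callable


def cached_image(path: Path, render: Callable[[Path], None]) -> Path:
    """Return path, calling render(path) only if the image does not exist yet."""
    path = Path(path)
    if not path.exists():
        # Create the output directory on first use, then run the slow plot.
        path.parent.mkdir(parents=True, exist_ok=True)
        render(path)
    return path
```

A viewer would then only ever read files from disk, and the expensive plotting step runs once per image, for instance in a batch job.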

martindurant commented 4 years ago

Yes, I feel like there is some space in between here. It would be plausible (but requiring effort) to include thumbnails or paths to full precomputed images in a catalogue, for instance, or other more specific metadata. There have been suggestions for zarr v3 to have multi-resolution data that would allow more interactivity.

andersy005 commented 4 years ago

As an update, I took a stab at a prototype for a small package to facilitate the creation of an image viewer from static images. The package resides here: https://github.com/andersy005/panelify

Here's a demo notebook: https://nbviewer.jupyter.org/github/andersy005/HiRes-CESM-dashboard/blob/master/panel-image-dashboard.ipynb
Here's how the API looks so far:

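import panelify

# `df` is assumed to be a pandas DataFrame with one row per image: a "path" column,
# a "plot_type" column, and one column per key listed below.
# Instantiate dashboards for different types of plots: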

timeseries = panelify.create_dashboard(
    keys=[
        "casename",
        "varname",
        "depth_level",
        "spatial_op",
        "time_coarsen_len",
        "log_10",
    ],
    df=df.loc[df.plot_type == "global-timeseries"],
    path_column="path",
)
timestep = panelify.create_dashboard(
    keys=["casename", "varname", "depth_level", "time", "log_10"],
    df=df.loc[df.plot_type == "timestep-global-map"],
    path_column="path",
)
histogram = panelify.create_dashboard(
    keys=["casename", "varname", "depth_level", "time_range", "log_10"],
    df=df.loc[df.plot_type == "histogram"],
    path_column="path",
)


# Create a canvas that holds all dashboards in a `panel.Tabs` component:
canvas = panelify.Canvas(
    {
        "Timestep": timestep.view,
        "Timeseries": timeseries.view,
        "Histogram": histogram.view,
    }
)
canvas.show()

This results in a Panel dashboard with three tabs:

I'm going to work on pushing it forward this week. I'm hoping to have a minimum viable product by the end of the week. Feel free to chime in at https://github.com/andersy005/panelify/issues

marbl-ecosys / HiRes-CESM-analysis

Explore alternatives to having a bunch of plots/images in notebooks #26