ioos / colocate

Co-locate oceanographic data by establishing constraints
MIT License

Expanding plotting functionality once dataset is identified #18

Open MathewBiddle opened 3 years ago

MathewBiddle commented 3 years ago

The idea proposed by @yutik-nn is to be able to click on a point in the response map and get/make a plot of the data. Below are some of my thoughts:

  1. Make a new function get_data(). This would be very similar to the get_coordinates function, but instead of returning only the latitude and longitude variables, it would return all the variables for the dataset.
    1. Maybe this could use e.get_var_by_attr(dataset_id, standard_name='northward_sea_water_velocity') to grab only the data of interest (using the standard_name selected in the dropdown), but you'd potentially also want the time, depth, latitude, and longitude.
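A rough sketch of how the variable selection for get_data() might look. The helper name select_variables is hypothetical, and it assumes `e` is an erddapy-style object whose get_var_by_attr(dataset_id, standard_name=...) returns a list of variable names; it also assumes the coordinate variables carry the CF standard names time, depth, latitude, and longitude, which not every dataset will:

```python
def select_variables(e, dataset_id, standard_name):
    """Collect coordinate variables plus the variable(s) of interest.

    `e` is assumed to expose get_var_by_attr(dataset_id, standard_name=...)
    returning a list of variable names (as erddapy's ERDDAP object does).
    """
    # Assumption: coordinates are tagged with these CF standard names.
    coordinate_names = ("time", "depth", "latitude", "longitude")
    variables = []
    for name in coordinate_names + (standard_name,):
        for var in e.get_var_by_attr(dataset_id, standard_name=name):
            # Skip duplicates and tolerate missing coordinates (e.g. no depth).
            if var not in variables:
                variables.append(var)
    return variables
```

The returned list could then be assigned to the request's variables before downloading the data, so only the coordinates and the selected standard_name come back rather than every variable.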

To determine the type of plot, you'd need to collect more metadata than we are gathering now. The metadata URL for a dataset can be captured using the following construct: info_url = e.get_info_url(dataset_id=glider, response="html") (e.g. https://coastwatch.pfeg.noaa.gov/erddap/info/erdCalCOFIeggcnt/index.html). If we can get the response in csv and parse that for the attribute cdm_data_type, we'd be able to make an educated guess as to how the data are organized. For a list of the cdm_data_types, see this documentation.
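As a sketch of the parsing step, assuming the csv form of the info response uses ERDDAP's usual columns (Row Type, Variable Name, Attribute Name, Data Type, Value) and that cdm_data_type is a global (NC_GLOBAL) attribute. The function name get_cdm_data_type and the sample text are made up for illustration; the real text would come from requesting the info URL with response="csv":

```python
import csv
import io


def get_cdm_data_type(info_csv_text):
    """Return the global cdm_data_type attribute from info csv text, or None."""
    reader = csv.DictReader(io.StringIO(info_csv_text))
    for row in reader:
        if (
            row["Row Type"] == "attribute"
            and row["Variable Name"] == "NC_GLOBAL"
            and row["Attribute Name"] == "cdm_data_type"
        ):
            return row["Value"]
    return None


# Illustrative sample only, not copied from a real dataset.
sample = """Row Type,Variable Name,Attribute Name,Data Type,Value
attribute,NC_GLOBAL,cdm_data_type,String,TimeSeries
attribute,NC_GLOBAL,title,String,Example dataset
variable,time,,double,
"""

print(get_cdm_data_type(sample))
```

Returning None when the attribute is missing lets the caller fall back to a default rather than fail on datasets with incomplete metadata.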

The last one might need a new ticket.

mwengren commented 3 years ago

Point-based map selection will be difficult with the current DataShader mapping approach in colocate-dev.ipynb, unfortunately.

I do not understand how DataShader works completely, but it essentially generalizes and grids features on the Python side before sending a rendered raster image back to the browser/JupyterLab for display. This allows us to get around the 100,000 - 1,000,000+ data point 'slowdown' effect in the map plot when using the traditional 'vector' data points in HoloViz.

I think a more realistic approach in a dashboard would be to have a pulldown list that is shown alongside the dataframe results from the 'Search Servers' button (erddap_query.query() function). You could select a row (aka dataset) from the pulldown and show a timeseries or curtain plot for the dataset depending on the featureType/cdm_data_type of the selected dataset, rather than interacting with the map to select a point. This seems more feasible to create, at least in the short term if we continue with DataShader.
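The plot selection could be as simple as a lookup on cdm_data_type. This is only a sketch: the function name choose_plot_type and the mapping itself are assumptions (time series for TimeSeries datasets, curtain plots for the profile types), not something already in colocate:

```python
def choose_plot_type(cdm_data_type):
    """Guess a plot type from an ERDDAP cdm_data_type; None if unsupported."""
    # Assumed mapping; extend as more plot types are supported.
    plot_types = {
        "timeseries": "timeseries",
        "timeseriesprofile": "curtain",
        "trajectoryprofile": "curtain",
    }
    if not cdm_data_type:
        return None
    # Compare case-insensitively, since featureType values vary in capitalization.
    return plot_types.get(cdm_data_type.strip().lower())
```

A callback on the pulldown could call this with the selected dataset's cdm_data_type and skip plotting (or show a message) when it returns None, e.g. for Grid or Other datasets.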

Another option would be to abandon DataShader and perhaps filter only to timeSeries datasets in erddap_query.query(), plot the resulting point locations using traditional HoloViz feature plots, and select a point (timeSeries) feature in the map to generate a timeSeries plot for it. This would be more in line with @yutik-nn's idea. Or we could attempt both :). The easiest starting point for this option is going to be the original colocate.ipynb notebook, which doesn't use DataShader.