Exploration of potential interfaces for alternative packages #22

pysal / splot

Lightweight plotting for geospatial analysis in PySAL

BSD 3-Clause "New" or "Revised" License


Which interactive package, if any, should be integrated into `splot`, and how?

Idea and experiment collection in order to decide:

how different alternative, interactive packages can be integrated into splot, and
which PySAL functionality could or should be supported interactively in future

To get a clear idea of what is possible and what a future integration could look like, I suggest running experiments that design APIs and assess what each alternative package can do. I will therefore test different packages by generating the following visualisations:

[ ] esda.moran.Moran_Local (scatterplot, LISA map, choropleth map)
[ ] if there is time, giddy.directional (heatmap, LISA maps, rose plot)

I am aiming for an API design that is as close as possible to the matplotlib API. This will show whether a set_backend('package') option for quickly switching backend packages is viable.
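A minimal sketch of how a set_backend('package') switch could work, assuming a simple registry that maps backend names to plotting functions. All names here (`register_backend`, `moran_scatterplot`, the placeholder return values) are hypothetical and do not exist in splot; the backend functions only return a description instead of drawing anything.

```python
# Hypothetical sketch: none of these names are part of splot.
_BACKENDS = {}
_current_backend = "matplotlib"


def register_backend(name):
    """Decorator that registers a plotting function under a backend name."""
    def decorator(func):
        _BACKENDS[name] = func
        return func
    return decorator


def set_backend(name):
    """Select which registered backend the shared entry point dispatches to."""
    global _current_backend
    if name not in _BACKENDS:
        raise ValueError(f"unknown backend {name!r}; available: {sorted(_BACKENDS)}")
    _current_backend = name


@register_backend("matplotlib")
def _moran_scatterplot_mpl(x, y, **kwargs):
    # Placeholder: a real implementation would draw on a matplotlib Axes here.
    return {"backend": "matplotlib", "n_points": len(x), "options": kwargs}


@register_backend("bokeh")
def _moran_scatterplot_bokeh(x, y, **kwargs):
    # Placeholder: a real implementation would build a bokeh figure here.
    return {"backend": "bokeh", "n_points": len(x), "options": kwargs}


def moran_scatterplot(x, y, **kwargs):
    """Single user-facing call; the signature is identical for every backend."""
    return _BACKENDS[_current_backend](x, y, **kwargs)
```

With this layout, user code calls `moran_scatterplot(...)` the same way regardless of backend, and `set_backend("bokeh")` changes only which implementation runs. Whether every backend can honour the same keyword arguments is exactly what the experiments would need to establish.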

I am testing under the assumption that interactive backends are used for exploring data and statistical results, not necessarily for publishing in a paper (we have the matplotlib interface for that). I will therefore keep the options for customising visualisations to a minimum. Ideally the interactive backend would become the default; if that is not possible, PySAL objects could provide an additional .plot_interactive() method. Interactive plotting could offer an attractive alternative to interfaces commonly used in the geographic sphere (software packages like GRASS, ArcMap, ...).
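As a rough illustration of the .plot_interactive() idea, the method could live in a small mixin that result objects inherit. This is only a sketch under assumptions: `InteractivePlotMixin` and `FakeMoranResult` are hypothetical names, the host object is assumed to expose `z` and `lag` sequences, and the method returns a plotly-style `data`/`layout` dictionary rather than rendering anything.

```python
class InteractivePlotMixin:
    """Hypothetical mixin; assumes the host object exposes `z` and `lag`."""

    def plot_interactive(self, marker_color="blue"):
        # Only one styling option is exposed, in line with keeping
        # customisation of interactive plots to a minimum.
        return {
            "data": [{
                "x": list(self.z),
                "y": list(self.lag),
                "mode": "markers",
                "marker": {"color": marker_color},
            }],
            "layout": {"title": "Moran Scatterplot", "showlegend": False},
        }


class FakeMoranResult(InteractivePlotMixin):
    """Stand-in for a PySAL result object, used only for this illustration."""

    def __init__(self, z, lag):
        self.z = z
        self.lag = lag


# Invented values, purely to show the call.
figure = FakeMoranResult([-1.0, 0.0, 1.0], [-0.5, 0.1, 0.4]).plot_interactive()
```

Returning a plain dictionary keeps the sketch independent of any particular plotting package; a real method would hand the result to whichever interactive backend is chosen.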

Interactive visualisation packages:

Bokeh (GeoViews)
Altair
Folium (potentially limited to underlying base maps only, so functionality is limited)
Plotly (hard to use and not well documented for offline use without a subscription)
D3 (lower priority since it is JavaScript-based and I am not familiar with JavaScript)

Testing criteria:

maturity of package (active community, examples, ...)
ease of use
is it possible to design an API similar or identical to the matplotlib API (without too much effort)?
how much additional functionality do the packages offer?
- e.g. dataframes,
- geo-shapes,
- plot on top of map base layers, etc.

Possible outcomes:

[ ] no use of interactive packages, instead continuing with interactive matplotlib functionality (widgets, masking options, ...)
[ ] use of interactive package as set_backend('bk') option
[ ] use of interactive package for additional functionality (plotting on map base layers)

Results

    import pandas as pd
    import pysal as ps
    from libpysal.weights.contiguity import Queen
    import geopandas as gpd
    import numpy as np
    from dfply import *  # dplyr-style functions (mask, select, group_by, etc.)
    from esda.moran import Moran, Moran_Local
    import plotly.offline as offline
    from plotly.offline import init_notebook_mode

    csv_path = ps.examples.get_path('usjoin.csv')
    usjoin = pd.read_csv(csv_path)

    years = list(range(1929, 2010))
    cols_to_calculate = list(map(str, years))

    shp_path = ps.examples.get_path('us48.shp')
    us48_map = gpd.read_file(shp_path)
    us48_map = us48_map[['STATE_FIPS', 'geometry']]
    us48_map.STATE_FIPS = us48_map.STATE_FIPS.astype(int)
    df_map = us48_map.merge(usjoin, on='STATE_FIPS')

    # Making the dataset tidy
    us_tidy = pd.melt(df_map,
                      id_vars=['Name', 'STATE_FIPS', 'geometry'],
                      value_vars=cols_to_calculate,
                      var_name='Year',
                      value_name='Income')

    # Function that calculates Per Capita Ratio
    def calculate_pcr(x):
        return x / np.mean(x)

    # Establishing a contiguity matrix for a specific year. It is the same for all years.
    W = Queen.from_dataframe(us_tidy[us_tidy.Year == '1929'])
    W.transform = 'r'

    # Function that calculates lagged value
    def calculate_lag_value(x):
        return ps.lag_spatial(W, x)

    us_tidy['PCR'] = us_tidy.groupby('Year').Income.transform(calculate_pcr)
    us_tidy = us_tidy.assign(
        Income_Lagged=us_tidy.groupby('Year').Income.transform(calculate_lag_value),
        PCR_Lagged=us_tidy.groupby('Year').PCR.transform(calculate_lag_value))

    y = (us_tidy >> mask(us_tidy.Year == '1929') >> select(us_tidy.PCR))
    moran = Moran(y, W)
    moran_loc = Moran_Local(y, W)

    #########################################
    # PLOT MORAN OR LOCAL MORAN SCATTERPLOT #
    #########################################
    def plotly_moran_scatterplot(moran_loc,               # PySAL Moran or Local-Moran object
                                 zstandard=True,
                                 reference_lines=True,    # Horizontal and vertical lines
                                 jupyter=False,           # If user is running in a jupyter notebook
                                 marker_size=5,
                                 marker_color='blue',
                                 fit_line=True,           # Fit regression line
                                 line_width=1.5,
                                 line_color='red'):
        # Use the standardised variable if requested, otherwise the raw values
        if zstandard:
            Var = moran_loc.z
        else:
            Var = moran_loc.y

        VarLag = ps.lag_spatial(moran_loc.w, Var)

        if fit_line:
            # np.polyfit returns the slope first, then the intercept
            b, a = np.polyfit(Var, VarLag, 1)
            fit_line_data = {'x': [min(Var), max(Var)],
                             'y': [a + i * b for i in [min(Var), max(Var)]],
                             'mode': 'lines',
                             'line': {'width': line_width, 'color': line_color}}
        else:
            fit_line_data = {}

        if reference_lines:
            # Horizontal line at the mean of the lagged variable (y axis)
            h_line_data = {'x': [min(Var), max(Var)],
                           'y': [VarLag.mean(), VarLag.mean()],
                           'mode': 'lines',
                           'line': {'width': 1, 'color': 'gray'}}
            # Vertical line at the mean of the original variable (x axis)
            v_line_data = {'x': [Var.mean(), Var.mean()],
                           'y': [min(VarLag), max(VarLag)],
                           'mode': 'lines',
                           'line': {'width': 1, 'color': 'gray'}}
        else:
            h_line_data = {}
            v_line_data = {}

        fig = {
            'data': [
                {'x': Var,
                 'y': VarLag,
                 'mode': 'markers',
                 'marker': {'size': marker_size, 'color': marker_color},
                 'text': moran_loc.p_sim},
                fit_line_data,
                h_line_data,
                v_line_data
            ],
            'layout': {
                'xaxis': {'title': 'Original Variable', 'showgrid': False, 'zeroline': False},
                'yaxis': {'title': 'Lagged Variable', 'showgrid': False, 'zeroline': False},
                'showlegend': False,
                'title': 'Moran Scatterplot'
            }
        }

        if jupyter:
            init_notebook_mode(connected=True)
            plotly_fig = offline.iplot(fig)
        else:
            plotly_fig = offline.plot(fig)

        return plotly_fig

    # Example
    plotly_moran_scatterplot(moran_loc,
                             zstandard=False,
                             reference_lines=True,
                             marker_size=5,
                             marker_color='blue',
                             fit_line=True,
                             line_width=2.5,
                             line_color='red')
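To make the quantities behind the scatterplot easier to check without PySAL or plotly installed, here is a small NumPy-only sketch. It is not part of the experiment above: the four-region adjacency matrix and the values are invented for illustration. It shows the row-standardised spatial lag and the fitted slope, which for a mean-centred variable and row-standardised weights equals global Moran's I.

```python
import numpy as np

# Toy data, invented for illustration: four regions in a row (0-1-2-3),
# so each region neighbours the adjacent indices only.
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
values = np.array([1.0, 2.0, 4.0, 7.0])

# Row-standardise the weights so each row sums to 1
# (the same effect as W.transform = 'r').
weights = adjacency / adjacency.sum(axis=1, keepdims=True)

# Standardise the variable, then compute its spatial lag
# (the weighted average of each region's neighbours).
z = (values - values.mean()) / values.std()
z_lag = weights @ z

# Slope and intercept of the regression of the lag on the variable;
# this is the fit line drawn in the scatterplot.
slope, intercept = np.polyfit(z, z_lag, 1)

# Moran's I computed directly. With row-standardised weights the
# n / S0 factor equals 1, so I reduces to z'Wz / z'z.
morans_i = (z @ weights @ z) / (z @ z)

print(f"slope = {slope:.4f}, Moran's I = {morans_i:.4f}")
```

The two printed numbers agree, which is a quick sanity check for any backend's fit line: the slope drawn in the interactive plot should match the `I` attribute of the corresponding `Moran` object when standardised values and row-standardised weights are used.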
