pysal / splot

Lightweight plotting for geospatial analysis in PySAL
BSD 3-Clause "New" or "Revised" License

Exploration of potential interfaces for alternative packages #22

Open slumnitz opened 6 years ago

slumnitz commented 6 years ago

This issue addresses the question of whether, how, and which interactive package to integrate into splot.

Idea and Experiment collection in order to decide on:

To get a clear idea of what is possible and what a future integration could look like, I suggest running experiments that design APIs and assess what is possible with alternative packages. I will therefore test different packages generating the

I am testing under the assumption that interactive backends are used for exploring data and statistical results, not necessarily for publication in a paper (we have the matplotlib interface for that). I will therefore keep the options for customising visualisations to a minimum. Ideally the interactive backend would be the default; if that is not possible, PySAL objects could provide an additional method, .plot_interactive(). Interactive plotting could offer an attractive alternative to interfaces commonly used in the geographic sphere (software packages like GRASS, ArcMaps...).
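One way the two options above (an interactive default versus an extra .plot_interactive() method) could sit side by side is a small backend registry. The sketch below is only an illustration under assumptions: `register_backend`, `PlotMixin`, `ToyMoran` and the placeholder renderers are hypothetical names, not part of splot or PySAL, and the renderers return plain dictionaries instead of drawing anything.

```python
from typing import Any, Callable, Dict

# Hypothetical registry mapping a backend name to a rendering function.
_BACKENDS: Dict[str, Callable[..., Any]] = {}


def register_backend(name: str) -> Callable:
    """Register a rendering function under a backend name."""
    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        _BACKENDS[name] = func
        return func
    return decorator


@register_backend("matplotlib")
def _render_static(result: Any, **options: Any) -> Dict[str, Any]:
    # Placeholder: a real backend would draw with matplotlib here.
    return {"backend": "matplotlib", "object": type(result).__name__, **options}


@register_backend("interactive")
def _render_interactive(result: Any, **options: Any) -> Dict[str, Any]:
    # Placeholder: a real backend would call plotly, bokeh, etc. here.
    return {"backend": "interactive", "object": type(result).__name__, **options}


class PlotMixin:
    """Hypothetical mixin giving a statistics object both plotting entry points."""

    def plot(self, backend: str = "matplotlib", **options: Any) -> Any:
        if backend not in _BACKENDS:
            raise ValueError(
                f"unknown backend {backend!r}; choose from {sorted(_BACKENDS)}"
            )
        return _BACKENDS[backend](self, **options)

    def plot_interactive(self, **options: Any) -> Any:
        # Thin convenience wrapper around plot(backend="interactive").
        return self.plot(backend="interactive", **options)


class ToyMoran(PlotMixin):
    """Stand-in for a PySAL result object; not the real esda class."""


spec = ToyMoran().plot_interactive(marker_size=5)
print(spec)
```

With this layout, switching the default is a one-line change to the `backend` argument of `plot`, while `.plot_interactive()` stays available either way.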

Interactive visualisation packages:

Testing criteria:

  1. maturity of package (active community, examples, ...)
  2. ease of use
  3. is it possible to design an API similar to, or exactly like, the matplotlib API (without too much effort)?
  4. How much additional functionality do packages offer?
    • e.g. dataframes,
    • geo-shapes,
    • plot on top of map base layers, etc.

Possible outcomes:

Results

renanxcortes commented 6 years ago

Hi Stefanie!

I have been trying to develop some plotly functions based on the native splot functions. I think I managed to get the scatterplot working offline, both in the terminal and in Jupyter notebooks.

Have you worked on any of the ideas in this issue? I've heard that some interactivity is already implemented in splot, but I couldn't find it yet.

An example of a plotly scatterplot function would be:

import pandas as pd
import pysal as ps
from libpysal.weights.contiguity import Queen
import geopandas as gpd
import numpy as np
from dfply import * # dplyr-style functions (mask, select, group_by, etc.)

from esda.moran import Moran, Moran_Local
import plotly.offline as offline
from plotly.offline import init_notebook_mode

csv_path = ps.examples.get_path('usjoin.csv')
usjoin = pd.read_csv(csv_path)

years = list(range(1929, 2010))                  
cols_to_calculate = list(map(str, years))

shp_path = ps.examples.get_path('us48.shp')
us48_map = gpd.read_file(shp_path)
us48_map = us48_map[['STATE_FIPS','geometry']]
us48_map.STATE_FIPS = us48_map.STATE_FIPS.astype(int)
df_map = us48_map.merge(usjoin, on='STATE_FIPS')

# Making the dataset tidy
us_tidy = pd.melt(df_map, 
                  id_vars=['Name', 'STATE_FIPS', 'geometry'],
                  value_vars=cols_to_calculate, 
                  var_name='Year', 
                  value_name='Income')

# Function that calculates Per Capita Ratio
def calculate_pcr(x):
    return x / np.mean(x)

# Establishing a contiguity matrix for a specific year. It is the same for all years.
W = Queen.from_dataframe(us_tidy[us_tidy.Year == '1929'])
W.transform = 'r'

# Function that calculates lagged value
def calculate_lag_value(x):
    return ps.lag_spatial(W, x)

# transform keeps the original row index, so the result aligns with us_tidy
us_tidy['PCR'] = us_tidy.groupby('Year').Income.transform(calculate_pcr)
us_tidy = us_tidy.assign(Income_Lagged = us_tidy.groupby('Year').Income.transform(calculate_lag_value),
                         PCR_Lagged = us_tidy.groupby('Year').PCR.transform(calculate_lag_value))

y = (us_tidy >> 
        mask(us_tidy.Year == '1929') >> 
        select(us_tidy.PCR))

moran = Moran(y, W)
moran_loc = Moran_Local(y, W)

#########################################
# PLOT MORAN OR LOCAL MORAN SCATTERPLOT #
#########################################
def plotly_moran_scatterplot(moran_loc,              # PySAL Moran or Local-Moran object
                             zstandard = True,
                             reference_lines = True, # Horizontal and Vertical lines
                             jupyter = False,        # If user is running in a jupyter notebook
                             marker_size = 5, 
                             marker_color = 'blue',
                             fit_line = True,        # Fit regression line
                             line_width = 1.5,
                             line_color = 'red'):

    # Use the standardised values (z) or the raw values (y) of the variable
    if zstandard:
        Var = moran_loc.z
    else:
        Var = moran_loc.y

    VarLag = ps.lag_spatial(moran_loc.w, Var)

    if fit_line:
        # np.polyfit returns coefficients highest degree first: slope b, intercept a
        b, a = np.polyfit(Var, VarLag, 1)
        fit_line_data = {'x': [min(Var), max(Var)],
                         'y': [a + i * b for i in [min(Var), max(Var)]],
                         'mode': 'lines',
                         'line': {'width': line_width,
                                  'color': line_color}}
    else:
        fit_line_data = {}

    if reference_lines:
        # Horizontal line at the mean of the lagged variable (y-axis)
        h_line_data = {'x': [min(Var), max(Var)],
                       'y': [VarLag.mean(), VarLag.mean()],
                       'mode': 'lines',
                       'line': {'width': 1,
                                'color': 'gray'}}
        # Vertical line at the mean of the original variable (x-axis)
        v_line_data = {'x': [Var.mean(), Var.mean()],
                       'y': [min(VarLag), max(VarLag)],
                       'mode': 'lines',
                       'line': {'width': 1,
                                'color': 'gray'}}
    else:
        h_line_data = {}
        v_line_data = {}

    fig = {
        'data': [
            {
                'x': Var,
                'y': VarLag,
                'mode': 'markers',
                'marker': {'size': marker_size,
                           'color': marker_color},
                'text': moran_loc.p_sim
            },
            fit_line_data,
            h_line_data,
            v_line_data
        ],

        'layout': {
            'xaxis': {'title': 'Original Variable', 
                      'showgrid': False,
                      'zeroline': False},
            'yaxis': {'title': 'Lagged Variable',
                      'showgrid': False,
                      'zeroline':False},
            'showlegend': False,
            'title': 'Moran Scatterplot'
        }
    }

    # offline.iplot renders inline in a notebook; offline.plot writes an HTML file
    if jupyter:
        init_notebook_mode(connected=True)
        plotly_fig = offline.iplot(fig)
    else:
        plotly_fig = offline.plot(fig)

    return plotly_fig

# Example
plotly_moran_scatterplot(moran_loc,
                         zstandard = False, 
                         reference_lines = True,
                         marker_size = 5,
                         marker_color = 'blue',
                         fit_line = True, 
                         line_width = 2.5,
                         line_color = 'red'
                         )
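
The quantities drawn by the function above (the lagged variable, the fitted line and the reference-line positions) can also be checked without plotly or PySAL. The following is a minimal sketch on made-up data: the four values in `y` and the chain-shaped weights matrix `W` are assumptions chosen for illustration, not the usjoin data. It relies on the fact that, for a mean-centred variable and row-standardised weights, the slope of the regression of the lag on the variable equals Moran's I.

```python
import numpy as np

# Hypothetical toy data: four regions in a row, each neighbouring the adjacent ones.
y = np.array([1.0, 2.0, 4.0, 7.0])
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
W = W / W.sum(axis=1, keepdims=True)   # row-standardise, like W.transform = 'r'

z = (y - y.mean()) / y.std()           # standardised variable (x-axis)
z_lag = W @ z                          # spatial lag (y-axis)

# Fitted line through the scatter: polyfit returns slope first, then intercept.
slope, intercept = np.polyfit(z, z_lag, 1)

# Moran's I for row-standardised weights, computed directly.
morans_i = (z @ z_lag) / (z @ z)

# Reference lines: vertical at the mean of z, horizontal at the mean of the lag.
v_line_x = z.mean()
h_line_y = z_lag.mean()

print(slope, morans_i)
```

The two printed numbers should agree up to floating-point error, which gives a quick sanity check for the fitted line in any backend.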