This tutorial is an introduction to geospatial data analysis in Python, with a focus on tabular vector data. It first focuses on introducing the participants to the different libraries to work with geospatial data and will cover munging geo-data and exploring relations over space. This includes importing data in different formats (e.g. shapefile, GeoJSON), visualizing, combining and tidying them up for analysis, and will use libraries such as pandas
, geopandas
, shapely
, PySAL
, or rasterio
. The second part will build upon this and focus on more more advanced geographic data science and statistical methods to gain insight from the data. No previous experience with those geospatial python libraries is needed, but basic familiarity with geospatial data and concepts (shapefiles, vector vs raster data) and pandas will be helpful.
7:50 - 8:20: Installation & setup
If you cannot complete the installation instructions ahead of time, please come slightly early so we can work on ensuring everyone can get the required packages installed and so that you can run the workshop material & exercises
8:20 - 9:00: Working with spatial data
9:00-10:00: Spatial Relationships & Joins
10:00 - 10:10: Break
10:10 - 11:00: Exploratory spatial data analysis
11:00 - 12:00: Leveraging space in modeling
Following this tutorial will require recent installations of:
If you do not yet have these packages installed, we recommend to use the conda package manager to install all the requirements (you can install miniconda or install the (larger) Anaconda distribution, found at https://www.anaconda.com/download/).
Once this is installed, the following command will install all required packages in your Python environment:
conda env create -f environment.yml
But of course, using another distribution (e.g. Enthought Canopy) or pip
is fine as well (a requirements file is provided as well), as long as you have the above packages installed.
NOTE: We may update the materials up until the workshop. So, please make sure that, if you download the materials, you refresh the downloaded material close to the workshop.
If you have git installed, you can get the tutorial materials by cloning this repo:
git clone https://github.com/geopandas/scipy2018-geospatial-data
Otherwise, you can download the repository as a .zip file by heading over to the GitHub repository (https://github.com/geopandas/scipy2018-geospatial-data) in your browser and click the green "Download" button in the upper right:
To make sure everything was installed correctly, open a terminal, and change its directory (cd
) so that your working directory is the tutorial materials you downloaded in the step above. Then enter the following:
python check_environment.py
Make sure that this scripts prints "All good. Enjoy the tutorial!"