crackernutter / EsriRESTScraper

A Python class that scrapes ESRI Rest Endpoints and exports data to a geodatabase
MIT License

EsriRESTScraper

A lightweight Python module (tested in Python 3.x and 2.x) that scrapes ESRI REST endpoints and parses the data into a local or enterprise geodatabase feature class.

Updates for Python 3

Some updates in the latest release

Dependencies

Instructions

This class is instantiated with the Esri REST Endpoint of a feature layer inside a map service. For secured map services, you can include an optional token when instantiating the class.
For example:

import RestCacheClass
earthquakesScraper = RestCacheClass.RestCache("https://earthquake.usgs.gov/arcgis/rest/services/eq/event_30DaySignificant/MapServer/0")

Constructor

def __init__(self, url, token=None, userFields=[], excludeFields=[]):

When instantiated, the RestCache object scrapes the feature layer page for its various attributes: fields, wkid, max record count, name, and geometry type.
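To make those attributes concrete, here is a minimal sketch of pulling them out of a layer's JSON description (what an ArcGIS REST layer URL returns with `f=json`). The `summarize_layer` helper and the trimmed `sample_layer` dictionary are illustrative assumptions, not code from this module, and the class itself may read the metadata differently.

```python
def summarize_layer(layer_json):
    """Extract name, geometry type, max record count, wkid, and fields
    from a layer description dictionary (hypothetical helper)."""
    extent = layer_json.get("extent") or {}
    spatial_ref = extent.get("spatialReference") or {}
    return {
        "name": layer_json.get("name"),
        "geometryType": layer_json.get("geometryType"),
        "maxRecordCount": layer_json.get("maxRecordCount"),
        "wkid": spatial_ref.get("wkid"),
        # Keep only the name and Esri type of each field
        "fields": [(f["name"], f["type"]) for f in layer_json.get("fields", [])],
    }


# Trimmed, made-up response used only for illustration
sample_layer = {
    "name": "Significant Earthquakes",
    "geometryType": "esriGeometryPoint",
    "maxRecordCount": 1000,
    "extent": {"spatialReference": {"wkid": 4326}},
    "fields": [
        {"name": "magnitude", "type": "esriFieldTypeDouble"},
        {"name": "place", "type": "esriFieldTypeString", "length": 255},
    ],
}

summary = summarize_layer(sample_layer)
print(summary["name"], summary["wkid"], summary["fields"])
```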

This class has three primary methods:

  1. createFeatureClass
  2. updateFeatureClass
  3. recreateFeatureClass

createFeatureClass

This method creates a feature class in a geodatabase with the appropriate spatial reference, geometry type, and fields (including field lengths, etc.). You only need to run this method once; after that you have the correct feature class locally, and all you need to do is update it.

The name of the feature class is derived from the name in the REST endpoint, although this can be overridden with an optional parameter.

Signature

def createFeatureClass(self, location, name, excludeFields=[]):

Issues:

  1. The method only supports creating a feature class in a geodatabase (enterprise or local), not a shapefile. If someone wants to modify this to support creating other types of workspaces, please do so!!
  2. Not every field type is supported, but the most common ones are: text, date, short, long, double, and float.

earthquakesFeatureClass = earthquakesScraper.createFeatureClass(r'C:\Geodata\earthquakes.gdb', 'earthquakes')
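As a rough illustration of the supported field types listed above, the sketch below maps Esri REST field types to geodatabase field type keywords and reports anything it cannot map. The `ESRI_TO_GDB_FIELD_TYPE` table and the `plan_fields` helper are assumptions made for this example; the module's own mapping may differ.

```python
# Esri REST field type -> geodatabase field type keyword (assumed mapping)
ESRI_TO_GDB_FIELD_TYPE = {
    "esriFieldTypeString": "TEXT",
    "esriFieldTypeDate": "DATE",
    "esriFieldTypeSmallInteger": "SHORT",
    "esriFieldTypeInteger": "LONG",
    "esriFieldTypeDouble": "DOUBLE",
    "esriFieldTypeSingle": "FLOAT",
}


def plan_fields(rest_fields, exclude_fields=()):
    """Split REST fields into (name, type, length) tuples that can be created
    and a list of field names whose type has no mapping (hypothetical helper)."""
    excluded = {name.lower() for name in exclude_fields}
    supported, skipped = [], []
    for field in rest_fields:
        if field["name"].lower() in excluded:
            continue  # the caller asked to leave this field out
        gdb_type = ESRI_TO_GDB_FIELD_TYPE.get(field["type"])
        if gdb_type is None:
            skipped.append(field["name"])
        else:
            supported.append((field["name"], gdb_type, field.get("length")))
    return supported, skipped


# Made-up field definitions for illustration
rest_fields = [
    {"name": "place", "type": "esriFieldTypeString", "length": 255},
    {"name": "magnitude", "type": "esriFieldTypeDouble"},
    {"name": "thumbnail", "type": "esriFieldTypeBlob"},
    {"name": "internal_id", "type": "esriFieldTypeInteger"},
]

supported, skipped = plan_fields(rest_fields, exclude_fields=["internal_id"])
print(supported)
print(skipped)
```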

updateFeatureClass

This method makes one or more REST calls, parses the data, and updates your local geodatabase feature class. Pretty straightforward. It accepts as input the feature class to update, a single query or list of queries (the default is "1=1"), and a Boolean indicating whether to append to the existing dataset or overwrite it (the default is to overwrite, since I didn't want to deal with differentials).

The method will gracefully end if there is a schema mismatch between the REST endpoint and the feature class to update. This is to account for situations when the service definition changes without your knowledge. However, you can also account for schema differences using the excludeFields and userFields parameters.
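One way to picture the schema check is as a comparison of two sets of field names. The sketch below assumes that `excludeFields` names service fields to ignore and `userFields` names extra fields that exist only in the local feature class; that reading, the case-insensitive comparison, and the `schema_differences` helper are all assumptions for illustration, not the module's actual logic.

```python
def schema_differences(rest_field_names, fc_field_names,
                       user_fields=(), exclude_fields=()):
    """Return (missing_locally, unexpected_locally) as sorted lists of
    lower-cased field names (hypothetical helper)."""
    expected = ({n.lower() for n in rest_field_names}
                - {n.lower() for n in exclude_fields})
    actual = ({n.lower() for n in fc_field_names}
              - {n.lower() for n in user_fields})
    return sorted(expected - actual), sorted(actual - expected)


# Made-up field lists for illustration
rest_names = ["magnitude", "place", "depth", "internal_id"]
local_names = ["magnitude", "place", "reviewed_by"]

# Without any hints, the two schemas disagree
print(schema_differences(rest_names, local_names))

# Declaring the known differences makes the schemas line up
print(schema_differences(rest_names, local_names,
                         user_fields=["reviewed_by"],
                         exclude_fields=["depth", "internal_id"]))
```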

The method will capture all records in a query, even if a query would return more records than the MaxRecordCount, if the service supports pagination (most do). Otherwise, it will fail if the query would return more records than the MaxRecordCount.
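The pagination idea can be sketched with the standard ArcGIS REST query parameters `resultOffset` and `resultRecordCount`, repeating the request while the response reports `exceededTransferLimit`. The `fetch_all_features` function and the injected `fetch_page` callable are hypothetical names for this example; the module's own request loop may be structured differently.

```python
def fetch_all_features(fetch_page, where="1=1", page_size=1000):
    """Collect every feature for a query by paging with resultOffset.

    fetch_page is any callable that takes a dict of query parameters and
    returns the decoded JSON response as a dict (assumed interface)."""
    features = []
    offset = 0
    while True:
        params = {
            "where": where,
            "outFields": "*",
            "resultOffset": offset,
            "resultRecordCount": page_size,
            "f": "json",
        }
        page = fetch_page(params)
        batch = page.get("features", [])
        features.extend(batch)
        # Stop when the page is empty or the server reports nothing left
        if not batch or not page.get("exceededTransferLimit"):
            break
        offset += len(batch)
    return features


def fake_fetch_page(params):
    """Stand-in for a real HTTP call: serves 25 made-up records in pages."""
    rows = [{"attributes": {"id": i}} for i in range(25)]
    start = params["resultOffset"]
    end = start + params["resultRecordCount"]
    return {"features": rows[start:end], "exceededTransferLimit": end < len(rows)}


all_rows = fetch_all_features(fake_fetch_page, page_size=10)
print(len(all_rows))
```

In real use, `fetch_page` would wrap an HTTP request to the layer's `/query` endpoint; injecting it keeps the paging logic easy to test without a network connection.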

earthquakesScraper.updateFeatureClass(earthquakesFeatureClass, ["magnitude > 4"])

Signature

def updateFeatureClass(self, featureClassDestination, query=["1=1"], append=False, userFields=[], excludeFields=[], debug=False, debugLoc=sys.path[0]):

recreateFeatureClass

This method will essentially recreate the feature class to match the current service definition schema. It deletes all fields, and re-adds the fields from the service definition. Useful for a workflow where you catch a SchemaMismatch error, recreate the feature class, then update.

Signature

def recreateFeatureClass(self, target, userFields=[], excludeFields=[]):
        """Method to recreate target feature class by recreating fields from REST Endpoint
        Can be invoked if SchemaMismatch error is thrown and caught"""
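The catch, recreate, then update workflow described above might look like the sketch below. Because this README does not show where the SchemaMismatch exception is defined, the exception class is passed in as a parameter; `update_with_recreate`, `FakeScraper`, and `FakeSchemaMismatch` are stand-ins invented for this example so that it runs without arcpy or a live service.

```python
def update_with_recreate(scraper, target, queries, schema_error):
    """Try an update; on a schema mismatch, recreate the fields and retry once.

    schema_error is the exception class raised on a mismatch (assumed to be
    the module's SchemaMismatch error in real use)."""
    try:
        scraper.updateFeatureClass(target, queries)
        return "updated"
    except schema_error:
        scraper.recreateFeatureClass(target)
        scraper.updateFeatureClass(target, queries)
        return "recreated and updated"


class FakeSchemaMismatch(Exception):
    """Stand-in for the module's schema mismatch error."""


class FakeScraper:
    """Stand-in for a RestCache object whose target starts out stale."""

    def __init__(self):
        self.calls = []
        self.stale = True

    def updateFeatureClass(self, target, queries):
        self.calls.append("update")
        if self.stale:
            raise FakeSchemaMismatch("fields differ from the service definition")

    def recreateFeatureClass(self, target):
        self.calls.append("recreate")
        self.stale = False


scraper = FakeScraper()
result = update_with_recreate(scraper, "earthquakes", ["magnitude > 4"], FakeSchemaMismatch)
print(result, scraper.calls)
```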

Please let me know if you have any questions!

Previous version updates

Some less important updates