🚜📦✨ Slinging data all over the place 🚜📦✨
MIT License

🚜📦✨ forklift

A python CLI tool for managing and organizing the repetitive tasks involved with keeping remote geodatabases in sync with their sources. In other words, it is a tool to tame your scheduled task nightmare.

basically forklift

https://xkcd.com/2054/

Rules

The first rule of 🚜 is it does not work on any sabbath.

The second rule of 🚜 is that it's out of your element Donny.

Usage

The work that forklift does is defined by Pallets. forklift.models.Pallet is a base class; you define a job for forklift to perform by creating a new class that inherits from Pallet. Each pallet should have Pallet in its file name and be unique among other pallets run by forklift.

A Pallet can have zero or more Crates. forklift.models.Crate is a class that defines data that will be moved from one location to another (reprojecting to web mercator by default). Crates are created by calling the add_crates (or add_crate) methods within the build method on the pallet. For example:

from os import path

from forklift.models import Pallet


class MyPallet(Pallet):
    def __init__(self):
        #: this is required to initialize the Pallet base class properties
        super(MyPallet, self).__init__()

    def build(self, configuration):
        #: all operations that can throw an exception should be done in build
        destination_workspace = 'C:\\MapData'
        #: self.garage is the forklift garage folder that holds the .sde connection files
        source_workspace = path.join(self.garage, 'connection.sde')

        self.add_crate('Counties', {'source_workspace': source_workspace,
                                    'destination_workspace': destination_workspace})

For details on all of the members of the Pallet and Crate classes see models.py.

For examples of pallets see samples/PalletSamples.py.
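
The file naming rule above can be checked before handing a folder to forklift. The following is a minimal sketch, not forklift's own discovery code: find_pallet_files is a hypothetical helper, and the case-insensitive match on "pallet" is an assumption.

```python
from pathlib import Path


def find_pallet_files(folder):
    """Return the .py files under folder whose file name contains 'pallet'.

    Hypothetical helper for illustration only. Matching case-insensitively
    is an assumption of this sketch, not documented forklift behavior.
    """
    return sorted(
        candidate
        for candidate in Path(folder).rglob("*.py")
        if "pallet" in candidate.name.lower()
    )
```

Calling find_pallet_files on a cloned pallet repository lists the files that follow the naming rule, so a file such as helpers.py is left out.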

CLI

Interacting with forklift is done via the command line interface. Run forklift -h for a list of all of the available commands.

Config File Properties

config.json is created in the working directory after running forklift config init. It contains the following properties:

Any of these properties can be set via the config set command like so:

forklift config set --key sendEmails --value False

If the property is a list then the value is appended to the existing list.
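
The set-or-append behavior described above can be pictured with a small sketch. This is not forklift's implementation: set_config_value is a hypothetical function, and the notify key is an assumed list property used only for illustration.

```python
import json


def set_config_value(config, key, value):
    """Set config[key], or append to it when the current value is a list.

    Hypothetical helper that only mirrors the behavior described in the text.
    """
    current = config.get(key)
    if isinstance(current, list):
        current.append(value)
    else:
        config[key] = value
    return config


# "notify" is an assumed list-valued property, not taken from the README.
config = {"sendEmails": True, "notify": ["a@example.com"]}
set_config_value(config, "sendEmails", False)
set_config_value(config, "notify", "b@example.com")
print(json.dumps(config, indent=2))
```

The scalar property is overwritten, while the list property gains a second entry.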

Metadata

Metadata is only copied from source to destination when the destination is first created, not on subsequent data updates. If you want to push metadata updates, delete the destination in the hashing folder and then it will be updated when it is recreated on the next lift.

Install to First Successful Run

From within the ArcGIS Pro conda environment (c:\Program Files\ArcGIS\Pro\bin\Python\scripts\proenv.bat):

  1. Install git.

  2. Install Visual Studio Build Tools with the Desktop development with C++ module.

  3. Install ArcGIS Pro.

  4. Add ArcGIS Pro to your path.

    • If installed for all users: c:\Program Files\ArcGIS\Pro\bin\Python\scripts\.
    • If installed for a single user: C:\Users\{USER}\AppData\Local\Programs\ArcGIS\Pro\bin\Python\Scripts.
  5. Create a conda environment for forklift: conda create --name forklift python=3.9.

  6. Activate the conda environment: activate forklift.

  7. conda install arcpy -c esri

  8. Clone the forklift repository: git clone https://github.com/agrc/forklift.git

  9. pip install .\ from the directory containing setup.py.

  10. Install the python dependencies for your pallets.

  11. forklift config init

  12. forklift config repos --add agrc/parcels - agrc/parcels is the user/repo to scan for Pallets.

  13. forklift garage open - Opens garage directory. Copy all connection.sde files to the forklift garage.

  14. forklift git-update - Updates pallet repos. Add any secrets or supplementary data your pallets need that is not in source control.

  15. Edit the config.json to add the ArcGIS Server machine(s) to manage. The options property is mixed into each of the other server entries.

    • username ArcGIS admin username.
    • password ArcGIS admin password.
    • host ArcGIS host address, e.g. myserver. Validate this property against the machineName property returned by /arcgis/admin/machines?f=json.
    • port ArcGIS Server instance port, e.g. 6080.
    "servers": {
      "options": {
          "username": "mapserv",
          "password": "test",
          "port": 6080
      },
      "primary": {
          "host": "this.is.the.qualified.name.as.seen.in.arcgis.server.machines"
      },
      "secondary": {
          "host": "this.is.the.qualified.name.as.seen.in.arcgis.server.machines"
      },
      "backup": {
          "host": "this.is.the.qualified.name.as.seen.in.arcgis.server.machines",
          "username": "test",
          "password": "password",
          "port": 6443
      }
    }
  16. Edit the config.json to add the email notification properties. (This is required for sending email reports.)

    • smtpServer The SMTP server that you want to send emails with.
    • smtpPort The SMTP port number.
    • fromAddress The from email address for emails sent by forklift.
    "email": {
       "smtpServer": "smtp.server.address",
       "smtpPort": 25,
       "fromAddress": "noreply@utah.gov"
    }
  17. forklift lift

  18. forklift ship
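
Step 15 says the options entry is mixed into every other server. A minimal sketch of that idea follows. It is not forklift's code: resolve_servers is a hypothetical function, the host names are placeholders, and letting a server's own values override the shared options is an assumption suggested by the backup entry in the sample config.

```python
def resolve_servers(servers):
    """Mix the shared 'options' entry into every named server.

    Hypothetical helper. Values set on a server are assumed to win over the
    shared options.
    """
    options = servers.get("options", {})
    return {
        name: {**options, **settings}
        for name, settings in servers.items()
        if name != "options"
    }


# Placeholder host names, used only for illustration.
servers = {
    "options": {"username": "mapserv", "password": "test", "port": 6080},
    "primary": {"host": "primary.example.com"},
    "backup": {
        "host": "backup.example.com",
        "username": "test",
        "password": "password",
        "port": 6443,
    },
}

resolved = resolve_servers(servers)
print(resolved["primary"]["port"])  # inherited from options
print(resolved["backup"]["port"])   # set on the backup entry itself
```

Under these assumptions the primary server picks up the shared credentials and port, while the backup server keeps the values defined on its own entry.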

run_forklift.bat is an example of a batch file that could be used to run forklift via the Windows Scheduler.
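
The same two commands can also be driven from Python. The sketch below is an illustration rather than a copy of run_forklift.bat: lift_then_ship is a hypothetical function, and skipping ship when lift returns a non-zero exit code is an assumption of this sketch.

```python
import subprocess


def lift_then_ship(run=subprocess.run):
    """Run `forklift lift`, then `forklift ship` if lift exits with 0.

    Hypothetical wrapper. The `run` parameter exists so the function can be
    exercised without forklift installed.
    """
    for command in (["forklift", "lift"], ["forklift", "ship"]):
        result = run(command)
        if result.returncode != 0:
            # Assumption: stop at the first failing step and report its code.
            return result.returncode
    return 0
```

Call lift_then_ship() from a script started in the activated forklift environment so that the forklift command is on the path.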

Upgrading Forklift

From the root of the forklift source code folder:

  1. Activate forklift environment: activate forklift
  2. Pull any new updates from GitHub: git pull origin master
  3. Pip install with the upgrade option: pip install .\ -U

Upgrading ArcGIS Pro

  1. Upgrade ArcGIS Pro

There is no second step if you originally created a fresh conda environment (not cloned from arcgispro-py3) and installed arcpy via conda install arcpy -c esri.

If you do need to recreate the forklift environment from scratch, follow these steps:

  1. Copy the forklift-garage folder to a temporary location.
  2. Activate forklift environment: activate forklift
  3. Export conda packages: conda env export > env.yaml
  4. Export pip packages: pip freeze > requirements.txt
  5. Remove and make note of any packages in requirements.txt that are not published to PyPI, such as forklift.
  6. Deactivate forklift environment: deactivate
  7. Remove forklift environment: conda remove --name forklift --all
  8. Create new forklift environment: conda create --clone arcgispro-py3 --name forklift --pinned
  9. Activate new environment: activate forklift
  10. Reinstall conda packages: conda env update -n forklift -f env.yaml
  11. Reinstall pip packages: pip install -r requirements.txt
  12. Copy the forklift-garage folder to the site-packages folder of the newly created environment.
  13. Reinstall forklift and any other missing pip package (from root of project): pip install .\
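
Step 5 above can be partly automated. The sketch below is only a heuristic: split_requirements is a hypothetical helper that flags pip freeze lines written as direct references (name @ url) or editable installs (-e ...). A locally installed package that pip freeze reports as a plain name==version pin would not be caught, so the result still needs a manual review.

```python
def split_requirements(lines):
    """Separate plain pins from direct-reference or editable entries.

    Hypothetical helper. Returns (pinned, other), where other holds the lines
    that pip would not resolve from a package index by name alone.
    """
    pinned, other = [], []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        is_local = line.startswith("-e ") or " @ " in line
        (other if is_local else pinned).append(line)
    return pinned, other


# Example freeze output; the versions and the local path are made up.
frozen = """\
requests==2.31.0
forklift @ file:///C:/Projects/forklift
pytest==7.4.0
"""

pinned, other = split_requirements(frozen.splitlines())
print("keep in requirements.txt:", pinned)
print("reinstall by hand:", other)
```

The pinned list can be written back to requirements.txt, and the other list is the set of packages to note for the final reinstall step.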

Development Usage

Tests

Tests should show up in VSCode's test explorer.

To run them from the command line: pytest

Tests that depend on a local SDE database (see tests/data/UPDATE_TESTS.bak) will automatically be skipped if it is not found on your system.

To run a specific test or suite: pytest -k <test/suite name>