Chicago / open-data-etl-utility-kit

Use Pentaho's open source data integration tool (Kettle) to create Extract-Transform-Load (ETL) processes to update a Socrata open data portal. Documentation is available at http://open-data-etl-utility-kit.readthedocs.io/en/stable

ETL Utilities for an Open Data Program

This toolkit provides several utilities and a framework to help governments deploy automated ETLs using the open-source Pentaho Data Integration (Kettle) software.

Namely, this toolkit will allow:

The ETL framework is organized so that each function lives in a single file shared by all ETLs. A change made in that one file reaches every ETL that uses it, which makes maintenance, upgrades, and modifications easier across hundreds of ETLs.
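As a rough illustration of the shared-file idea, the Python sketch below builds the command line for running individual Kettle jobs with Kitchen, Kettle's command-line job runner. The `-file`, `-level`, and `-param:NAME=VALUE` options are standard Kitchen options; everything else (the directory layout, the job file names, the dataset IDs, and the parameter names `SHARED_DIR` and `DATASET_ID`) is hypothetical and not taken from this toolkit.

```python
from pathlib import PurePosixPath

# Hypothetical layout: shared steps live in one directory reused by every ETL.
# This path is an assumption for the sketch, not part of the toolkit.
SHARED_DIR = PurePosixPath("/opt/etl/shared")


def kitchen_command(job_file, dataset_id, kitchen="kitchen.sh", log_level="Basic"):
    """Build the argument list for running one Kettle job with Kitchen.

    -file, -level and -param:NAME=VALUE are standard Kitchen options; the
    parameter names SHARED_DIR and DATASET_ID are made up for this sketch.
    """
    return [
        kitchen,
        f"-file={job_file}",
        f"-level={log_level}",
        f"-param:SHARED_DIR={SHARED_DIR}",
        f"-param:DATASET_ID={dataset_id}",
    ]


if __name__ == "__main__":
    # Two different ETLs point at the same shared directory, so a fix made
    # there reaches both without editing either job file.
    jobs = [
        ("/opt/etl/jobs/permits.kjb", "abcd-1234"),  # hypothetical job and ID
        ("/opt/etl/jobs/crimes.kjb", "wxyz-5678"),   # hypothetical job and ID
    ]
    for job_file, dataset_id in jobs:
        print(" ".join(kitchen_command(job_file, dataset_id)))
```

The function only assembles the arguments and does not launch Kettle, so it can be inspected or logged before being handed to a scheduler or `subprocess.run`.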

Features

Requirements

The recommended configuration requires the following software:

Kettle Compatibility

This framework has only been tested with Kettle 4.3.0 and Kettle 4.4.0. It may be fully compatible with Kettle 5.x, but that has not been tested. If you would like to contribute, please see the issue page.

Errors / Bugs

Experiencing issues with the included files? Report them on our issue tracker.