Use Pentaho's open source data integration tool (Kettle) to create Extract-Transform-Load (ETL) processes to update a Socrata open data portal. Documentation is available at http://open-data-etl-utility-kit.readthedocs.io/en/stable
For some of our updates that are still manual, we have been editing the dataset description or title to include date information. I have been thinking about ways this could be integrated into the ETL workflow.
One possibility: DataSync has a metadata job option. This could be integrated into the workflow to run before or after the DataSync replace job. The downside is that DataSync's metadata job needs to create a working copy of the dataset, which is unnecessary overhead.
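Whichever mechanism ends up pushing the metadata, the workflow first needs to build the date-stamped text. Below is a rough sketch of that step only, in Python. The function name `stamp_description` and the "Data last updated:" marker are made up for illustration and are not part of the toolkit or DataSync. How the resulting string gets sent to the portal is deliberately left out.

```python
from datetime import date


def stamp_description(description: str, updated: date,
                      marker: str = "Data last updated:") -> str:
    """Return the description with exactly one trailing date stamp line.

    Hypothetical helper: the marker text and the layout are assumptions,
    not a Socrata or DataSync convention.
    """
    # Drop any stamp left by a previous run so stamps do not pile up.
    kept = [line for line in description.splitlines()
            if not line.startswith(marker)]
    body = "\n".join(kept).rstrip()
    stamp = f"{marker} {updated.isoformat()}"
    # Separate the stamp from the body with a blank line when there is a body.
    return f"{body}\n\n{stamp}" if body else stamp


if __name__ == "__main__":
    print(stamp_description("Permits issued by the city.", date.today()))
```

Because the old stamp is stripped before the new one is added, the step can run on every ETL cycle without the description growing.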