BritishGeologicalSurvey / etlhelper

ETL Helper is a Python ETL library to simplify data transfer into and out of databases.
https://britishgeologicalsurvey.github.io/etlhelper/
GNU Lesser General Public License v3.0

Add transform option to load() and executemany() #139

Closed by volcan01010 1 year ago

volcan01010 commented 1 year ago

Summary

As an ETLHelper user, I want to be able to transform data passed to load or executemany so that I can do Transform-Load workflows within ETL Helper.

Description

The transform parameter on the iter_chunks-based functions takes a function that modifies all the rows in a chunk. It is convenient and is part of the Extract-Transform-Load workflow. However, transforms are only applied when data are extracted, so there is currently no way to apply one when loading data from elsewhere, e.g. a CSV file. This option should be added.

This can be implemented by checking for a transform function and calling it on each chunk, just before it is inserted, in the executemany function. The load function should be updated to take this parameter and pass it through to executemany.
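The approach described above could look roughly like the sketch below. It is an illustration only, written against the standard-library sqlite3 module rather than ETL Helper itself: the names `chunked`, `executemany_sketch`, `load_sketch` and `clean_rows`, the explicit `columns` argument and the default chunk size are all assumptions made for this example, not the library's actual code or signatures.

```python
import sqlite3
from itertools import islice


def chunked(rows, chunk_size):
    """Yield successive lists of at most chunk_size rows from any iterable."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def executemany_sketch(query, conn, rows, transform=None, chunk_size=5000):
    """Insert rows chunk by chunk (hypothetical stand-in for executemany).

    If a transform is given, it is called on each chunk just before
    that chunk is inserted. Returns the number of rows inserted.
    """
    processed = 0
    cursor = conn.cursor()
    for chunk in chunked(rows, chunk_size):
        if transform is not None:
            # The transform receives and returns a whole chunk of rows.
            chunk = list(transform(chunk))
        cursor.executemany(query, chunk)
        processed += len(chunk)
    conn.commit()
    return processed


def load_sketch(table, columns, conn, rows, transform=None, chunk_size=5000):
    """Build an INSERT query and pass transform through (hypothetical load).

    The table and column names are interpolated without sanitising,
    which is acceptable for a sketch but not for untrusted input.
    """
    placeholders = ", ".join(f":{col}" for col in columns)
    query = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    return executemany_sketch(query, conn, rows,
                              transform=transform, chunk_size=chunk_size)


def clean_rows(chunk):
    """Example transform: tidy string values such as those read from a CSV."""
    return [{"id": int(row["id"]), "name": row["name"].strip()} for row in chunk]


# Usage: rows shaped like the output of csv.DictReader (all values are strings).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rock (id INTEGER, name TEXT)")
csv_like_rows = [{"id": "1", "name": " granite "}, {"id": "2", "name": "basalt"}]
processed = load_sketch("rock", ["id", "name"], conn, csv_like_rows,
                        transform=clean_rows, chunk_size=1)
```

Applying the transform per chunk, rather than per row or to the whole input, keeps the behaviour consistent with the chunk-based transforms used on extraction and lets the input remain a lazy iterable.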

Acceptance criteria

volcan01010 commented 1 year ago

This has been done and merged into for_v1.