Summary
As an ETL Helper user, I want to be able to transform data passed to load or executemany so that I can do Transform-Load workflows within ETL Helper.
Description
The transform parameter on the iter_chunks-based functions takes a function that modifies all the rows in a chunk. It is convenient and part of the Extract-Transform-Load workflow. Transforms are currently applied only when data are extracted, so there is no option to apply a transform when loading data from elsewhere, e.g. a CSV file. This should be added.
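For illustration, a transform of this kind might look like the sketch below. It assumes rows arrive as dictionaries of strings (as they would from a CSV reader); the function name `clean_chunk` and the column names are hypothetical and not taken from ETL Helper.

```python
def clean_chunk(chunk):
    """Hypothetical transform: take a list of rows, return a new list of rows.

    Casts the id column to int and strips whitespace from the name column.
    """
    return [
        {"id": int(row["id"]), "name": row["name"].strip()}
        for row in chunk
    ]


# Example chunk, shaped like rows read from a CSV file (all values are strings)
rows = [{"id": "1", "name": " basalt "}, {"id": "2", "name": "granite"}]
cleaned = clean_chunk(rows)
```

The function receives the whole chunk rather than a single row, so it can also drop or add rows if needed.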
This can be implemented by checking for a transform function and calling it on each chunk, just before it is inserted, in the executemany function. The load function should be updated to take this parameter and pass it through to executemany.
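The approach described above could be sketched roughly as follows. This is not the ETL Helper implementation: `executemany_with_transform`, `chunked`, and `upper_names` are hypothetical names, the default chunk size is an arbitrary assumption, and the standard-library sqlite3 module stands in for whichever database driver is in use.

```python
import sqlite3
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    iterator = iter(rows)
    while chunk := list(islice(iterator, size)):
        yield chunk


def executemany_with_transform(query, conn, rows, transform=None, chunk_size=5000):
    """Hypothetical stand-in for executemany: insert rows chunk by chunk."""
    cursor = conn.cursor()
    for chunk in chunked(rows, chunk_size):
        # Apply the optional transform to the chunk just before insertion
        if transform:
            chunk = transform(chunk)
        cursor.executemany(query, chunk)
        conn.commit()


def upper_names(chunk):
    """Hypothetical transform: upper-case the name in each (id, name) row."""
    return [(row_id, name.upper()) for row_id, name in chunk]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rock (id INTEGER, name TEXT)")
executemany_with_transform(
    "INSERT INTO rock (id, name) VALUES (?, ?)",
    conn,
    [(1, "basalt"), (2, "granite"), (3, "schist")],
    transform=upper_names,
    chunk_size=2,
)
result = conn.execute("SELECT id, name FROM rock ORDER BY id").fetchall()
```

A load function would then only need to accept the same transform argument and pass it through unchanged.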
Acceptance criteria
[x] executemany takes a transform parameter
[x] load takes a transform parameter
[x] CSV loading example in README is updated to use transform
[x] executemany example in tests/etl/test_abort.py is updated to use transform