Add a static method to the EarthbeamDAG class that takes one or more CSVs and saves them as Parquet files partitioned by tenant_code and api_year (or equivalent columns).
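To make the intended layout concrete, here is a minimal, stdlib-only sketch of the grouping step. This is not the method added in this PR: the helper name `group_rows_by_partition` is hypothetical, and the sketch assumes the output follows the Hive-style `column=value` directory convention. It stops short of writing Parquet and only shows which rows would land in which partition directory.

```python
import csv
from collections import defaultdict
from pathlib import PurePosixPath


def group_rows_by_partition(csv_paths, partition_cols=("tenant_code", "api_year")):
    """Hypothetical helper: map each Hive-style partition directory to its rows."""
    partitions = defaultdict(list)
    for csv_path in csv_paths:
        with open(csv_path, newline="") as f:
            for row in csv.DictReader(f):
                # Build e.g. "tenant_code=tenant_a/api_year=2023" from the row's values.
                rel_dir = PurePosixPath(*(f"{col}={row[col]}" for col in partition_cols))
                # The partition values live in the path, so drop them from the row itself.
                payload = {k: v for k, v in row.items() if k not in partition_cols}
                partitions[str(rel_dir)].append(payload)
    return dict(partitions)
```

In a real implementation the grouping and writing would more likely be delegated to a library; for example, pandas' `DataFrame.to_parquet(path, partition_cols=[...])` produces this kind of directory layout. Whether the method in this PR does it that way is best confirmed from the diff and the test file.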
Testing
The easiest way to test this code is to run the included unit tests. This repo had no existing test framework, so I took the liberty of adding some pytest configuration. To run the tests, do the following:
In a virtual environment, run pip install -r requirements-dev.txt
This will likely fail because of the dependencies on the non-public packages edfi_api_client and ea_airflow_util. You will need to install these locally for the environment to be complete.
This is annoying locally, but would not be an issue in an automated environment -- if we ran the unit tests as part of a GitHub Action, we would have access to the necessary repos.
Running pytest is sufficient to run the entire suite. For more detail, you can run pytest -v
The test file serves as documentation for how the new function can be used.
Pertains to this ticket