NickPapONS closed 5 months ago
Pushed some changes as this one was a little snaggled. Two main issues going on.
1.) We were catching errors a little early, which was obfuscating some pathing issues.
2.) We updated the pipeline for a new config shape a few PRs ago, so the fixture config needed updating to reflect that. For context, a valid config now looks like this:
{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "$id": "https://raw.githubusercontent.com/ONSdigital/dp-data-pipelines/sandbox/schemas/dataset-ingress/config/v1.json",
    "required_files": [
        {
            "matches": "^data.sdmx$",
            "count": "1"
        }
    ],
    "supplementary_distributions": [
        {
            "matches": "^data.sdmx$",
            "count": "1"
        }
    ],
    "priority": "1",
    "pipeline": "dataset_ingress_v1",
    "options": {
        "transform_identifier": "sdmx.compact.v2.0.prototype.1"
    }
}
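As a rough illustration of how a config of this shape might be checked against a set of file names, here is a minimal Python sketch. It assumes that "matches" is a regular expression applied to each file name and that "count" is the exact number of matching files expected; neither is confirmed in this PR. The helper name check_required_files is hypothetical and is not part of the pipeline.

```python
import re
from typing import List


def check_required_files(config: dict, filenames: List[str]) -> List[str]:
    """Return a list of problems; an empty list means the files satisfy the config.

    Assumption: "matches" is a regex and "count" is the exact number of
    file names that must match it. This is an illustrative guess only.
    """
    problems = []
    for rule in config.get("required_files", []):
        pattern = re.compile(rule["matches"])
        expected = int(rule["count"])  # counts are strings in the config above
        found = sum(1 for name in filenames if pattern.search(name))
        if found != expected:
            problems.append(
                f"{rule['matches']}: expected {expected}, found {found}"
            )
    return problems
```

Note that the unescaped dot in "^data.sdmx$" matches any character, so under these assumptions a name such as "dataXsdmx" would also count as a match.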
I also added a "no exceptions" and "an exception" step, mainly to help check the above was working, plus a few minor code changes.
What
Adds some functionality to create basic data fixtures (for now containing random dummy data) that can be extracted from an archive and used for acceptance tests. If the fixture data doesn't already exist, it is extracted from a zip file; the files requested by the test's context are then placed in a temporary directory, which is used for the remainder of the test.
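The staging half of that flow could look roughly like the Python sketch below. It is not the code in data.py; the function name stage_files and the flat file layout are assumptions made for illustration. It raises as soon as a requested file is missing and includes the full path in the message, so pathing problems stay visible.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Iterable


def stage_files(fixture_dir: Path, requested: Iterable[str]) -> Path:
    """Copy the requested fixture files into a fresh temporary directory.

    Hypothetical helper: assumes the fixtures sit directly inside
    fixture_dir (no sub-directories) and are requested by file name.
    """
    temp_dir = Path(tempfile.mkdtemp(prefix="fixtures_"))
    for name in requested:
        source = fixture_dir / name
        if not source.is_file():
            # Report the full path so a wrong directory is easy to spot.
            raise FileNotFoundError(f"Fixture file not found: {source}")
        shutil.copy2(source, temp_dir / name)
    return temp_dir
```

The caller is responsible for removing the temporary directory once the test has finished, for example with shutil.rmtree.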
How to review
Check the before_all functionality that ensures the directory with the fixture data inside exists and see if it makes sense. Check data.py and see if the code structure makes sense and fulfills its purpose. There's a lot of iterating over and matching things in a dictionary going on, which can get a bit complicated. Let me know if you feel it needs some edge case coverage or error catching somewhere.
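For reviewers less familiar with the hook, a before_all that ensures the fixture directory exists might look something like this Python sketch. The paths, the extra keyword arguments, and the context.fixture_dir attribute are all assumptions for illustration, not the code under review. behave itself calls before_all(context) with a single argument, so the defaults apply in normal use.

```python
import zipfile
from pathlib import Path
from types import SimpleNamespace  # only used to stand in for a context

# Hypothetical locations; the real paths in this repo may differ.
FIXTURE_ARCHIVE = Path("features/fixtures.zip")
FIXTURE_DIR = Path("features/fixture_data")


def before_all(context, archive=FIXTURE_ARCHIVE, fixture_dir=FIXTURE_DIR):
    """Make sure the fixture directory exists before any scenario runs."""
    if not fixture_dir.is_dir():
        if not archive.is_file():
            # Surface the path rather than swallowing the error.
            raise FileNotFoundError(f"Fixture archive not found: {archive}")
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(fixture_dir)  # creates fixture_dir if needed
    # Store the location so later steps can find the fixtures.
    context.fixture_dir = fixture_dir
```

Outside of behave, the hook can be exercised with a stand-in context such as SimpleNamespace() and explicit archive and fixture_dir arguments.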
Who can review
Anyone