ONSdigital / dp-data-pipelines

Pipeline specific python scripts and tooling for automated website data ingress.
MIT License

implement data ingress v1, task 2. #70

Closed mikeAdamss closed 5 months ago

mikeAdamss commented 5 months ago

What

PR for https://github.com/ONSdigital/dp-data-pipelines/issues/36

Basically adds the logic to run the specified transform with the specified inputs. Also ran fmt and lint and made a few linter corrections.

How to review

Setup 1:

Set the 4 env vars from here: https://github.com/ONSdigital/dp-data-pipelines/blob/sandbox/dpypelines/pipeline/shared/notification.py to the webhook (see the Slack channel, or ask in Slack for a webhook). Literally export DE_SLACK_WEBHOOK=<the webhook url> etc. etc. in your terminal.
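If it helps, here is a minimal sketch for confirming the env vars are actually set before running anything. Only DE_SLACK_WEBHOOK is named in this description, so the list below is deliberately incomplete: the other three names have to be copied from notification.py. The helper name missing_env_vars is hypothetical and not part of dpypelines.

```python
import os

# Assumption: only DE_SLACK_WEBHOOK is named in this PR description.
# Add the other three names from notification.py by hand.
REQUIRED_ENV_VARS = ["DE_SLACK_WEBHOOK"]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]
```

Calling missing_env_vars(REQUIRED_ENV_VARS) in the same terminal session where you ran the exports should return an empty list once everything is set.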

Setup 2:

Create two files in a directory on your machine (the config below and, judging by its required_files pattern, a data.xml):

{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "$id": "https://raw.githubusercontent.com/ONSdigital/dp-data-pipelines/sandbox/schemas/dataset-ingress/config/v1.json",
    "required_files": [
        {
            "matches": "^data.xml$",
            "count": "1"
        }
    ],
    "supplementary_distributions": [
        {
            "matches": "^data.xml$",
            "count": "1"
        }
    ],
    "priority": "1",
    "pipeline": "dataset_ingress_v1",
    "options": {
        "transform_identifier": "sdmx.compact.v2.0.prototype.1"
    }
}
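To sanity check the directory before running the pipeline, something like the sketch below may help. It assumes "matches" is a regular expression applied to each file name and "count" is the exact number of matches expected; that reading comes from the config itself, not from the pipeline code. The function name check_required_files is hypothetical and not part of dpypelines.

```python
import re


def check_required_files(config, filenames):
    """Return a list of human-readable problems; an empty list means the check passed."""
    problems = []
    for rule in config.get("required_files", []):
        pattern = re.compile(rule["matches"])
        expected = int(rule["count"])  # counts are strings in the config
        found = sum(1 for name in filenames if pattern.search(name))
        if found != expected:
            problems.append(
                f"pattern {rule['matches']!r}: expected {expected}, found {found}"
            )
    return problems
```

You could pass it the parsed config and the result of os.listdir on your directory. One thing worth noting: the dot in ^data.xml$ is unescaped, so as a regular expression it matches any single character in that position, not just a literal dot.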

Then run this script:

from dpypelines.pipeline.dataset_ingress_v1 import dataset_ingress_v1

dataset_ingress_v1(<PATH TO THOSE FILES>)

That should work, as far as you getting an "everything worked" Slack notification.

Also, obviously, check my code for issues and errors.

Who can review

Anyone