ONSdigital / dp-data-pipelines

Pipeline specific python scripts and tooling for automated website data ingress.
MIT License

validation and testing for sdmx 2.1 transform #130

Closed by osamede20 2 months ago

osamede20 commented 2 months ago

What

  1. Created transform validation functions and unit tests for SDMX 2.1.
  2. Integrated the validation functions into the SDMX 2.1 transform to check that the transform script behaves as intended and that the output tidy CSV is correct.
  3. Implemented data quality metrics that monitor for anomalies, outliers, missing values, and unexpected duplicates, to protect the integrity and accuracy of the data.
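As a rough illustration of the kind of data quality check described in point 3, here is a minimal sketch that counts missing values and duplicate rows in a tidy table. It is not the code in this PR: the function name `quality_report`, the column names, and the sample rows are all assumptions made for the example, and it does not attempt outlier or anomaly detection.

```python
from collections import Counter


def quality_report(rows, required_columns):
    """Summarise missing values and duplicate rows in a tidy table.

    rows: list of dicts, one per observation (e.g. from csv.DictReader).
    required_columns: columns that must be present and non-empty.
    Hypothetical helper for illustration only.
    """
    missing = Counter()
    for row in rows:
        for column in required_columns:
            value = row.get(column)
            if value is None or str(value).strip() == "":
                missing[column] += 1

    # A row counts as a duplicate when every column/value pair
    # matches a row that has already been seen.
    seen = Counter(tuple(sorted(row.items())) for row in rows)
    duplicate_rows = sum(count - 1 for count in seen.values() if count > 1)

    return {
        "row_count": len(rows),
        "missing": dict(missing),
        "duplicate_rows": duplicate_rows,
    }


# Illustrative rows; the column names are assumed, not taken from the transform.
rows = [
    {"TIME_PERIOD": "2020", "GEO": "K02000001", "OBS_VALUE": "1.5"},
    {"TIME_PERIOD": "2020", "GEO": "K02000001", "OBS_VALUE": "1.5"},
    {"TIME_PERIOD": "2021", "GEO": "K02000001", "OBS_VALUE": ""},
]
report = quality_report(rows, ["TIME_PERIOD", "GEO", "OBS_VALUE"])
print(report)
```

A report like this could be logged, or turned into a hard failure when a count exceeds an agreed threshold. Which of those is appropriate depends on the pipeline.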

How to review

  1. Check that the validations make sense and work correctly.
  2. Check that the unit tests pass by running make test.
  3. Optionally, try the SDMX 2.1 transform on an SDMX 2.1 XML file.

Who can review

Any team member can review

osamede20 commented 2 months ago

Yes, there should be a test that checks that obs_dicts (which is actually a list in the SDMX 2.1 transform) has all the keys in the series block. Yes, unit tests for the functions would be good. It would take some time to work out the logic for getting the keys from the series_dicts in the raw SDMX 2.1 file.
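For reference, the check itself could look something like the sketch below once the series keys are available. It assumes the keys have already been extracted from the series block into a plain list and that each entry in obs_dicts is a dict. The function name `find_missing_series_keys` and the sample keys are hypothetical, and the harder part (extracting the keys from the raw file) is not covered here.

```python
def find_missing_series_keys(series_keys, obs_dicts):
    """Return {observation index: sorted missing keys}.

    series_keys: keys expected from the series block (assumed already extracted).
    obs_dicts: list of observation dicts produced by the transform.
    Hypothetical helper for illustration only.
    """
    expected = set(series_keys)
    problems = {}
    for index, obs in enumerate(obs_dicts):
        missing = expected - set(obs)
        if missing:
            problems[index] = sorted(missing)
    return problems


# Illustrative data; the key names are assumed for the example.
series_keys = ["FREQ", "REF_AREA", "MEASURE"]
obs_dicts = [
    {"FREQ": "A", "REF_AREA": "UK", "MEASURE": "GDP", "OBS_VALUE": "1.2"},
    {"FREQ": "A", "REF_AREA": "UK", "OBS_VALUE": "1.3"},
]
problems = find_missing_series_keys(series_keys, obs_dicts)
print(problems)
```

An empty result means every observation carries all the series keys, so a unit test could simply assert that the returned dict is empty for a known-good input.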

To get this ticket over the line for this sprint, perhaps we could review this as it stands and sort out the suggested tests in the next sprint. Thanks.