SPI-Birds / pipelines

Pipelines for generating a standard data format for bird data

Quality check and pipeline code fixes #169

Closed StefanVriend closed 3 years ago

StefanVriend commented 3 years ago

We are in the process of getting feedback from the advisory council on the quality check procedure, report and protocol document. Before we send them the documents, I am fixing some bugs/issues in the quality check and pipeline code that the quality check procedure has revealed.

The finished pipelines of advisory council members are NIOO, UAN, WYT, MON and PFN. The quality check will be run on subsets of the pipeline outputs (approximately 5 years) so that the quality check reports are not terrifyingly large.
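As a rough illustration of the subsetting idea, here is a minimal Python sketch. It is not the project's actual code: the function name `subset_by_year`, the field name `BreedingSeason`, the example records and the year range are all assumptions made for the example.

```python
def subset_by_year(records, start_year, end_year, year_field="BreedingSeason"):
    """Keep only records whose year lies in [start_year, end_year] (inclusive).

    `records` is a list of dicts; `year_field` is an assumed column name.
    """
    return [r for r in records if start_year <= r[year_field] <= end_year]


# Hypothetical brood records, for illustration only.
brood_data = [
    {"BroodID": "B1", "BreedingSeason": 2003},
    {"BroodID": "B2", "BreedingSeason": 2005},
    {"BroodID": "B3", "BreedingSeason": 2009},
    {"BroodID": "B4", "BreedingSeason": 2012},
]

# Keep a five-year window so that the resulting report stays small.
subset = subset_by_year(brood_data, 2005, 2009)
```

The same filter would be applied to each table of a pipeline output before running the quality check on it.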

The quality check protocol document is here: https://github.com/SPI-Birds/documentation/blob/master/quality_check/SPI-Birds_quality-check-protocol_v1.0.pdf

StefanVriend commented 3 years ago

After some quality check and pipeline fixes, I've run the quality check on subsets (years: 2005-2015) of the datasets, resulting in relatively small reports:

*Note that we haven't yet fixed the issue in the NIOO pipeline (individuals in Individual_data that are missing from Capture_data). This subset seems to be unaffected (check I6 flags no missing records), but it might still be worth fixing before sending the documents to Marcel. @LiamDBailey - what do you think?
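For readers unfamiliar with this kind of check, the idea of finding individuals that appear in Individual_data but have no record in Capture_data can be sketched in Python as below. This is only an illustration under stated assumptions, not the quality check's actual implementation of check I6: the function name, the `IndvID` field name and the example rows are hypothetical.

```python
def individuals_missing_from_captures(individual_data, capture_data, id_field="IndvID"):
    """Return sorted IDs present in individual_data but absent from capture_data.

    Both arguments are lists of dicts; `id_field` is an assumed column name.
    """
    captured_ids = {row[id_field] for row in capture_data}
    individual_ids = {row[id_field] for row in individual_data}
    return sorted(individual_ids - captured_ids)


# Hypothetical rows, for illustration only.
individual_data = [{"IndvID": "I1"}, {"IndvID": "I2"}, {"IndvID": "I3"}]
capture_data = [
    {"IndvID": "I1", "CaptureDate": "2006-05-01"},
    {"IndvID": "I2", "CaptureDate": "2006-05-03"},
    {"IndvID": "I1", "CaptureDate": "2007-05-02"},
]

missing = individuals_missing_from_captures(individual_data, capture_data)
```

An empty result on a subset only shows that the subset is unaffected; records outside the selected years could still have the problem.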