lshtm-gis / WHO_PHSM_Cleaning

Cleaning PHSM provider data for WHO
https://lshtm-gis.github.io/WHO_PHSM_Cleaning/html/
MIT License

Problem ID summary tables #139

Closed hamishgibbs closed 3 years ago

hamishgibbs commented 3 years ago

We need to produce summary tables of IDs that are recorded in ID columns (previous_measure_number, following_measure_number) but are not found in the who_id field.

Most of these will be from the very early days of the dataset.

For records that are not included in the release data (i.e. records with who_code in ["10", "11", "12", "13"]), ID problems are less of an issue.

It is particularly important that we identify any ID problems in records that are being included in release data.

hamishgibbs commented 3 years ago

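This should be a relatively simple set difference: the IDs referenced in previous_measure_number and following_measure_number, minus the IDs present in who_id. We should make sure to retain the original who_id and who_code for any problem IDs.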
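
A rough sketch of that set difference, in plain Python rather than whatever the cleaning pipeline actually uses. The column names and the non-release codes come from this issue; the sample records, the "4.1.1" code, the `find_problem_ids` helper and the output field names are made up for illustration.

```python
# Columns that point at other records (names taken from this issue).
ID_COLUMNS = ("previous_measure_number", "following_measure_number")

# Codes that are excluded from the release data (from this issue).
NON_RELEASE_CODES = {"10", "11", "12", "13"}

# Hypothetical sample records; the values are invented.
records = [
    {"who_id": "A1", "who_code": "4.1.1",
     "previous_measure_number": None, "following_measure_number": "A2"},
    {"who_id": "A2", "who_code": "4.1.1",
     "previous_measure_number": "A1", "following_measure_number": "A9"},
    {"who_id": "A3", "who_code": "10",
     "previous_measure_number": "Z7", "following_measure_number": None},
]


def find_problem_ids(records):
    """Return one row per referenced ID that is missing from who_id.

    Each row keeps the who_id and who_code of the record that holds
    the bad reference, so the problem can be traced back.
    """
    known_ids = {r["who_id"] for r in records}
    problems = []
    for r in records:
        for col in ID_COLUMNS:
            ref = r.get(col)
            # Skip empty references; flag anything not found in who_id.
            if ref not in (None, "") and ref not in known_ids:
                problems.append({
                    "who_id": r["who_id"],
                    "who_code": r["who_code"],
                    "id_column": col,
                    "missing_id": ref,
                    "in_release": r["who_code"] not in NON_RELEASE_CODES,
                })
    return problems


problems = find_problem_ids(records)
for row in problems:
    print(row)
```

The `in_release` flag is only there so the summary table can be filtered to the records that matter most, i.e. those going into the release data.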

hamishgibbs commented 3 years ago

@todowede - I will get started on this one

hamishgibbs commented 3 years ago

From CG: There should be no values in previous_measure_number and following_measure_number for who_codes 10, 11, 12, 13. If there are, they can simply be removed.
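
For reference, removing those values could look roughly like the following. This is only a sketch in plain Python with invented sample records; `clear_links_for_non_release` is a hypothetical helper name, and it assumes who_code is stored as a string and that "removed" means setting the field to None.

```python
# Columns that point at other records (names taken from this issue).
ID_COLUMNS = ("previous_measure_number", "following_measure_number")

# Codes for which the ID columns should be empty (from this issue).
NON_RELEASE_CODES = {"10", "11", "12", "13"}


def clear_links_for_non_release(records):
    """Return copies of the records with ID columns blanked for codes 10-13."""
    cleaned = []
    for r in records:
        row = dict(r)  # copy, so the input records are left untouched
        if str(row["who_code"]) in NON_RELEASE_CODES:
            for col in ID_COLUMNS:
                row[col] = None
        cleaned.append(row)
    return cleaned


# Hypothetical sample records; the values are invented.
sample = [
    {"who_id": "B1", "who_code": "4.1.1",
     "previous_measure_number": None, "following_measure_number": "B2"},
    {"who_id": "B2", "who_code": "10",
     "previous_measure_number": "B1", "following_measure_number": None},
]

cleaned = clear_links_for_non_release(sample)
```

Records with other codes pass through unchanged, so this step can run before the problem ID summary is built.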

hamishgibbs commented 3 years ago

This is added in #141.