ouhft / COPE

Project Repository for Work Package 4 of the COPE Transplant Trial
https://cope.nds.ox.ac.uk

Duplicate Staff Profiles #180

Closed marshalc closed 7 years ago

marshalc commented 7 years ago

There are duplicate Staff Profiles in the live data, and it's not yet clear how they came into being. In each case the duplicates appear to hold slightly different levels of detail.

Identified so far (name and SP-IDs)

Some of these accounts are technicians with user accounts too, so there are extra questions as to why those users are duplicated because that can only have been done by either @AllyBradley or @mertenssarah.

Also possibly:

Additionally, we have a number of entries with questionable data in them (e.g. "Dr." as first name; "?" as first name or surname; or one of my favourites: "OK Assistent" & "dr unknown").

For most of these, the importance of careful and thorough data entry needs to be emphasised, because this is human error (the exception may be Annelies Degrauwe).

I'm going to put investigating this data mess on the back burner for now. It isn't a priority for me presently, and I need more information about how this is occurring so often (which means assembling the audit log for each person and the cases they've been linked to).

marshalc commented 7 years ago

To help prevent this from recurring, two changes have been made to the Staff Person model:

i.e. there cannot be any duplicate records with matching names or email addresses.
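
For anyone reading along, here is a rough illustration of how uniqueness rules of this kind behave. It is only a sketch, not the project's model code: it uses Python's built-in `sqlite3`, and the table name, column names, and the `add_person` helper are all assumptions made up for the example.

```python
import sqlite3


def create_staff_table(conn):
    """Create a hypothetical staff table with the two uniqueness rules."""
    conn.execute(
        """
        CREATE TABLE staff_person (
            id INTEGER PRIMARY KEY,
            first_name TEXT NOT NULL,
            last_name TEXT NOT NULL,
            email TEXT,
            UNIQUE (first_name, last_name),  -- no two records with the same name
            UNIQUE (email)                   -- no two records with the same email
        )
        """
    )


def add_person(conn, first_name, last_name, email):
    """Insert a record; return False if a uniqueness rule rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO staff_person (first_name, last_name, email) "
                "VALUES (?, ?, ?)",
                (first_name, last_name, email),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Note that in SQLite a `UNIQUE` column still accepts several `NULL` values, so records without an email address do not collide with each other in this sketch.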

marshalc commented 7 years ago

Staff Person Migration 3 will now fail if there are existing records with duplicate values, so the live, test, and development databases all need cleaning up before this release can be activated.
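
One way to see which records would block the migration is to group the existing rows by name and by email before running it. The following is a minimal sketch under stated assumptions: records are plain dicts with hypothetical `id`, `first_name`, `last_name`, and `email` keys, and `find_duplicates` is an illustrative helper, not something from the COPE codebase.

```python
from collections import defaultdict


def normalise(value):
    """Collapse whitespace and ignore case so near-identical entries match."""
    return " ".join((value or "").split()).casefold()


def find_duplicates(records):
    """Return ids of records that share a name or an email address."""
    by_name = defaultdict(list)
    by_email = defaultdict(list)
    for rec in records:
        name_key = (normalise(rec.get("first_name")),
                    normalise(rec.get("last_name")))
        by_name[name_key].append(rec["id"])
        email_key = normalise(rec.get("email"))
        if email_key:  # blank emails are not treated as duplicates
            by_email[email_key].append(rec["id"])
    return {
        "name": {k: sorted(v) for k, v in by_name.items() if len(v) > 1},
        "email": {k: sorted(v) for k, v in by_email.items() if len(v) > 1},
    }
```

Because the comparison ignores case and stray whitespace, this flags slightly more than a strict database constraint would, which seems the safer direction when cleaning up by hand.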

marshalc commented 7 years ago

Staff Person is referenced by:

Need to check all these fields for each of the above people, and then settle on a single instance...

marshalc commented 7 years ago

Note to self: Make the changes to the records via the Admin interface so that a full audit trace is kept for the future!

marshalc commented 7 years ago

Order of editing:

marshalc commented 7 years ago

Of course, because of #182, you can't select the new record unless the job title is in place. Doh.

Will ponder the workaround for this over the weekend.

marshalc commented 7 years ago

So we have two non-negotiable points:

  1. Need to clean the data to remove bad results / duplicates
  2. Need to clean the data in an auditable fashion

We additionally want to stop more of point 1 occurring by putting rules and processes in place that limit or prevent it. Before we can do that, though, the data has to be clean, because otherwise those limits (and existing ones, such as the filter for job role on person select) have to be put aside.

This means we'll need to do a pre-0.7.0 release of a specific codebase that removes those restrictions, during which time we can clean the data, before then releasing the full 0.7.0 codebase.

To that end (and because I need clean data to develop against too), I'll do a 0.6.5 branch with these restrictions removed, to be used as a temporary release. Because this will be such a short-lived release (measurable in hours and minutes, I hope), there is no value in adding other features to this release cycle.

marshalc commented 7 years ago

Query for @mertenssarah and @aukjebrat: Are Annelies Degrauwe and Annelies Devrouwe different people? They are registered with the same location and the same telephone number. And if not, which is the correct spelling?

mertenssarah commented 7 years ago

@marshalc Which hospital do they belong to and what is their role? Only then can I check which name is correct (or whether both names are correct).

marshalc commented 7 years ago

Building tools to allow profiles to be selected and merged in the Administration. {merge_staff_person}
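
A minimal sketch of what merging two profiles might involve, assuming the kept record has its blank fields filled from the duplicate and anything pointing at the duplicate is repointed to the kept record. The `merge_profiles` function and the dict shapes (`id`, `staff_id`) are hypothetical and are not the actual admin tool.

```python
def merge_profiles(keep, duplicate, references):
    """Merge `duplicate` into `keep` and repoint references.

    keep, duplicate: dicts of profile fields, each with an "id".
    references: list of dicts, each with a "staff_id" pointing at a profile.
    Returns (changes, repointed) so the caller can record an audit note.
    """
    changes = []
    for field, value in duplicate.items():
        if field == "id":
            continue
        # Only fill gaps; never overwrite data already on the kept record.
        if not keep.get(field) and value:
            keep[field] = value
            changes.append((field, value))

    repointed = 0
    for ref in references:
        if ref["staff_id"] == duplicate["id"]:
            ref["staff_id"] = keep["id"]
            repointed += 1
    return changes, repointed
```

Returning the list of changes rather than applying them silently is meant to fit the requirement that the clean-up stays auditable. Conflicting values (both records filled in, but differently) are left alone here and would still need a human decision.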

marshalc commented 7 years ago

The data is now being processed manually, and this issue is superseded by work on #208, which replaces StaffPerson completely.