De-duplicating / merging issues

fredhersch commented 3 years ago

Describe the issue to be researched

How should we handle the scenario in which duplicate records are identified and merged on the server? The most likely scenario is:

A patient is registered on different devices, with data added on each, in effect creating duplicate patient records. These may have different identifiers
A server-side process identifies a duplication, and a manual process is initiated to merge the records
As a result, a single patient resource remains, with the data from each duplicate now associated with that patient
The data is then sync'd back to the devices
Does this potentially create an issue?

Describe the goal of the research

Describe the use cases
Identify specific requirements or policy
Align on priority for any requirements

Describe the methodology

We will get input from the community of implementers, who are best placed to help inform this. If there is a need for this functionality, we will then determine the priority.

fredhersch commented 3 years ago

@pld Can you kick this off by describing the issue and how you have handled it? Ty

pld commented 3 years ago

The scenario described matches how we approach this issue in current OpenSRP deployments. If we identify multiple patient data records that are duplicates, we decide one is the "true" patient and update the references on all the data attached to the "false" patients so that they point to the true patient. We then deactivate the false patients, and client devices see new data on the server to sync. When clients resync, the false patients are removed (or at least no longer visible) and the true patient is present with all related data.
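The re-pointing step described above can be sketched roughly as follows. This is only an illustration, not OpenSRP's actual code: it assumes resources are plain FHIR-style JSON dicts, and the names `repoint` and `merge_patients` are hypothetical.

```python
from copy import deepcopy


def repoint(node, mapping):
    """Recursively rewrite {"reference": "Patient/x"} values in place."""
    if isinstance(node, dict):
        ref = node.get("reference")
        if isinstance(ref, str) and ref in mapping:
            node["reference"] = mapping[ref]
        for value in node.values():
            repoint(value, mapping)
    elif isinstance(node, list):
        for item in node:
            repoint(item, mapping)


def merge_patients(resources, true_id, false_ids):
    """Return a copy of `resources` in which data attached to the "false"
    patients points at the "true" patient and the false patients are
    deactivated. The input list is left untouched."""
    mapping = {f"Patient/{fid}": f"Patient/{true_id}" for fid in false_ids}
    merged = deepcopy(resources)
    for res in merged:
        if res.get("resourceType") == "Patient" and res.get("id") in false_ids:
            res["active"] = False  # deactivate rather than delete
        else:
            repoint(res, mapping)
    return merged


# Hypothetical example data: patient "b" is a duplicate of patient "a".
resources = [
    {"resourceType": "Patient", "id": "a", "active": True},
    {"resourceType": "Patient", "id": "b", "active": True},
    {"resourceType": "Observation", "id": "o1",
     "subject": {"reference": "Patient/b"}},
]
merged = merge_patients(resources, "a", {"b"})
```

Deactivating instead of deleting keeps the false patient on the server, so a client that still holds it locally can see the change on its next sync.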

We do not handle this automatically, but we have done some research on how we could, and we do a reasonable amount of post-collection data analysis where we and partners merge and/or remove duplicates automatically before doing the analysis.

@rkodev pointed out that HAPI has a module called MDM that sounds very similar to the approach OpenSRP has taken so far and the approach we discussed:

links indicate the fact that different FHIR resources are known or believed to refer to the same actual (real world) resource.
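To make the idea of links concrete, here is a minimal sketch of how a client might follow links from a deactivated duplicate to the surviving patient. It does not use HAPI MDM's API or storage model. Instead it assumes the merge was recorded with the standard FHIR `Patient.link` element using the `replaced-by` link type, and `resolve_surviving_id` is a hypothetical helper name.

```python
def resolve_surviving_id(patients_by_id, patient_id):
    """Follow Patient.link entries of type "replaced-by" until reaching a
    patient that has not been replaced, and return that patient's id."""
    seen = set()
    current = patient_id
    while current not in seen:
        seen.add(current)
        patient = patients_by_id.get(current)
        if patient is None:
            return current  # unknown locally; nothing more to follow
        target = next(
            (
                link["other"]["reference"].split("/", 1)[1]
                for link in patient.get("link", [])
                if link.get("type") == "replaced-by"
                and link.get("other", {}).get("reference", "").startswith("Patient/")
            ),
            None,
        )
        if target is None:
            return current  # no replaced-by link: this is the survivor
        current = target
    raise ValueError(f"cycle in replaced-by links starting at {patient_id}")


# Hypothetical example data: "b" was merged into "a".
patients = {
    "a": {"resourceType": "Patient", "id": "a", "active": True},
    "b": {
        "resourceType": "Patient",
        "id": "b",
        "active": False,
        "link": [{"other": {"reference": "Patient/a"}, "type": "replaced-by"}],
    },
}
survivor = resolve_surviving_id(patients, "b")
```

The cycle check is there because links are data like any other and could be inconsistent after several merges.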

CC @dubdabasoduba

google / android-fhir

De-duplicating / merging issues #483