Anthony-Nolan / Atlas

A free & open-source Donor Search Algorithm Service
GNU General Public License v3.0

A full import should wipe existing donors #742

Closed benbelow closed 5 months ago

benbelow commented 2 years ago

IDs are globally unique in donor datasets received from WMDA. However, a donor may have a new ID when WMDA send a new full update (i.e. a new full donor dataset). This card covers investigating how to deal with this.

The currently proposed solution is this:

Importing the new donor data would require first wiping all relevant donors in the donor store:

If all WMDA donors are in one dataset, the whole donor store would need to be wiped.

If the WMDA donors are in registry-specific datasets, only donors belonging to those registries would need to be wiped.
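The proposal above could be sketched roughly as follows. This is a minimal in-memory illustration in Python, not Atlas code: `Donor`, `DonorStore`, and `full_import` are hypothetical names, and the real donor store is a database rather than a dictionary.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass(frozen=True)
class Donor:
    donor_id: str   # globally unique within one dataset, but may change between full updates
    registry: str   # hypothetical field naming the registry the donor belongs to


class DonorStore:
    """Hypothetical stand-in for the donor store, keyed by donor ID."""

    def __init__(self) -> None:
        self._donors: Dict[str, Donor] = {}

    def full_import(self, donors: Iterable[Donor],
                    registries: Optional[Set[str]] = None) -> None:
        """Wipe the relevant donors, then load the new dataset.

        registries=None means the dataset covers all donors, so everything is wiped.
        Otherwise only donors belonging to the named registries are wiped.
        """
        incoming = list(donors)
        if registries is None:
            self._donors.clear()
        else:
            self._donors = {
                donor_id: donor
                for donor_id, donor in self._donors.items()
                if donor.registry not in registries
            }
        for donor in incoming:
            self._donors[donor.donor_id] = donor

    def ids(self) -> Set[str]:
        return set(self._donors)
```

Because the old donors are removed before the new ones are loaded, a donor that comes back under a new ID does not leave its old record behind.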

WARNING: this card will need very careful implementation. With the current setup, wiping is only feasible if a full import arrives either as a single file or in batches that share an identifier Atlas can store and group by.

Anthony Nolan's installation of Atlas performs a full import in arbitrary batches of 10,000 donors, so without a major architectural redesign, this feature cannot be implemented without breaking AN's installation.
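To illustrate why batching matters, here is a rough Python sketch, assuming each batch carries a shared import identifier. The names `BatchedDonorStore`, `import_batch`, and `import_id` are hypothetical and not part of Atlas. If the store wiped on every batch, each batch would delete the donors loaded by the previous one. Wiping only when a new identifier first appears avoids that.

```python
from typing import Dict, List, Optional


class BatchedDonorStore:
    """Hypothetical store that wipes once per full import, not once per batch."""

    def __init__(self) -> None:
        # donor_id -> import_id of the full import that wrote the donor
        self.donors: Dict[str, str] = {}
        self.current_import: Optional[str] = None

    def import_batch(self, import_id: str, donor_ids: List[str]) -> None:
        # The first batch of a new full import triggers the wipe.
        # Later batches with the same identifier only add donors.
        if import_id != self.current_import:
            self.donors.clear()
            self.current_import = import_id
        for donor_id in donor_ids:
            self.donors[donor_id] = import_id


store = BatchedDonorStore()
store.import_batch("full-1", ["d1", "d2"])
store.import_batch("full-1", ["d3"])   # same import: nothing is wiped
store.import_batch("full-2", ["d9"])   # new import: earlier donors are wiped
```

This sketch relies on the batches being tagged with an identifier at all. Arbitrary untagged batches, as described above, give the store no way to tell a new full import from the next batch of the current one.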

Anyone picking up this issue should work closely with both AN and WMDA to ensure the solution meets the needs of both known installations of Atlas.

zabeen commented 5 months ago

So far there has been no need to perform a full import once an installation has been set up. New donor checker functionality has been added since this ticket was raised. It allows an external donor service to check that Atlas has the correct donors and, if not, to send a diff file to correct them.
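As a rough picture of the idea only, the check-and-diff approach can be thought of as a set comparison. The Python sketch below uses a hypothetical `donor_diff` function and compares donor IDs only. It does not reflect how the actual donor checker or its diff file format works.

```python
from typing import Dict, Set


def donor_diff(expected: Set[str], in_atlas: Set[str]) -> Dict[str, Set[str]]:
    """Return the donor IDs to add to, and remove from, Atlas to match the source."""
    return {
        "to_add": expected - in_atlas,      # known to the donor service, missing from Atlas
        "to_remove": in_atlas - expected,   # in Atlas, no longer known to the donor service
    }
```

Applying such a diff corrects stale or missing donors without wiping the store, which is why a full re-import has not been needed.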