harvard-lil / perma

Indelible links
420 stars 71 forks source link

Merge duplicative user accounts #3320

Closed rebeccacremona closed 1 year ago

rebeccacremona commented 1 year ago

This PR adds two invoke tasks: one for merging all duplicative user accounts that we believe can be safely merged, and one to reverse those changes.

They can be run from the command line, similar to:

Or, for more control, you can import the helper functions and run on particular users within the Django shell:

In [1]: from tasks.dev import merge_users_with_only_one_confirmed_account, initialize_csvs, unmerge_accounts

In [2]: from perma.models import LinkUser

In [3]: users = LinkUser.objects.filter(email__iexact='To_Be_Merged@example.COM')

In [4]: initialize_csvs()

In [5]: merge_users_with_only_one_confirmed_account(list(users))

In [6]: unmerge_accounts(106472)
Unmerged accounts [106470, 106471] from user 106472
[<LinkUser: to_be_merged@example.com>, <LinkUser: to_be_merged@example.COM>] from {<LinkUser: To_Be_Merged@example.COM>}

I have tested fairly thoroughly locally against a production-like database, and am pretty confident that the merging in fact works, and that the unmerging in fact reverses everything.

I would appreciate a second set of eyes on the logic of the functions that select which user in a given group is the right one to "keep"/be the target of the merge:

Do they in fact do what their docstrings claim? I think so! But, if I have a mistake, I think it's most likely to be there.

I plan to sanity check the (locally produced) logs of a few merged users from each category with @clare-stanton too.

I considered adding a dry-run arg to both tasks... but thought it probably wasn't necessary. Can add if it would make people feel better.

Closes:

codecov[bot] commented 1 year ago

Codecov Report

Merging #3320 (4ae518a) into develop (87e5722) will decrease coverage by 3.48%. The diff coverage is 1.07%.

@@             Coverage Diff             @@
##           develop    #3320      +/-   ##
===========================================
- Coverage    73.64%   70.17%   -3.48%     
===========================================
  Files           55       55              
  Lines         6641     6977     +336     
===========================================
+ Hits          4891     4896       +5     
- Misses        1750     2081     +331     
Impacted Files Coverage Δ
perma_web/tasks/dev.py 0.00% <0.00%> (ø)
perma_web/perma/models.py 86.20% <8.69%> (-3.17%) :arrow_down:

... and 2 files with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.