BiologicalRecordsCentre / iRecord

Repository to store and track enhancements, issues and tasks regarding the iRecord website.
http://irecord.org.uk
2 stars 1 forks source link

Tracking user accounts when email addresses change across multiple systems #1110

Open johnvanbreda opened 3 years ago

johnvanbreda commented 3 years ago

Scenario description

  1. A user registers on system A with email address 1 which creates a warehouse user X.
  2. Later they register on system B but using their new email address 2 which creates a new warehouse user Y.
  3. The user then changes the email address on system A to their new email address 2. The result is 2 different accounts on 2 different systems, with the same email address. On the warehouse the person associated with user X still points to its original email address 1, but email address 2 is also stored in the user_identifiers table to track the change.
  4. Now the user registers on system C, with email address 2. At this point the warehouse cannot determine which account to link to as there are 2 possibilities.

Current solution During registration, if a single user account cannot be found to link to, the user is shown a list of sites they might be a user of with tickboxes to choose the ones they recall. If they choose a single website, then that website's user ID is returned and linked to the Drupal user account first created on system C. If they tick both websites, then the warehouse will delete user X, move it's records to user Y, and return user Y's ID to link to the Drupal user account.

This solution has the following problems:

Alternative solution 1 When registering on system C (step 4), replace the screen showing the list of sites and tickboxes for resolving multiple accounts with a new one. This shows a list of warehouse user accounts, each with history information of the email addresses, plus a list of sites that account is a member of. For example:

This screen has radio buttons alongside each user account so the user can only choose 1, not both. Then the user ID they choose can be linked to the new Drupal account on system C. No merging of accounts or records occurs so there is a lot less risk.

This solution has the following problems:

Alternative solution 2 During step 3, when the user changes the email address on system A to their new email address 2, the system detects the existing account with that email address. They are told that the account exists and if they proceed it will merge the 2 accounts. They are also told to log out and in again for any apps. If they proceed, then the warehouse user X created on system A has it's records for system A moved to user Y and user Y's ID is linked to the Drupal account on system A. Warehouse user X can then be deleted (only if not used on other websites).

Alternative solution 3 This solution can work alongside the original behaviour, or can be used in addition to another solution as it helps to tidy problems that have already happened. In this solution, if the user has a deleted account where the user ID is still linked to a Drupal account (due to a merge on another system as described in the current solution), then the system call the existing get_user_id code to find the correct updated user ID, and will merge any records associated with the deleted user with the correct user. The updated user ID is returned and embedded in the Drupal account.

This solution has the following problems:

Alternative solution 4 Proposed by @kazlauskis - remove the unique constraint on the people.email_address field and allow multiple warehouse users with the same email address (on different websites). Then provide a tool in the user's profile that allows you to link an account with iRecord (or perhaps link an account with another site). This would allow the user to manually control when their accounts are linked.

On either of the websites in the user account page there could be a button to link user account data to another website account. Clicking that button would open an iframe (or redirect) to the other website login page and if the authentication is successful then the warehouse could link the A and B accounts together using their warehouse unique random UIDs. Kind of OAuth when you authorize one site to access account on another.

Proposed way forward:

  1. Implement solution 3, as this will help clean up existing issues and so it will be needed anyway.
  2. As a lower priority task, implement solution 2 as this picks up and resolves the problem at source, rather than later.
  3. As an even lower priority task, change the merge screen which shows a list of possible websites, to show a list of users with email history and website details as described in solution 1. Other changes for solution 1 are not required though. This is low priority because it is just a usability improvement, and once solution 2 is implemented new problem cases should not be created (it will just be occurring where there are existing problems).

@kitenetter @andrewvanbreda @kazlauskis @burkmarr @JimBacon Sorry for the long issue description but I'd appreciate any thoughts before making changes.

JimBacon commented 3 years ago

This would be a great improvement.

The scenario, as your solutions make clear, is that in step 1, a user registers on systems [A1...An] with email address 1, not just a single site. That means with solution 2, when the user changes their address on system A1, user X is not deleted so that the user still has two sets of records, one linked to an old email address.

Should solution 2 actually move all records of user X to user Y and delete user X? This is only supportable with solution 3 in place to ensure subsequent records are not submitted to the deleted user.

Will the email address of a Drupal user be synced wholly, partially, or not at all with the email address of their Indicia user?

johnvanbreda commented 3 years ago

@JimBacon yes, I see what you mean, if there are multiple sites implied by the merge then they should all be merged to avoid duplicate accounts. But , the reason why I didn't go down this route was I felt that doing it one site at a time was clearer for the end user, especially if they've got multiple apps in use on different sites which all need to be signed out and in. E.g. it's obvious why you need to log out of the Fit Count app when you change your email address on the Fit Count website, but it wouldn't be obvious you needed to do this when you change your iRecord email address.

Do you mean when solution 3 kicks in, we should update the Drupal account's email address to the new one? That does sound cleaner, but I'm not sure it feels right to do this without user intervention

johnvanbreda commented 3 years ago

For completeness I've updated the OP with a 4th solution option.

JimBacon commented 3 years ago

Agreed, one site at a time is clearer. Would it be helpful to provide information saying "Merging records for system A1 with B1...Bn. Your records for systems A2...An remain separate and linked to email address 1"

With the Drupal email, I was neither clear how it currently works, nor what your intentions were. Definitely wasn't making a proposal! I just changed my iRecord email address which updated my Indicia User with a new identifier and set the email of my Indicia Person to the new email. If I log in to another Drupal system linked to the same Indicia User ID it won't update the email of that system will it? So there is currently a partial sync. Does it follow that I could have accounts on three different Drupal systems with 3 different emails all linked to the same Indicia User ID but I couldn't easily know which email the warehouse is going to send notifications to? Does that cause any problems?

kitenetter commented 3 years ago

If I'm understanding correctly, the issue that Jim highlights in the previous comment does cause problems. We have examples of verifiers on iRecord who originally registered under one email address, then changed the email in their Drupal account, but still get notifications being sent to the old address.

kazlauskis commented 3 years ago

I wasn't involved in the original discussions around notifications but my first thought is that the notifications logic should live on the websites rather than the warehouse. Websites are where the "user-facing" user accounts live. The notifications sent from the website would know the most up to date user email, in what language to send the notification and whether the user has any communication preferences set.

kitenetter commented 3 years ago

Thanks John for the descriptions of what is clearly a complex process.

I'm still trying to digest all the options, but here are a few comments:

We need to avoid any risk of users accidentally or deliberately taking over accounts, or gaining control over records, that are not theirs.

We need to avoid exposing email addresses to users unless we are sure that the email address in question is, or was previously, linked to that user.

Alternative solution 4 seems promising to me, in that it is flexible in allowing multiple accounts to be linked, while requiring authorisation to do so. I can't quite work out how much of solutions 1, 2, and 3 would be needed in addition to 4 if we went down that route.

I think we do need to provide tools for users to update their own email addresses, and have that feed through to their records and notifications, without us having to intervene, especially if that will resolve problems going forwards. However, for merging existing records or dealing with deleted user accounts I would be happy to have an admin-only tool that I could use, if that would be simpler than developing tools for all users. I'm hoping that there won't be too many such requests, and if the problem will reduce rather than increase over time I think it will be manageable via an admin account.

johnvanbreda commented 3 years ago

@kazlauskis the problem with putting functionality like this on the client side is it's much harder for us to keep it all up to date if it's repeated on hundreds of clients. Also, each of those clients will be repeating queries that could be combined into a single query centrally. The principle being that Drupal is really just a thin-client that handles data entry forms and rendering of report outputs, so all other logic is server-side where possible, where it is easier for us to keep up to date.

@kitenetter there are other reasons why solution 4 is promising, as it is more secure, but it will also be more work.

I think we should proceed with solution 3 though, because that will help to tidy existing problems. Then we can take a longer view on the best approach going forward.

johnvanbreda commented 3 years ago

@kitenetter shall I proceed with solution 3:

In this solution, if the user has a deleted account where the user ID is still linked to a Drupal account (due to a merge on another system as described in the current solution), then the system call the existing get_user_id code to find the correct updated user ID, and will merge any records associated with the deleted user with the correct user. The updated user ID is returned and embedded in the Drupal account.

kitenetter commented 3 years ago

I think so @johnvanbreda although it is not clear to me whether this is an automated process that runs in the background, or whether it is triggered by an admin, or whether there is any interaction with the user.

kitenetter commented 1 year ago

@johnvanbreda does the above solution 3 address (at least some of) the scenarios we were discussing today, or do we need a fresh look at the options for dealing with account merges?

johnvanbreda commented 1 year ago

"Alternative solution 3" sounds like the solution that @JimBacon was describing he has implemented recently. I think it should make the tidy-up operation after a merge easier and more reliable, but the initial merging still needs to be done either through a manual script, option 1 or 2, or the people tidier module.

JimBacon commented 1 year ago

I'd forgotten all about this accountof the issue and didn't refind it as it is not really an iRecord issue.

I raised and fixed https://github.com/Indicia-Team/drupal-8-modules-indicia_features/issues/5 following almost exactly the proposal for solution 3.

Whenever a user logs in to a Drupal site (including when you masquerade as a user) it checks to see if the indicia_user_id stored in Drupal links to a deleted user. If so, it behaves like it is a first login and links to the valid indicia_user_id associated with the email of the Drupal account. It is automatic and no intervention is required.

What I hadn't thought to do was to automatically merge any records made against the deleted account with the new one. The original merge does this and the fix was intended to stop recording against deleted user_ids. It would be a useful addition to repair records that have been made during the lifetime of this bug.

The fix is pushed to the master branch of the repository but has only been deployed to UKBMS so far.

JimBacon commented 1 year ago

"Alternative soulution 2" is the one I suggested in the meeting yesterday as it removes the need for admin intervention. We were saying that a validation of the user owning the new email address would be needed as part of the process. There is a massive Drupal issue, https://www.drupal.org/project/drupal/issues/85494, to add this to core which seems to have been in progress for 16 years.

Alternatively, there are modules for this, though none ideal as not released fully.

andrewvanbreda commented 1 year ago

Hi all, This conversation is a bit long for me to now join and get my head around, but I will chip in with one quick thought to remember with whatever solution you choose. If you are changing Drupal accounts to point at different Warehouse user accounts as appropriate, we'll need to remember to make sure the users_websites table is kept up to date. This is a table that has been neglected in the past, which has caused some issues for the anonymiser.

Sam-Amy commented 6 months ago

This issue has been highlighted again.

A user registered on the Panthon website, and having answeried the system generated questions about existing Indicia accounts, later found that their records had unbeknown to them been automatically moved between their two existing iRecord accounts.

Records moved to the iRecord account linked to the same Indicia ID as the new Pantheon account (ID260896) from a second iRecord account (ID 373602) which had a diffrent email address (and which they do still use). I'm not sure whether this required email verification, but presumably so? I'll add a seperate issue for any changes the user now wants once they have confimed.

Was going to suggest (as @johnvanbreda has via email) that this area of functionality could do with a review, and ask re: prioritisation @kitenetter, and I see a lot of thought has already gone into it.