catalyst / moodle-auth_saml2

SAML done 100% in Moodle, fast, simple, secure
https://moodle.org/plugins/auth_saml2
70 stars 132 forks source link

Existing user account not used when user has multiple Moodle accounts with same email address #790

Open peterkeijsers opened 7 months ago

peterkeijsers commented 7 months ago

This fixes a bug that ocured when a users tried to login using SAML2. The conditions are:

The condition was count($records == 1) so a user was being found only when there was exactly 1 user account found. This would make the whole reset() call on the next line redundant because there would always be exactly 0 or 1 records...

So I think this needs to be if (count($records) > 0) { so every scenario where the sql query returns multiple users from the Moodle database, the first one will always be used.

danmarsden commented 7 months ago

I'm leaning towards objecting towards this change... If the system has 2 user accounts for the same user - arbitrarily "choosing" one to use is IMO the wrong approach.

Can you provide a bit more information here on the reason your site has 2 active accounts with the same email? - is it an accidental duplication (which IMO should be fixed manually rather than hiding it in auth_saml2) - or is there some other case for multiple accounts with the same email?

peterkeijsers commented 7 months ago

Hi Dan

Thank you for your quick response. I Agree that arbitrarily choosing the first account from the list of records is not ideal. This however was already the logic in your plugin (the line 88) but was never very relevant due to the condition always resulting in only 1 record.

Some context on our case: However in Moodle it is possible through the setting "allowaccountssameemail" to allow for multiple accounts with the same email address. In my case however. The users could (in rare cases) have an LTI account, a manual account and a SAML2 account. These are edge cases however but they are real. And also some users have email addresses with uppercase and lowercase. Depending on the setting "auth_saml2 | tolower" this could also result in possible duplicate accounts. The best solution would be to prevent double accounts. But in our case (and as a better fallback) I think it is better to arbitrarily choose the first available record than to not allow the user to login. Because the user then gets the message that "an account with this email address already excists.". So I would prefer to log the user in with one of the matched email addresses and parallel to that try to minimize and cleanup any duplicate records (manually) by the site admin.

I hope this provides some better insights on my thoughts leading to this PR.

Thanks for looking into this.

danmarsden commented 7 months ago

The way I read that existing code is - get all the user records that match, If we get more than 1 record - something went wrong and we can't match the user so don't let them login. - the reset call just simplifies the array returned into a single record so it can be used in the API call below it. There are probably other ways that the same code could be written but the end result is the same - if we get more than 1 user in the response, we haven't got a unique indentifier so don't log the user in.

My opinion is that existing code provides correct behavior and my vote is to decline this change. If we can't match a unique user with a unique identifier, the correct process should be to not let the user login.

But I'm happy for others to pitch in and disagree and try to convince me otherwise! :-)

brendanheywood commented 7 months ago

The users could (in rare cases) have an LTI account, a manual account and a SAML2 account. These are edge cases however but they are real. And also some users have email addresses with uppercase and lowercase. Depending on the setting "auth_saml2 | tolower" this could also result in possible duplicate accounts.

Before I read this issue I came up with this same scenario and I agree this is valid. What I think is the better middle ground is if the get_user function looks for users in 2 passes: 1) first it looks for an exact match of the user which also matches on the auth = saml2 - if it finds a single account then it uses it and away we go 2) if it can't find an account then it searches more widely for accounts where auth != saml2 AND anyauth is on

If anyauth is off then the second step would not run.

If the first step finds two saml2 users which match then that should still be an error.

So I think we need to move the 'anyauth' out from the saml_login_complete into the user_extractor::get_user function

danmarsden commented 7 months ago

Took me a bit to understand what Brendan was suggesting as an improvement there... (thanks Brendan)

If the user has 2 accounts - eg one account with the auth column in their record set to "saml2" and one account with the auth method set to "manual" and they both have the same email - it should choose the saml one and let the user login.

However - if the user has multiple accounts with the auth method set to "email" or "oauth" or something else - it should just recognise that there are multiple accounts that it can't match, preventing the user from being able to login.

So we should modify the code like he suggests.. First look to see if we can find one user account with auth = "saml2" that matches the email - if we find one record we use it. (if we find multiple saml2 accounts with that email we prevent login).

Then - after the initial search for auth_saml2 accounts, if "anyauth" is enabled, we use similar code to what's there right now - get all users that match the email, if we find one record, log them in, if we find more than one, don't let them login.

Does that make sense and would it meet your needs? - do you think you could modify the pull request to do something similar?

peterkeijsers commented 7 months ago

Hi @danmarsden and @brendanheywood

Thank you for all the thorough answers and critique. I agree that my current pull-request is not sufficient enough and could use some improvements. I made it too quick looking at it now. I think your last suggestion @danmarsden is a good one. That would solve my usecase and improves upon the login logic of this plugin.

I will create a new improved pull request with your suggestions. However due to my limited time the coming weeks it will take some time to complete and update this PR.