simple-login / app

The SimpleLogin back-end and web app
https://simplelogin.io
GNU Affero General Public License v3.0

Email addresses with different capitalization are not applied to the same alias #1562

Open shawnli87 opened 1 year ago

shawnli87 commented 1 year ago

Bug report

Describe the bug When incoming mail has a sender address that matches an existing alias except for capitalization, a new alias is created.

Expected behavior As email addresses are not case sensitive, the existing alias should be used.

Screenshots [Image: Screen Shot 2023-02-03 at 19 43 32]

Environment (If applicable): N/A

Additional context None

jakob11git commented 1 year ago

As email addresses are not case sensitive

There is no rule inside any email standards that mandates this. Indeed, in RFC 5321, 2.4 it actually says:

SMTP implementations MUST take care to preserve the case of mailbox local-parts. In particular, for some hosts, the user "smith" is different from the user "Smith".

In general it is a good idea to canonicalise email addresses, but I don't think you should mess with the local-part.
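
A minimal sketch of that distinction, assuming a hypothetical helper named `canonicalise_domain_only` (it is not a SimpleLogin function): only the domain is lowercased, since domain names are case-insensitive, and the local-part is passed through exactly as received.

```python
def canonicalise_domain_only(address: str) -> str:
    """Lowercase the domain; keep the local-part exactly as received.

    Hypothetical helper for illustration only.
    """
    # rpartition splits on the LAST "@", so a local-part that itself
    # contains "@" (possible in quoted form) is left intact.
    local_part, at, domain = address.rpartition("@")
    if not at or not local_part or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return f"{local_part}@{domain.lower()}"
```

With this approach "Smith@Example.COM" and "smith@Example.COM" still map to two different addresses, which is the behaviour RFC 5321, 2.4 asks implementations to preserve.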

shawnli87 commented 1 year ago

As email addresses are not case sensitive

I probably should have avoided making a gross generalization here.

There is no rule inside any email standards that mandates this. Indeed, in RFC 5321, 2.4 it actually says:

SMTP implementations MUST take care to preserve the case of mailbox local-parts. In particular, for some hosts, the user "smith" is different from the user "Smith".

Also from RFC 5321, 2.4, the following sentences state:

In particular, for some hosts, the user "smith" is different from the user "Smith". However, exploiting the case sensitivity of mailbox local-parts impedes interoperability and is discouraged.

And in RFC 5321, 4.1.2, it says:

...a host that expects to receive mail SHOULD avoid defining mailboxes ... where the Local-part is case-sensitive.

As is suggested for interoperability, in practice the vast majority of mail servers do not distinguish mailboxes based on case.

The problem arises in two situations:

  1. A sender creates a reverse-alias for an outgoing message and the receiver replies with an address using a different case
  2. A user is communicating with a department within a company, and the representatives have their email clients programmed with different cases for the same address

Most modern email clients utilize a conversation mode to organize messages. When separate aliases are created, the conversations can be hard to follow.

The question is, should preserving a rarely used standard be prioritized over user friendliness?

As an aside, a solution to the problem could be giving the user an option to switch on/off local-part case-sensitivity as a default at the account level and per contact as well.
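
To make the idea concrete, here is a rough sketch of how such a switch might work. Everything in it is hypothetical (`CaseSettings`, `contact_key` and the field names are made up for illustration and do not exist in SimpleLogin): a per-contact setting overrides the account default, and the result decides whether the local-part is lowercased before two sender addresses are compared.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaseSettings:
    # Hypothetical account-level default: are local-parts case-sensitive?
    account_case_sensitive: bool = False
    # Hypothetical per-contact override; None means "inherit the account default".
    contact_case_sensitive: Optional[bool] = None

    def effective(self) -> bool:
        """Per-contact setting wins when it is set, otherwise use the account default."""
        if self.contact_case_sensitive is not None:
            return self.contact_case_sensitive
        return self.account_case_sensitive


def contact_key(address: str, settings: CaseSettings) -> str:
    """Return the key used to decide whether two sender addresses are the same contact."""
    local_part, _, domain = address.rpartition("@")
    if not settings.effective():
        local_part = local_part.lower()
    # The domain is always compared case-insensitively.
    return f"{local_part}@{domain.lower()}"
```

With the default settings, "Support@Company.com" and "support@company.com" produce the same key; with `contact_case_sensitive=True` they stay separate.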

shawnli87 commented 1 year ago

Also when creating a reverse-alias currently, the local-part is canonicalised to lower case automatically. There definitely should not be conflicting programming where local-part case sensitivity is respected in one aspect, but not in another.

jakob11git commented 1 year ago

Good points. Whatever SimpleLogin decides to do it should:

So either way there is room for improvement.

nguyenkims commented 1 year ago

This is actually intended as you can see in the commit https://github.com/simple-login/app/commit/fdfa286

shawnli87 commented 1 year ago

This is actually intended as you can see in the commit fdfa286

I contend that it is not quite working as intended as:

  1. It forces, rather than allows, incoming email addresses to be case-sensitive.
  2. Manually created contacts are forced to be lower case.

jakob11git commented 1 year ago

To really solve this issue comprehensively without breaking anything, as far as I can see it would require to be able to have different contact email addresses under a single reverse-alias (with one address being selected as the "main" address that would be used for outgoing mail).

Because the issue that you're describing with your MUA for sure happens because of different capitalization, but it would also happen if people that use an email server that behaves like Gmail set up firstlast@gmail.com on one computer and first.last@gmail.com on the other. Or to give a less likely but equally valid example: f.iR.s.tLaS.t+mail@gmail.com. If there's a requirement for having a setting in SimpleLogin that allows to correctly group them all together into one reverse-alias so your MUA behaves as expected, naive automatic canonicalisation doesn't cut it.
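
A small sketch of why lowercasing alone is not enough for those examples. The function names are hypothetical, and `gmail_style_key` hard-codes the Gmail behaviour of ignoring dots and anything after a "+" in the local-part, which is an assumption that only holds for servers that behave that way:

```python
def naive_key(address: str) -> str:
    """Naive canonicalisation: lowercase the whole address."""
    return address.lower()


def gmail_style_key(address: str) -> str:
    """Provider-specific canonicalisation (assumes Gmail-like rules)."""
    local_part, _, domain = address.lower().rpartition("@")
    # Drop the "+tag" suffix, then remove dots from the local-part.
    local_part = local_part.split("+", 1)[0].replace(".", "")
    return f"{local_part}@{domain}"


addresses = [
    "firstlast@gmail.com",
    "first.last@gmail.com",
    "f.iR.s.tLaS.t+mail@gmail.com",
]

naive_groups = {naive_key(a) for a in addresses}
gmail_groups = {gmail_style_key(a) for a in addresses}
```

Here `naive_groups` ends up with three distinct keys while `gmail_groups` contains only `firstlast@gmail.com`, so grouping these senders correctly would need per-provider knowledge or manual grouping rather than a single automatic rule.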

shawnli87 commented 1 year ago

To really solve this issue comprehensively without breaking anything, as far as I can see it would require to be able to have different contact email addresses under a single reverse-alias (with one address being selected as the "main" address that would be used for outgoing mail).

Because the issue that you're describing with your MUA for sure happens because of different capitalization, but it would also happen if people that use an email server that behaves like Gmail set up firstlast@gmail.com on one computer and first.last@gmail.com on the other. Or to give a less likely but equally valid example: f.iR.s.tLaS.t+mail@gmail.com. If there's a requirement for having a setting in SimpleLogin that allows to correctly group them all together into one reverse-alias so your MUA behaves as expected, naive automatic canonicalisation doesn't cut it.

Forgive me, as I am a self-taught programming hobbyist who just recently started pursuing a degree in computer science. However, it appears to me that naive automatic canonicalization was the behavior prior to commit https://github.com/simple-login/app/commit/fdfa286. This fix seems to have unintended complications and, in my humble opinion, ought to be revisited. To me, canonicalization seems the lesser of two evils, albeit not the best option.

I actually considered suggesting a reverse-alias grouping function, but as I am not yet proficient in Python or SMTP, I can’t quite determine the difficulty in implementation. Perhaps, if there are no changes in 6 months, I can revisit the issue.

nguyenkims commented 1 year ago

The reason https://github.com/simple-login/app/commit/fdfa286 was created is that some addresses are hard to read without the uppercase, for example CustomerSupport@company.com vs customersupport@company.com. A sender usually keeps the same format, so it isn't an issue for new senders. The case you mentioned is probably due to a sender that was created before that change; with the change, a different sender is now created.

Uppercase vs lowercase is a never-ending topic in the email world, and actual implementations can differ from the RFC ... We have picked the easy solution of supporting uppercase when we don't have control and canonicalising by default everywhere we can. In hindsight, a better solution might have been to add a new field called "canonicalised_address" to the Contact table, used to avoid contact duplication.
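
A rough sketch of what that could look like, using an in-memory stand-in rather than the real Contact model. The `Contact` dataclass and `ContactStore` class below are hypothetical and only illustrate the idea: the address is stored as received for display, while a lowercased copy is used for the duplicate check.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Contact:
    address: str                # as received, capitalization preserved
    canonicalised_address: str  # lowercased copy, used only for lookups


class ContactStore:
    """In-memory stand-in for a table with a unique canonicalised_address."""

    def __init__(self) -> None:
        self._by_canonical: Dict[str, Contact] = {}

    def get_or_create(self, address: str) -> Contact:
        key = address.lower()
        existing = self._by_canonical.get(key)
        if existing is not None:
            # Same sender with different capitalization: reuse the contact.
            return existing
        contact = Contact(address=address, canonicalised_address=key)
        self._by_canonical[key] = contact
        return contact
```

Looking up "customersupport@company.com" after "CustomerSupport@company.com" has been stored returns the existing contact, and its readable original form is kept in `address`.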