bbottema / simple-java-mail

Simple API, Complex Emails (Jakarta Mail smtp wrapper)
http://www.simplejavamail.org
Apache License 2.0

IDN - Internationalized domain name for email address (ASCII conversion) #463

Closed JamesBoon closed 1 year ago

JamesBoon commented 1 year ago

Hi, thank you for this great library!

I must send emails to recipients with UTF-8 characters as part of the domain name. E.g. "you@exämple.com". As I am using Postfix to send my mails, which does no automatic conversion to the ASCII (xn--mumble) form, I thought maybe SimpleJavaMail could/would do this.

I've been searching the docs and issues, but could not find a clear answer. Is there a config for this? Or do I have to implement it myself?

Some code:

Email email = EmailBuilder.startingBlank()
        .withSubject("Test subject")
        .from("me@example.com")
        .to("you@exämple.com") // Expected: "To: you@xn--exmple-cua.com"
        .withHTMLText("<b>Bold html text!</b>")
        .buildEmail();

Mailer mailer = MailerBuilder
        .withTransportModeLoggingOnly()
        .buildMailer();

mailer.sendMail(email);

RohanNagar commented 1 year ago

Just chiming in here, it is pretty easy to convert the domain to ASCII yourself in Java if it is not an option in SimpleJavaMail:

String convertedDomain = IDN.toASCII(domain, IDN.ALLOW_UNASSIGNED);
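
For a whole address rather than just the domain, a minimal sketch could look like the following. The class and the helper name `toAsciiAddress` are made up for illustration and are not part of Simple Java Mail or JMail. The sketch assumes a plain `local@domain` address with no display name or comments, and it only converts the part after the last `@`.

```java
import java.net.IDN;

class IdnAddressSketch {

    // Hypothetical helper: converts only the domain part to its ASCII (xn--) form.
    // Assumes a plain "local@domain" address without display name or comments.
    static String toAsciiAddress(String address) {
        int at = address.lastIndexOf('@');
        if (at <= 0 || at == address.length() - 1) {
            throw new IllegalArgumentException("Not a simple local@domain address: " + address);
        }
        String localPart = address.substring(0, at);
        String domain = address.substring(at + 1);
        // The local part is left untouched; only the domain is converted.
        return localPart + "@" + IDN.toASCII(domain, IDN.ALLOW_UNASSIGNED);
    }

    public static void main(String[] args) {
        // \u00e4 is "a with diaeresis", escaped so the source file stays ASCII-only.
        System.out.println(toAsciiAddress("you@ex\u00e4mple.com"));
        // prints "you@xn--exmple-cua.com"
    }
}
```

The result can then be passed to `.to(...)` in place of the original address.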

JamesBoon commented 1 year ago

Hi @RohanNagar, thank you very much.

Does something like the following lines make sense?

com.sanctionco.jmail.Email rawTo = JMail.validator().tryParse("you@exämple.com").get();

String to = rawTo.localPartWithoutComments() + "@" +
        rawTo.domainParts().stream()
                .map(part -> IDN.toASCII(part, IDN.ALLOW_UNASSIGNED))
                .collect(Collectors.joining("."));
// ...

btw. thank you for JMail :smiley:

If I may, I would like to ask you one more question: why IDN.ALLOW_UNASSIGNED?

RohanNagar commented 1 year ago

Your example should work! I think you could make it even simpler if you want:

com.sanctionco.jmail.Email rawTo = JMail.tryParse("you@exämple.com").get();

String to = rawTo.localPartWithoutComments() + "@" +
        IDN.toASCII(rawTo.domainWithoutComments(), IDN.ALLOW_UNASSIGNED);

// ...

I started using IDN.ALLOW_UNASSIGNED once I realized that emojis could be included in the domain. If you know that your domain will not contain emojis you could probably leave that out.
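
Some background on why the flag matters: `java.net.IDN` implements the older IDNA2003 rules (RFC 3490), which are based on Unicode 3.2, so code points added to Unicode later (emojis among them) count as "unassigned". The sketch below tries the same emoji domain with and without the flag and prints whatever the JDK does. The class and the `describe` helper are hypothetical, and no output is shown because the exact result may depend on the JDK.

```java
import java.net.IDN;

class AllowUnassignedSketch {

    // Hypothetical helper: returns the ASCII form, or a short message if the JDK rejects the input.
    static String describe(String domain, int flags) {
        try {
            return IDN.toASCII(domain, flags);
        } catch (IllegalArgumentException e) {
            return "rejected: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // U+1F600 (grinning face), written as a surrogate pair to keep the source ASCII-only.
        String emojiDomain = "\uD83D\uDE00.com";

        // 0 means "no flags", the same as calling IDN.toASCII(domain).
        System.out.println("default flags:    " + describe(emojiDomain, 0));
        System.out.println("ALLOW_UNASSIGNED: " + describe(emojiDomain, IDN.ALLOW_UNASSIGNED));
    }
}
```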

bbottema commented 1 year ago

Hey guys, thanks for reaching out. I'm wondering now: is this something Simple Java Mail should take care of internally, and is it safe to make those assumptions in all cases? Or is this really something that's up to the users?

JamesBoon commented 1 year ago

Hi @RohanNagar, thank you, that is just perfect :smile: Didn't think it could be so easy.

Hi @bbottema, I think it would be super helpful if Simple Java Mail had at least an option to easily switch this transformation on (so that it would not be a breaking change).

RohanNagar commented 1 year ago

@bbottema Internationalized domain names are valid as of RFC 6530. I believe it is supposed to be the mail server's responsibility to map the IDN to ASCII, but I'm sure there are many legacy servers not doing that.

I think it would probably be ok to make the assumption to convert to ASCII in all cases, but I can't be 100% sure.

Wikipedia has a high level explanation: https://en.wikipedia.org/wiki/Email_address#Internationalization

JamesBoon commented 1 year ago

I am far from an expert on IDN, but from what I have read, there are two specifications: the old "IDNA2003" and the current "IDNA2008". They treat a few characters very differently. See https://unicode.org/faq/idn.html

When testing the example given on that page, the Java method IDN.toASCII returns the old "IDNA2003" result (tested with Java 19):

System.out.println(IDN.toASCII("faß.de", IDN.ALLOW_UNASSIGNED));
// Result: "fass.de"
// Expected: "xn--fa-hia.de"

If you have been using Simple Java Mail with a service like Gmail, the service will do the conversion for you and will probably do it correctly, that is, according to the current specification.

That is why I think it would be a breaking change if Simple Java Mail automatically converted all domain names.
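
To make the risk visible before converting, one option is to check for the characters the two specifications treat differently. The sketch below assumes the four "deviation" characters described in the Unicode FAQ linked above (ß, Greek final sigma, zero width joiner and zero width non-joiner). The class and the `hasDeviationCharacter` helper are hypothetical and only flag a domain; they do not perform an IDNA2008 conversion.

```java
import java.net.IDN;

class DeviationCheckSketch {

    // Hypothetical helper: true if the domain contains one of the four characters
    // that IDNA2003 and IDNA2008 handle differently.
    static boolean hasDeviationCharacter(String domain) {
        return domain.codePoints().anyMatch(cp ->
                cp == 0x00DF       // sharp s
                || cp == 0x03C2    // Greek final sigma
                || cp == 0x200C    // zero width non-joiner
                || cp == 0x200D);  // zero width joiner
    }

    public static void main(String[] args) {
        // \u00DF is the sharp s, escaped so the source file stays ASCII-only.
        String domain = "fa\u00DF.de";

        if (hasDeviationCharacter(domain)) {
            System.out.println("IDNA2003 and IDNA2008 results may differ for this domain");
        }
        // java.net.IDN applies the IDNA2003 mapping, so the sharp s becomes "ss".
        System.out.println(IDN.toASCII(domain, IDN.ALLOW_UNASSIGNED));
        // prints "fass.de"
    }
}
```

A check like this could be used to log a warning or to skip the automatic conversion for the affected addresses.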

JamesBoon commented 1 year ago

This IDN problem is a lot more complicated than I thought. There are only very few libraries that actually handle these names correctly according to the current "IDNA2008" standard.

I have found some very interesting readings at: https://community.icann.org/display/TUA/UA+Training+Materials

The one library that was referred to as being "The gold standard library for Unicode" is: https://mvnrepository.com/artifact/com.ibm.icu/icu4j/73.1

Using it with the example "faß.de" above, it returns the expected result:

import com.ibm.icu.text.IDNA;

IDNA validator = IDNA.getUTS46Instance(
        IDNA.NONTRANSITIONAL_TO_ASCII
                | IDNA.NONTRANSITIONAL_TO_UNICODE
                | IDNA.CHECK_BIDI
                | IDNA.CHECK_CONTEXTJ
                | IDNA.CHECK_CONTEXTO
                | IDNA.USE_STD3_RULES);

IDNA.Info info = new IDNA.Info();
StringBuilder output = new StringBuilder();

validator.nameToASCII("faß.de", output, info);

System.out.println(output);
// Result: "xn--fa-hia.de"

I do not know what this means for Simple Java Mail or JMail. I just wanted to share my findings and hope it helps.

bbottema commented 1 year ago

I think this is beyond me, and beyond the scope of Simple Java Mail. I'm going to close this as won't fix, unless someone has a very good argument to the contrary. Thanks for looking into this!

JamesBoon commented 1 year ago

Closing this is perfectly fine with me.
However, I have a question: would it be ok for you to add a note to the documentation that users may have to implement the handling of internationalized domain names themselves? That would have helped me a lot in the first place :smiley:

bbottema commented 1 year ago

Where exactly? I would be happy to accept a pull request too, btw

JamesBoon commented 1 year ago

I'd love to contribute! But I'm not sure where the note will fit.

JamesBoon commented 1 year ago

Sigh. I give up. I couldn't find a good place to add such a note to the docs. Maybe it is enough that this issue exists and has enough resources to get someone started.

Thank you @bbottema and @RohanNagar for your time and help!

JamesBoon commented 1 year ago

If anyone is interested, I've created a gist to play around with the email domain names: IdnEmailExample.java

RohanNagar commented 1 year ago

@JamesBoon thanks so much for digging into this! I'm actually very interested in implementing this better in JMail and have opened an issue to implement this (linked just above this comment).