gen-smtp / gen_smtp

The extensible Erlang SMTP client and server library.

Error with character outside ASCII range in email address #342

Open davidkong opened 3 months ago

davidkong commented 3 months ago

I'm seeing errors when trying to send emails to users with non-ASCII characters in their email address, for instance the Unicode character ้.

The error is raised here: https://github.com/gen-smtp/gen_smtp/blob/da7893dbe5dc20f1d6137141a4ec49f910a7cef6/src/smtp_util.erl#L187 with the message "1st argument: not an iodata term".

(We are calling via the Elixir Swoosh library).

adamu commented 3 weeks ago

I'm also seeing this via Swoosh. Then again, RFC 822 specifies that email addresses must be ASCII, so maybe the real problem is that combine_rfc822_addresses is being called at all?

https://github.com/gen-smtp/gen_smtp/blob/685fc92893297d8d726cae1029b9568ae79a0ead/src/mimemail.erl#L992-L1011

adamu commented 3 weeks ago

Here are steps to reproduce in Elixir:

Mix.install [:swoosh, :gen_smtp, :hackney]
mail = %Swoosh.Email{to: [{"", "ｈｅｌｌｏ@example.com"}], from: [{"", "test@example.com"}], text_body: "test"}
Swoosh.Adapters.AmazonSES.deliver(mail)
** (ArgumentError) errors were found at the given arguments:

  * 1st argument: not an iodata term

    :erlang.iolist_to_binary([[65352, 65349, 65356, 65356, 65359, 64, 101, 120, 97, 109, 112, 108, 101, 46, 99, 111, 109]])
    (gen_smtp 1.2.0) gen_smtp/src/smtp_util.erl:193: :smtp_util.combine_rfc822_addresses/1
    (gen_smtp 1.2.0) gen_smtp/src/mimemail.erl:963: :mimemail.encode_headers/1
    (gen_smtp 1.2.0) gen_smtp/src/mimemail.erl:964: :mimemail.encode_headers/1
    (gen_smtp 1.2.0) gen_smtp/src/mimemail.erl:210: :mimemail.encode/2
    (swoosh 1.16.12) lib/swoosh/adapters/amazon_ses.ex:226: Swoosh.Adapters.AmazonSES.generate_raw_message_data/2
    (swoosh 1.16.12) lib/swoosh/adapters/amazon_ses.ex:163: Swoosh.Adapters.AmazonSES.prepare_body/2
    iex:3: (file)
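
The failure in the trace above can be reproduced without Swoosh or gen_smtp. The following is a minimal sketch, not the library's code: the module name `iodata_demo` is made up for illustration, and the code points are copied from the trace (fullwidth "ｈｅｌｌｏ" followed by "@example.com"). It shows that `iolist_to_binary/1` rejects integers above 255, while `unicode:characters_to_binary/1` accepts Unicode code points and encodes them as UTF-8.

```erlang
-module(iodata_demo).
-export([run/0]).

run() ->
    %% Code points taken from the stack trace: fullwidth "hello" + "@example.com".
    Address = [65352, 65349, 65356, 65356, 65359 | "@example.com"],
    %% iodata may only contain bytes (0..255), so this call raises badarg.
    Failed =
        try iolist_to_binary([Address]) of
            _ -> false
        catch
            error:badarg -> true
        end,
    %% chardata may contain any Unicode code point; the result is a UTF-8 binary.
    Utf8 = unicode:characters_to_binary([Address]),
    {Failed, Utf8}.
```

Whether switching the header encoder to a Unicode-aware conversion is the right fix is a separate question, since an RFC 822 header is still expected to be ASCII.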

mworrell commented 2 weeks ago

RFC 822 is quite specific that the address must use ASCII characters only. Special characters in that range (for example quotes) will be escaped.

I see in the code that non-ASCII name parts are accepted and encoded as UTF-8. The code does assume that the local-part is ASCII though.

RFC 6531 does allow Unicode local-parts and domain names. From a quick reading of the RFC, I get the feeling that SMTPUTF8 support needs more than just some local-part encoding.

UPDATE: I see in the code that we do have SMTPUTF8 support for the server side.

From the tests:

                    ?assertMatch("250 SMTPUTF8" ++ _, Packet34),
                    smtp_socket:send(
                        CSock, <<"MAIL FROM: <испытание@пример.испытание> SMTPUTF8\r\n"/utf8>>
                    ),

So maybe it is less work than I first expected.
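
One piece of client-side support would be deciding when an address needs the SMTPUTF8 parameter at all. The sketch below is only an illustration under stated assumptions: the module `smtputf8_sketch` and both functions are hypothetical and not part of gen_smtp, and the command format simply mirrors the MAIL FROM line in the test above. A real client would also have to check that the server advertised SMTPUTF8 in its EHLO response before sending the parameter.

```erlang
-module(smtputf8_sketch).
-export([needs_smtputf8/1, mail_from/1]).

%% Hypothetical helper: true when the address (a UTF-8 binary or a list of
%% code points) contains anything outside 7-bit ASCII.
needs_smtputf8(Address) ->
    case unicode:characters_to_list(Address) of
        Chars when is_list(Chars) ->
            lists:any(fun(C) -> C > 127 end, Chars);
        _Invalid ->
            %% characters_to_list/1 returns a tuple on invalid input.
            error(badarg)
    end.

%% Hypothetical helper: builds the MAIL FROM command as a UTF-8 binary and
%% appends the SMTPUTF8 parameter only when the address requires it.
mail_from(Address) ->
    Suffix =
        case needs_smtputf8(Address) of
            true -> " SMTPUTF8";
            false -> ""
        end,
    unicode:characters_to_binary(["MAIL FROM: <", Address, ">", Suffix, "\r\n"]).
```

For a plain ASCII address the command is unchanged, so existing behaviour would not be affected by a check like this.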