ietf-wg-emailcore / emailcore

Advice against using URL %-encoding on non-ASCII email addresses to create ASCII versions of them #78

Open aamelnikov opened 2 years ago

aamelnikov commented 2 years ago

John Klensin wrote:

Also, given [mis]behavior seen in the wild, does that paragraph (or an A/S) need
an explicit caution about SMTP servers or clients assuming they can
apply the popular web convention of using %NN sequences as a way
to encode non-ASCII characters (<pct-encoded> in RFC 3986) and
assuming some later system will interpret it as they expect?

toddherr commented 1 year ago

Levine to provide note re: punycode to expand text in 4.2 in A/S

Ask on list if section 4.2 in the A/S addresses this

toddherr commented 12 months ago

From Levine...

Current text:

4.2. Use of non-ASCII Characters

Proper generation and transmission of email addresses containing non-ASCII characters is discussed in [RFC6530]. SMTP clients and servers that attempt to use the popular web convention of Percent-Encoding non-ASCII characters (see Section 2.1 of [RFC3986]) SHALL NOT assume that a downstream system will interpret the email address accordingly without prior knowledge.

Proposed text to replace the above:

Proper generation and transmission of email addresses containing non-ASCII characters is discussed in [RFC6530]. Section 9 of [RFC6530] says: "a downgrade mechanism that transforms the local part of an email address cannot be utilized in transit." Hence SMTP clients and servers MUST NOT try to encode non-ASCII email addresses as ASCII addresses. In particular, they MUST NOT use web URI percent encoding (see Section 2.1 of [RFC3986]) nor Internationalized Domain Names for Applications punycode (see section 4.4 of [RFC5891]) since neither will produce a valid address.

In some cases, servers or clients may be able to use local knowledge to substitute ASCII addresses for specific non-ASCII addresses, but that is beyond the scope of this memo. See Section 8 of [RFC6530] for further discussion.

toddherr commented 11 months ago

Final text, per thread started here - https://mailarchive.ietf.org/arch/msg/emailcore/1qJu66_6bCMuO5HtQPWj8Iw4dAk/

Proper generation and transmission of email addresses containing non-ASCII
characters is discussed in [RFC6530]. Section 9 of [RFC6530] says: "a
downgrade mechanism that transforms the local part of an email address
cannot be utilized in transit." Hence SMTP clients and servers MUST NOT
try to encode non-ASCII email addresses as ASCII addresses. In particular, 
they MUST NOT use web URI percent encoding (see Section 2.1 of [RFC3986]) 
nor Internationalized Domain Names for Applications (IDNA)  (see section 4.4 
of [RFC5891]) Punycode [RFC3492] in the local-part of an address, nor the 
former in the domain-part, since neither will produce a valid address.

In some cases, servers or clients may be able to use local knowledge to
substitute ASCII addresses for specific non-ASCII addresses, but that is
beyond the scope of this memo. See Section 8 of [RFC6530] for further
discussion.

ksmurchison commented 5 months ago

Applied the above text to draft -10.

ksmurchison commented 1 month ago

Applied the following text, as discussed on the list, to draft -11

4.2.  Use of non-ASCII Characters

   Proper generation and transmission of email addresses containing non-
   ASCII characters is discussed in [RFC6530].  Section 9 of [RFC6530]
   says: "a downgrade mechanism that transforms the local part of an
   email address cannot be utilized in transit."  This is actually just
   a special case of a principle, discussed in Section 2.3.11 of
   [I-D.ietf-emailcore-rfc5321bis] and elsewhere, that nothing other
   than the final delivery system should attempt to interpret or alter
   the local-part of an address.  In particular, they MUST NOT:

   *  use web URI percent encoding (see Section 2.1 of [RFC3986]) in
      either the local-part or the domain-part of an address

   *  perform Internationalized Domain Names for Applications (IDNA)
      Punycode Converstion (see Section 4.4 of [RFC5891]) on the domain-
      part of an address

   since none of these encodings will produce an address that is
   guaranteed to be treated as equivalent to the original one.

   In some cases, servers or clients may be able to use local knowledge
   to substitute ASCII addresses for specific non-ASCII addresses, but
   that is beyond the scope of this memo.  See Section 8 of [RFC6530]
   for further discussion.
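
As an illustration only (not part of the draft text), here is a minimal Python sketch of why these encodings are unsafe in transit. The helper name `percent_encode_local_part` and the addresses are hypothetical. The sketch assumes a misbehaving client that percent-encodes the local-part and a downstream system that compares addresses as plain strings.

```python
from urllib.parse import quote, unquote


def percent_encode_local_part(address: str) -> str:
    """Hypothetical helper: mimics the misbehavior the text prohibits."""
    # Split on the last "@" so the domain-part is left untouched.
    local, _, domain = address.rpartition("@")
    # safe="" forces every reserved and non-ASCII octet to become %NN.
    return f"{quote(local, safe='')}@{domain}"


original = "josé@example.com"
rewritten = percent_encode_local_part(original)

print(rewritten)              # jos%C3%A9@example.com
print(rewritten == original)  # False

# "%" is an ordinary character in an ASCII local-part, so a downstream
# system has no signal that it is expected to decode anything. Only a
# system that already knows the convention recovers the original.
print(unquote(rewritten) == original)  # True

# The same applies to the domain-part: the A-label form produced by
# Python's built-in "idna" codec is a different string from the original.
a_label = "bücher.example".encode("idna").decode("ascii")
print(a_label)                      # xn--bcher-kva.example
print(a_label == "bücher.example")  # False
```

The rewritten address is all ASCII and still looks like a usable address, which is the hazard: a later system will probably treat it as a different mailbox rather than decode it.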

aamelnikov commented 1 month ago

Looks good. Note a typo: Converstion --> Conversion

ksmurchison commented 2 weeks ago

The folks in the room at IETF 121 were happy with the new text in -12, with the exception of a typo that has been fixed in -13.