httpwg / httpbis-issues

Character Encodings in TEXT #74

Closed mnot closed 3 years ago

mnot commented 16 years ago

RFC 2616 prescribes that headers containing non-ASCII have to use either iso-8859-1 or RFC 2047. This is unnecessarily complex and not necessarily followed by implementations or by specifications of new headers.

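This issue is limited to:

  • determining whether UTF-8 can be allowed in some way (e.g., in current uses of TEXT, and/or new headers), and
  • possibly tightening up use of iso-8859-1 in TEXT (in particular, C1 controls).
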
See also #63, #111.
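
To make the two RFC 2616 options concrete, here is a minimal Python sketch of the same value sent as ISO-8859-1 octets and as an RFC 2047 encoded-word, plus what a recipient sees if a sender puts raw UTF-8 in the field instead. The sample value and variable names are illustrative assumptions, not taken from any specification.

```python
from email.header import Header, decode_header

value = "Dürst"  # sample value, assumed for illustration

# Option 1 in RFC 2616: raw ISO-8859-1 octets, one byte per character.
latin1_bytes = value.encode("iso-8859-1")

# Option 2 in RFC 2616: an RFC 2047 encoded-word, pure ASCII on the wire.
encoded_word = Header(value, charset="utf-8").encode()

# What some senders do in practice: raw UTF-8 octets in the field value.
utf8_bytes = value.encode("utf-8")

# A recipient that assumes ISO-8859-1 misreads those UTF-8 octets.
misread = utf8_bytes.decode("iso-8859-1")

# Decoding the encoded-word recovers the original text.
raw, charset = decode_header(encoded_word)[0]
roundtrip = raw.decode(charset)

print(latin1_bytes)   # b'D\xfcrst'
print(encoded_word)
print(misread)        # DÃ¼rst
print(roundtrip)      # Dürst
```

The recipient cannot tell from the octets alone which of the three forms was intended, which is the complexity this issue is about.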

Reported by @mnot, migrated from https://trac.ietf.org/trac/httpbis/ticket/74

mnot commented 16 years ago

There was discussion of this at the APPS Area Architecture Workshop, with some disagreement as to whether it's possible to encode IRI->URI->IRI. Specific advice to IRIs may be necessary.

mnot commented 16 years ago

julian.reschke@gmx.de commented:

Replying to [comment:2 mnot@pobox.com]:

There was discussion of this at the APPS Area Architecture Workshop, with some disagreement as to whether it's possible to encode IRI->URI->IRI. Specific advice to IRIs may be necessary.

Is this about round-tripping IRIs through URIs? Obviously that's not possible.

For example, consider the two IRIs:

I1: http://www.example.org/Dürst

I2: http://www.example.org/D%C3%BCrst

Both would be converted to the URI:

U: http://www.example.org/D%C3%BCrst

Now whether that distinction is relevant of course depends on which kind of URI/IRI comparison is needed; but there are cases where it is relevant (for instance, XML namespace names using IRIs (urg!)).

(see also http://tools.ietf.org/html/rfc3987#section-3.2.1)
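
The collision described above can be reproduced with a simplified IRI-to-URI mapping. This is a rough sketch only: `iri_to_uri` is a hypothetical helper that percent-encodes non-ASCII characters as UTF-8 and leaves existing escapes alone. It ignores the host-name (IDNA) handling that a full RFC 3987 conversion would need.

```python
from urllib.parse import quote


def iri_to_uri(iri: str) -> str:
    """Percent-encode non-ASCII characters as UTF-8 (simplified sketch)."""
    # '%' and the URI reserved/unreserved punctuation are kept as-is,
    # so octets that are already percent-encoded pass through unchanged.
    return quote(iri, safe="%:/?#[]@!$&'()*+,;=-._~")


i1 = "http://www.example.org/Dürst"
i2 = "http://www.example.org/D%C3%BCrst"

u1 = iri_to_uri(i1)
u2 = iri_to_uri(i2)

print(u1)        # http://www.example.org/D%C3%BCrst
print(u2)        # http://www.example.org/D%C3%BCrst
print(u1 == u2)  # True
```

Since both inputs yield the same URI, no inverse mapping can tell whether U came from I1 or I2.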

mnot commented 16 years ago

@mnot changed description from:

RFC 2616 prescribes that headers containing non-ASCII have to use either iso-8859-1 or RFC 2047. This is unnecessarily complex and not necessarily followed by implementations or by specifications of new headers.

to:

RFC 2616 prescribes that headers containing non-ASCII have to use either iso-8859-1 or RFC 2047. This is unnecessarily complex and not necessarily followed by implementations or by specifications of new headers.

This issue is limited to determining whether UTF-8 can be allowed in some way; see also #63,

mnot commented 16 years ago

@mnot changed description from:

RFC 2616 prescribes that headers containing non-ASCII have to use either iso-8859-1 or RFC 2047. This is unnecessarily complex and not necessarily followed by implementations or by specifications of new headers.

This issue is limited to determining whether UTF-8 can be allowed in some way; see also #63,

to:

RFC 2616 prescribes that headers containing non-ASCII have to use either iso-8859-1 or RFC 2047. This is unnecessarily complex and not necessarily followed by implementations or by specifications of new headers.

This issue is limited to:

  • determining whether UTF-8 can be allowed in some way (e.g., in current uses of TEXT, and/or new headers), and
  • possibly tightening up use of iso-8859-1 (in particular, C1 controls).

See also #63, #111.

mnot commented 16 years ago

@mnot changed description from:

RFC 2616 prescribes that headers containing non-ASCII have to use either iso-8859-1 or RFC 2047. This is unnecessarily complex and not necessarily followed by implementations or by specifications of new headers.

This issue is limited to:

  • determining whether UTF-8 can be allowed in some way (e.g., in current uses of TEXT, and/or new headers), and
  • possibly tightening up use of iso-8859-1 (in particular, C1 controls).

See also #63, #111.

to:

RFC 2616 prescribes that headers containing non-ASCII have to use either iso-8859-1 or RFC 2047. This is unnecessarily complex and not necessarily followed by implementations or by specifications of new headers.

This issue is limited to:

  • determining whether UTF-8 can be allowed in some way (e.g., in current uses of TEXT, and/or new headers), and
  • possibly tightening up use of iso-8859-1 in TEXT (in particular, C1 controls).

See also #63, #111.
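
As an illustration of the second bullet, the C1 controls are the octets 0x80 through 0x9F in iso-8859-1. A check for them could look roughly like the sketch below; `find_c1_controls` is a hypothetical name, and how a recipient should react to a hit is left open here.

```python
def find_c1_controls(field_value: bytes) -> list[int]:
    """Return the offsets of octets in the C1 control range (0x80-0x9F)."""
    return [i for i, octet in enumerate(field_value) if 0x80 <= octet <= 0x9F]


ok = "Dürst".encode("iso-8859-1")  # 0xFC is a printable Latin-1 letter
bad = b"caf\x85"                   # 0x85 is NEL, a C1 control

print(find_c1_controls(ok))   # []
print(find_c1_controls(bad))  # [3]
```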

mnot commented 16 years ago

@mnot changed milestone from unassigned to 06

mnot commented 16 years ago

fielding@gbiv.com commented:

From 395:

Deprecate line folding, addresses #77. Require that invalid whitespace around field-names be rejected, addresses #30. Make non-ASCII content obsolete and opaque in header fields and reason phrase, addresses #63, #74, #94, #111.
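
One possible reading of these three changes, as a recipient-side sketch: `parse_header_line` is a hypothetical helper, not spec text. It treats the field value as opaque octets and never guesses a charset. Rejecting folded lines outright is a policy chosen here for illustration only.

```python
def parse_header_line(line: bytes) -> tuple[str, bytes]:
    """Split one header line into (field-name, opaque field-value octets)."""
    if line[:1] in (b" ", b"\t"):
        # A leading space or tab marks a folded continuation line.
        raise ValueError("line folding is deprecated")
    name, sep, value = line.partition(b":")
    if not sep:
        raise ValueError("missing colon")
    if not name or name != name.strip(b" \t"):
        raise ValueError("invalid whitespace around field-name")
    # Field names must be ASCII; a non-ASCII name raises UnicodeDecodeError,
    # which is a subclass of ValueError.
    field_name = name.decode("ascii")
    # The value stays as bytes: non-ASCII octets are kept but not interpreted.
    return field_name, value.strip(b" \t")


name, value = parse_header_line(b"X-Name: D\xfcrst")
print(name, value)  # X-Name b'D\xfcrst'
```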

mnot commented 16 years ago

Fixed in 398:

Resolve #63, #74, #94, #111: Issues around TEXT rule closed with revision 395 (closes #63, #74, #94, #111)

mnot commented 16 years ago

re-open until reviewed

mnot commented 16 years ago

Part 6 still allows RFC 2047 encoding for the Warning header.

mnot commented 16 years ago