golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.03k stars 17.54k forks source link

mime: FormatMediaType fails to detect non-ASCII bytes #28849

Closed andrius4669 closed 5 years ago

andrius4669 commented 5 years ago

What version of Go are you using (go version)?

go version go1.11.2 linux/amd64

Does this issue reproduce with the latest release?

yeah.

What operating system and processor architecture are you using (go env)?

irrelevant.

What did you do?

https://play.golang.org/p/mkEHhue4cKm Tried FormatMediaType with different unicode characters in parameter values.

What did you expect to see?

One of:

  1. FormatMediaType encoding parameters with any unicode characters as per RFC 2231 section 4.
  2. FormatMediaType consistently accepting all unicode characters. That's probably not really compliant with current standards, but works with few email clients I tried. I like RFC 6532, but it doesn't have any mention of Content-Type or Content-Disposition though, so this option can probably be ignored.
  3. FormatMediaType consistently rejecting all characters impossible to represent by US-ASCII.

What did you see instead?

Only characters covered by ISO-8859-1 are detected as unicode and cause encoding failure. Characters whose rune values (actual unicode values, not UTF-8 encoded) aren't & 0x80 pass test and are included in output.

Additional info & reporter's opinion

I noticed this looking at source code of FormatMediaType and noticing that it contains if character&0x80 != 0 check, however it iterates thru string with range value (not range []byte(value) which would be correct). I really don't like that it doesn't at least support RFC 2231 encoding, now I will either need to reimplement this function myself, or encode filename parameter using either mime.BEncoding or mime.QEncoding. Adding proper check there and failing to encode could cause some issues if implementations relied on old broken behavior, I guess.

bcmills commented 5 years ago

This issue seems like it is really two separate issues:

  1. The current validation checks runes instead of bytes.
  2. Ideally you'd like RFC 2231 encoding.

(1) is a bug-fix. (2) is a feature request; please file it separately.

bcmills commented 5 years ago

CC @mpvl @bradfitz

andrius4669 commented 5 years ago

fix for 1 seems easy, either change range to use []byte() or change check to compare >= 0x80. I guess you're right about 2 being kind of separate issue, I will probably fill feature request for that. I've already changed my code to accomodate FormatMediaType failing and retrying with parameter encoded with mime.BEncoding, which works okay with email clients I tested with.

On 2018 11 19 16:05:08 UTC, "Bryan C. Mills" notifications@github.com wrote:

This issue seems like it is really two separate issues:

  1. The current validation checks runes instead of bytes.
  2. Ideally you'd like RFC 2231 encoding.

(1) is a bug-fix. (2) is a feature request; please file it separately.

gopherbot commented 5 years ago

Change https://golang.org/cl/150417 mentions this issue: mime: correctly detect non-ASCII characters in FormatMediaType

gopherbot commented 5 years ago

Change https://golang.org/cl/150498 mentions this issue: mime: remove allocation introduced in recent fix