Closed andrius4669 closed 5 years ago
This issue seems like it is really two separate issues:
(1) is a bug-fix. (2) is a feature request; please file it separately.
CC @mpvl @bradfitz
fix for 1 seems easy, either change range to use []byte() or change check to compare >= 0x80. I guess you're right about 2 being kind of separate issue, I will probably fill feature request for that. I've already changed my code to accomodate FormatMediaType failing and retrying with parameter encoded with mime.BEncoding, which works okay with email clients I tested with.
On 2018 11 19 16:05:08 UTC, "Bryan C. Mills" notifications@github.com wrote:
This issue seems like it is really two separate issues:
- The current validation checks runes instead of bytes.
- Ideally you'd like RFC 2231 encoding.
(1) is a bug-fix. (2) is a feature request; please file it separately.
Change https://golang.org/cl/150417 mentions this issue: mime: correctly detect non-ASCII characters in FormatMediaType
Change https://golang.org/cl/150498 mentions this issue: mime: remove allocation introduced in recent fix
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
yeah.
What operating system and processor architecture are you using (
go env
)?irrelevant.
What did you do?
https://play.golang.org/p/mkEHhue4cKm Tried
FormatMediaType
with different unicode characters in parameter values.What did you expect to see?
One of:
FormatMediaType
encoding parameters with any unicode characters as per RFC 2231 section 4.FormatMediaType
consistently accepting all unicode characters. That's probably not really compliant with current standards, but works with few email clients I tried. I like RFC 6532, but it doesn't have any mention ofContent-Type
orContent-Disposition
though, so this option can probably be ignored.FormatMediaType
consistently rejecting all characters impossible to represent by US-ASCII.What did you see instead?
Only characters covered by ISO-8859-1 are detected as unicode and cause encoding failure. Characters whose rune values (actual unicode values, not UTF-8 encoded) aren't
& 0x80
pass test and are included in output.Additional info & reporter's opinion
I noticed this looking at source code of
FormatMediaType
and noticing that it containsif character&0x80 != 0
check, however it iterates thru string withrange value
(notrange []byte(value)
which would be correct). I really don't like that it doesn't at least support RFC 2231 encoding, now I will either need to reimplement this function myself, or encode filename parameter using eithermime.BEncoding
ormime.QEncoding
. Adding proper check there and failing to encode could cause some issues if implementations relied on old broken behavior, I guess.