Closed candeemis closed 2 weeks ago
Based on RFC 2045, Section 5.1:
parameter := attribute "=" value
attribute := token
; Matching of attributes
; is ALWAYS case-insensitive.
value := token / quoted-string
token := 1*<any (US-ASCII) CHAR except SPACE, CTLs,
or tspecials>
That looks like a malformed header
@seankhliao But the following doesn't rule out spaces in the value, no? 🤔
value := token / quoted-string
token := 1*<any (US-ASCII) CHAR except SPACE, CTLs,
or tspecials>
tspecials := "(" / ")" / "<" / ">" / "@" /
"," / ";" / ":" / "\" / <">
"/" / "[" / "]" / "?" / "="
; Must be in quoted-string,
; to use within parameter values
It could contain spaces, but only inside a quoted string, not in the bare token you demonstrated.
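To make the token rule concrete, here is a minimal sketch of the grammar quoted above. It is not the code in package mime; `isTokenChar` and `leadingToken` are hypothetical helpers written only for illustration. It shows where a bare token ends when the value is not quoted.

```go
package main

import (
	"fmt"
	"strings"
)

// tspecials lists the RFC 2045 characters that may not appear in a bare token.
const tspecials = `()<>@,;:\"/[]?=`

// isTokenChar reports whether r may appear in an RFC 2045 token:
// any US-ASCII character except SPACE, CTLs, and tspecials.
func isTokenChar(r rune) bool {
	return r > 0x20 && r < 0x7f && !strings.ContainsRune(tspecials, r)
}

// leadingToken returns the longest prefix of s made up of token characters.
// Hypothetical helper for illustration; not part of package mime.
func leadingToken(s string) string {
	for i, r := range s {
		if !isTokenChar(r) {
			return s[:i]
		}
	}
	return s
}

func main() {
	// The unquoted value from the report: the token ends at the first space.
	fmt.Printf("%q\n", leadingToken("RG Mellowmessage 4.12.20.pdf")) // "RG"
	// A value without spaces or tspecials is a single token.
	fmt.Printf("%q\n", leadingToken("0644")) // "0644"
}
```

Under this rule the bare value ends at `RG`, and the remaining text is neither a `;` nor a new `attribute=value` pair, which is why the header is malformed.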
@candeemis Do you agree that the mime.ParseMediaType behavior of reporting an error on that input is correct, in that it matches the specification?
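For reference, a small sketch that contrasts the input from the report with the same header once the value is quoted. `parseName` is a hypothetical wrapper introduced here; the only library calls are `mime.ParseMediaType` and the exported `mime.ErrInvalidMediaParameter`.

```go
package main

import (
	"errors"
	"fmt"
	"mime"
)

// parseName parses a Content-Type value and returns its "name" parameter.
// Hypothetical helper for illustration.
func parseName(contentType string) (string, error) {
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return "", err
	}
	return params["name"], nil
}

func main() {
	bare := `application/pdf; x-unix-mode=0644; name=RG Mellowmessage 4.12.20.pdf`
	quoted := `application/pdf; x-unix-mode=0644; name="RG Mellowmessage 4.12.20.pdf"`

	// The bare value contains spaces, so the strict parser rejects the header.
	_, err := parseName(bare)
	fmt.Println(errors.Is(err, mime.ErrInvalidMediaParameter))

	// The quoted form is valid per RFC 2045 and parses to the full file name.
	name, err := parseName(quoted)
	fmt.Printf("%q %v\n", name, err)
}
```

The fix on the producing side is therefore to quote the value; the parser already accepts spaces in that form.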
@dmitshur Well, I can say that the current implementation satisfies the RFC specification. Still, I see room for improvement: a slightly different implementation could go one step further and not only satisfy the specification but also tolerate minor problems like the one mentioned in this issue. I am working on that concept, but due to time constraints it may take me a while.
@candeemis Do you mean like observing Postel's Law?
Sorry for the very late reply. But yes, as kode4food mentioned above, the implementation should be a bit more lenient, as suggested by Postel's Law (the Robustness Principle).
the robustness principle is now generally considered to be a bad idea, see https://datatracker.ietf.org/doc/html/rfc9413
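Callers that still have to accept such headers can add leniency on their own side without changing package mime. The following is only a sketch under stated assumptions: `parseLenient` is a hypothetical helper, it retries only after the strict parse fails, and it splits on `;`, so it assumes no parameter value contains a semicolon.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// parseLenient is a hypothetical caller-side helper, not part of package mime.
// It tries the strict parser first and, only on failure, retries with every
// unquoted parameter value wrapped in quotes.
// Assumption: no parameter value contains a ';'.
func parseLenient(v string) (string, map[string]string, error) {
	mediatype, params, err := mime.ParseMediaType(v)
	if err == nil {
		return mediatype, params, nil
	}
	parts := strings.Split(v, ";")
	for i, part := range parts[1:] {
		key, value, ok := strings.Cut(part, "=")
		value = strings.TrimSpace(value)
		if !ok || strings.HasPrefix(value, `"`) {
			continue // not a key=value pair, or already quoted
		}
		// Escape backslashes and quotes so the result is a valid quoted-string.
		value = strings.NewReplacer(`\`, `\\`, `"`, `\"`).Replace(value)
		parts[i+1] = key + `="` + value + `"`
	}
	return mime.ParseMediaType(strings.Join(parts, ";"))
}

func main() {
	mt, params, err := parseLenient(`application/pdf; x-unix-mode=0644; name=RG Mellowmessage 4.12.20.pdf`)
	fmt.Println(mt, params["name"], err)
}
```

Keeping the workaround in the caller leaves the standard parser strict, which is in line with the direction of RFC 9413.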
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
What did you do?
After spending a few hours debugging, I found that the ParseMediaType function in mime/mediatype.go returns an error while parsing the media type from the following content type:
application/pdf; x-unix-mode=0644; name=RG Mellowmessage 4.12.20.pdf
error:
ParseMediaType internally calls another function, consumeMediaParam, to parse the individual media parameters. For example, in the content type above, x-unix-mode and name (which is the name of the attached file) are two separate parameters. consumeMediaParam incorrectly parses a parameter value if it contains any white space. For example, name contains white space in the content type above, so it assigns only RG as the value of the name key and keeps trying to parse the rest. In the rest, however, there isn't any key separated by an = sign. That's why it returns the above-mentioned error.
What did you expect to see?
It should parse RG Mellowmessage 4.12.20.pdf as the value of the name key, which means it should not treat the white space as the end of the value.
What did you see instead?
The following error: