golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.03k stars 17.54k forks source link

mime: ParseMediaType reports an error on value that is not quoted but contains a space #43721

Closed candeemis closed 2 weeks ago

candeemis commented 3 years ago

What version of Go are you using (go version)?

$ go version 1.15

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/candeemis/Library/Caches/go-build"
GOENV="/Users/candeemis/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/candeemis/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/candeemis/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/candeemis/play-ground/go-mime-parser/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/z9/3q3874k11pd_h45kz05lngr40000gn/T/go-build706623001=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

After spending a few hours on debugging, I found that ParseMediaType function in mime/mediatype.go module throws the following error while parsing media type from the content type as application/pdf; x-unix-mode=0644; name=RG Mellowmessage 4.12.20.pdf error:

Failed to ReadParts: mime: invalid media parameter

ParseMediaType function internally calls another function consumeMediaParam to parse the individual media types for example from the above-mentioned content type x-unix-mode and name (which is the name of the attached file) are two separate types. consumeMediaParam incorrectly parses the media type values if they contain any white space. For example, name contains the white spaces in the above-mentioned content type. So it assigns RG only as of the value to name key and keeps trying to parse the rest. Whereas, in the rest, there isn’t any key separated by = sign. That's why it throws the above-mentioned error.

What did you expect to see?

It should parse RG Mellowmessage 4.12.20.pdf as the value of name key. Which means it should ignore the white space in the value.

What did you see instead?

The following error:

Failed to ReadParts: mime: invalid media parameter

seankhliao commented 3 years ago

based on RFC 2045 Section 5.1

 parameter := attribute "=" value

 attribute := token
              ; Matching of attributes
              ; is ALWAYS case-insensitive.

 value := token / quoted-string

 token := 1*<any (US-ASCII) CHAR except SPACE, CTLs,
             or tspecials>

That looks like a malformed header

candeemis commented 3 years ago

@seankhliao But the following doesn't rule out spaces in the value, no? 🤔

value := token / quoted-string

     token := 1*<any (US-ASCII) CHAR except SPACE, CTLs,
                 or tspecials>

     tspecials :=  "(" / ")" / "<" / ">" / "@" /
                   "," / ";" / ":" / "\" / <">
                   "/" / "[" / "]" / "?" / "="
                   ; Must be in quoted-string,
                   ; to use within parameter values
seankhliao commented 3 years ago

it could contain spaces, but in a quoted string, not the bare token you demonstrated

dmitshur commented 3 years ago

CC @bradfitz via owners.

dmitshur commented 3 years ago

@candeemis Do you agree that mime.ParseMediaType behavior of reporting an error on that input is correct, in that it matches the specification?

candeemis commented 3 years ago

@dmitshur Well, I can say that the current implementation satisfies the RFC specification. Meanwhile, I still see room for improvement to go one step further a bit different implementation. That would not only satisfy the specification but also ignore minor problems like the one mentioned in the issue. I am working on that concept, but due to time constraint, I may take a bit of time.

kode4food commented 3 years ago

@candeemis Do you mean like observing Postel's Law?

candeemis commented 2 weeks ago

Sorry for the very late reply. But yes, as kode4food mentioned above, we should be a bit easy on the implementation as suggested by the Postal's Law/Robustness Principle

seankhliao commented 2 weeks ago

the robustness principle is now generally considered to be a bad idea, see https://datatracker.ietf.org/doc/html/rfc9413