golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.4k stars 17.7k forks source link

net/http: ServeContent If-None-Match will always NOT match #41536

Closed chrispassas closed 4 years ago

chrispassas commented 4 years ago

What version of Go are you using (go version)?

$ go version
go version go1.15.2 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN="/Users/cpassas/go/bin"
GOCACHE="/Users/cpassas/Library/Caches/go-build"
GOENV="/Users/cpassas/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/cpassas/go/pkg/mod"
GOOS="darwin"
GOPATH="/Users/cpassas/go"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/kc/5pq72vhs41zd8y7dvq8hk_hm0000gn/T/go-build495563477=/tmp/go-build -gno-record-gcc-switches -fno-common"
GOROOT/bin/go version: go version go1.15.2 darwin/amd64
GOROOT/bin/go tool compile -V: compile version go1.15.2
uname -v: Darwin Kernel Version 19.6.0: Thu Jun 18 20:49:00 PDT 2020; root:xnu-6153.141.1~1/RELEASE_X86_64
ProductName:    Mac OS X
ProductVersion: 10.15.6
BuildVersion:   19G2021
lldb --version: lldb-1200.0.32.1
Apple Swift version 5.3 (swiftlang-1200.0.29.2 clang-1200.0.30.1)
gdb --version: GNU gdb (GDB) 8.0.1

What did you do?

Making a curl passing If-None-Match does not work as documented

Do to a bug If-None-Match will always not match. This makes the If-None-Match check useless in http.ServeContent()

Example Curl

curl -v  -H "If-None-Match: 8bf04cd320950949ce95d3a08bdd45e6" "https://foo.com/filetodownload.gz"

https://golang.org/pkg/net/http/#ServeContent

Documentation claims this function supports If-Match, If-None-Match, If-Range

If the caller has set w's ETag header formatted per RFC 7232, section 2.3, ServeContent uses it to handle requests using If-Match, If-None-Match, or If-Range.

Reviewing the code it appears this is not the case.

https://golang.org/src/net/http/fs.go?s=5158:5262#L145

 func ServeContent(w ResponseWriter, req *Request, name string, modtime time.Time, content io.ReadSeeker) {
    sizeFunc := func() (int64, error) {
        size, err := content.Seek(0, io.SeekEnd)
        if err != nil {
            return 0, errSeeker
        }
        _, err = content.Seek(0, io.SeekStart)
        if err != nil {
            return 0, errSeeker
        }
        return size, nil
    }
    serveContent(w, req, name, modtime, sizeFunc, content)
  }

func ServeContent calls serveContent

func serveContent(w ResponseWriter, r *Request, name string, modtime time.Time, sizeFunc func() (int64, error), content io.ReadSeeker) {
    setLastModified(w, modtime)
    done, rangeReq := checkPreconditions(w, r, modtime)
    if done {
        return
    }

    code := StatusOK

    // If Content-Type isn't set, use the file's extension to find it, but
    // if the Content-Type is unset explicitly, do not sniff the type.
    ctypes, haveType := w.Header()["Content-Type"]
    var ctype string
    if !haveType {
        ctype = mime.TypeByExtension(filepath.Ext(name))
        if ctype == "" {
            // read a chunk to decide between utf-8 text and binary
            var buf [sniffLen]byte
            n, _ := io.ReadFull(content, buf[:])
            ctype = DetectContentType(buf[:n])
            _, err := content.Seek(0, io.SeekStart) // rewind to output whole file
            if err != nil {
                Error(w, "seeker can't seek", StatusInternalServerError)
                return
            }
        }
        w.Header().Set("Content-Type", ctype)
    } else if len(ctypes) > 0 {
        ctype = ctypes[0]
    }

    size, err := sizeFunc()
    if err != nil {
        Error(w, err.Error(), StatusInternalServerError)
        return
    }

    // handle Content-Range header.
    sendSize := size
    var sendContent io.Reader = content
    if size >= 0 {
        ranges, err := parseRange(rangeReq, size)
        if err != nil {
            if err == errNoOverlap {
                w.Header().Set("Content-Range", fmt.Sprintf("bytes */%d", size))
            }
            Error(w, err.Error(), StatusRequestedRangeNotSatisfiable)
            return
        }
        if sumRangesSize(ranges) > size {
            // The total number of bytes in all the ranges
            // is larger than the size of the file by
            // itself, so this is probably an attack, or a
            // dumb client. Ignore the range request.
            ranges = nil
        }
        switch {
        case len(ranges) == 1:
            // RFC 7233, Section 4.1:
            // "If a single part is being transferred, the server
            // generating the 206 response MUST generate a
            // Content-Range header field, describing what range
            // of the selected representation is enclosed, and a
            // payload consisting of the range.
            // ...
            // A server MUST NOT generate a multipart response to
            // a request for a single range, since a client that
            // does not request multiple parts might not support
            // multipart responses."
            ra := ranges[0]
            if _, err := content.Seek(ra.start, io.SeekStart); err != nil {
                Error(w, err.Error(), StatusRequestedRangeNotSatisfiable)
                return
            }
            sendSize = ra.length
            code = StatusPartialContent
            w.Header().Set("Content-Range", ra.contentRange(size))
        case len(ranges) > 1:
            sendSize = rangesMIMESize(ranges, ctype, size)
            code = StatusPartialContent

            pr, pw := io.Pipe()
            mw := multipart.NewWriter(pw)
            w.Header().Set("Content-Type", "multipart/byteranges; boundary="+mw.Boundary())
            sendContent = pr
            defer pr.Close() // cause writing goroutine to fail and exit if CopyN doesn't finish.
            go func() {
                for _, ra := range ranges {
                    part, err := mw.CreatePart(ra.mimeHeader(ctype, size))
                    if err != nil {
                        pw.CloseWithError(err)
                        return
                    }
                    if _, err := content.Seek(ra.start, io.SeekStart); err != nil {
                        pw.CloseWithError(err)
                        return
                    }
                    if _, err := io.CopyN(part, content, ra.length); err != nil {
                        pw.CloseWithError(err)
                        return
                    }
                }
                mw.Close()
                pw.Close()
            }()
        }

        w.Header().Set("Accept-Ranges", "bytes")
        if w.Header().Get("Content-Encoding") == "" {
            w.Header().Set("Content-Length", strconv.FormatInt(sendSize, 10))
        }
    }

    w.WriteHeader(code)

    if r.Method != "HEAD" {
        io.CopyN(w, sendContent, sendSize)
    }
}

the third line of the function calls

done, rangeReq := checkPreconditions(w, r, modtime)

checkPreconditions() then calls checkIfNoneMatch()

checkIfNoneMatch() at this point the 'w ResponseWriter' has no reference to the actual file so it's not possible to read the tag of the file. It will always be empty string ''.

You see the serveContent() last input parameter is content io.ReadSeeker and the w ResponseWriter does not have any reference to it prior to calling checkIfNoneMatch() so its not possible to get the etag of the file to compare.

func checkIfNoneMatch(w ResponseWriter, r *Request) condResult {
    inm := r.Header.get("If-None-Match")
    if inm == "" {
        return condNone
    }
    buf := inm
    for {
        buf = textproto.TrimString(buf)
        if len(buf) == 0 {
            break
        }
        if buf[0] == ',' {
            buf = buf[1:]
            continue
        }
        if buf[0] == '*' {
            return condFalse
        }
        etag, remain := scanETag(buf)
        if etag == "" {
            break
        }
        if etagWeakMatch(etag, w.Header().get("Etag")) {
            return condFalse
        }
        buf = remain
    }
    return condTrue
}

What did you expect to see?

If the file has not been modified and the etag matches a 304 (Not Modified) should be returned.

What did you see instead?

Instead ServeContent() returns the file because of the internal etag checking bug.

chrispassas commented 4 years ago

It looks like this bug goes all the way back to when ServeContent was added 2/9/2012, 6:02:06 PM by @bradfitz

chrispassas commented 4 years ago

On further review I've discovered the issue. In my own example the etag was being sent back from the server without double quotes around it.

It appears the Go http package will not work without the double quotes. I believe this is in spec for the RFC so I'm closing the issue.

For what its worth I think it would make sense for Go to support etags with/without double quotes.

davecheney commented 4 years ago

For what its worth I think it would make sense for Go to support etags with/without double quotes.

@chrispassas please consider opening a new issue to discuss this.