google / go-cloud

The Go Cloud Development Kit (Go CDK): A library and tools for open cloud development in Go.
https://gocloud.dev/
Apache License 2.0

blob: `Bucket.Upload()` does not compute md5 #3477

Status: Closed (by myaaaaaaaaa, 3 months ago)

myaaaaaaaaa commented 3 months ago

Describe the bug

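Bucket.Upload() never writes the uploaded data to its internal MD5 checksummer, so any upload that sets WriterOptions.ContentMD5 fails with an error saying the supplied checksum does not match md5(""), the hash of empty input. When ContentMD5 is not set, the upload succeeds, but the stored MD5 is still md5("").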

To Reproduce

package main

import (
    "bytes"
    "context"
    "crypto/md5"
    "io"
    "testing"

    "gocloud.dev/blob"
    "gocloud.dev/blob/memblob"
)

func TestMD5(t *testing.T) {
    bucket := memblob.OpenBucket(nil)
    defer bucket.Close()

    var buf bytes.Buffer
    buf.WriteString("foobar")

    actualMD5 := md5.Sum(buf.Bytes())
    emptyMD5 := md5.Sum([]byte{})

    err := bucket.Upload(context.Background(), "foo", &buf, &blob.WriterOptions{
        ContentType: "text/plain",
        ContentMD5:  actualMD5[:], // Comment this line out to see the stored checksum, which is also equal to emptyMD5
    })
    if err != nil {
        t.Error(err)
    }

    iter := bucket.List(nil)
    for {
        obj, err := iter.Next(context.Background())
        if err == io.EOF {
            break
        }
        if err != nil {
            t.Fatal(err) // stop here; calling iter.Next again after an error could loop forever
        }
        t.Logf("%X - stored md5 of %s", obj.MD5, obj.Key)
    }

    t.Logf("%X - actual md5", actualMD5)
    t.Logf("%X - empty md5", emptyMD5)
}

Output:

=== RUN   TestMD5
    hello_test.go:29: blob: the WriterOptions.ContentMD5 you specified
        (3858F62230AC3C915F300C664312C63F) did not match what was written
        (D41D8CD98F00B204E9800998ECF8427E) (code=FailedPrecondition)
    hello_test.go:44: 3858F62230AC3C915F300C664312C63F - actual md5
    hello_test.go:45: D41D8CD98F00B204E9800998ECF8427E - empty md5
--- FAIL: TestMD5 (0.00s)
FAIL

Output with the ContentMD5 line commented out:

=== RUN   TestMD5
    hello_test.go:40: D41D8CD98F00B204E9800998ECF8427E - stored md5 of foo
    hello_test.go:43: 3858F62230AC3C915F300C664312C63F - actual md5
    hello_test.go:44: D41D8CD98F00B204E9800998ECF8427E - empty md5
--- PASS: TestMD5 (0.00s)
PASS

Expected behavior

 === RUN   TestMD5
-    hello_test.go:40: D41D8CD98F00B204E9800998ECF8427E - stored md5 of foo
+    hello_test.go:40: 3858F62230AC3C915F300C664312C63F - stored md5 of foo
     hello_test.go:43: 3858F62230AC3C915F300C664312C63F - actual md5
     hello_test.go:44: D41D8CD98F00B204E9800998ECF8427E - empty md5
 --- PASS: TestMD5 (0.00s)
 PASS

Version

v0.39.0

vangent commented 3 months ago

Thanks for the report, you are right.

I've disabled the Upload optimization when WriterOptions.ContentMD5 is set.