fsouza / fake-gcs-server

Google Cloud Storage emulator & testing library.
https://pkg.go.dev/github.com/fsouza/fake-gcs-server/fakestorage?tab=doc
BSD 2-Clause "Simplified" License

Unable to access uploaded file (go + testcontainers) #982

Open rtrzebinski-usc opened 1 year ago

rtrzebinski-usc commented 1 year ago

Hi, here is my issue with the emulator: I use testcontainers for an integration test. Uploading works fine (I can see the file in the container), but I can't access it via HTTP or from code. Here is a script to reproduce:

package fsouza

import (
    "bytes"
    "cloud.google.com/go/storage"
    "context"
    "fmt"
    "github.com/docker/go-connections/nat"
    "github.com/stretchr/testify/assert"
    "github.com/testcontainers/testcontainers-go"
    "github.com/testcontainers/testcontainers-go/wait"
    "google.golang.org/api/option"
    "io"
    "log"
    "os"
    "testing"
    "time"
)

func TestUpload(t *testing.T) {
    ctx := context.Background()

    var port = "4443/tcp"
    req := testcontainers.GenericContainerRequest{
        ContainerRequest: testcontainers.ContainerRequest{
            Image:        "fsouza/fake-gcs-server",
            ExposedPorts: []string{port},
            Cmd:          []string{"-scheme", "http", "-port", "4443", "-public-host", "localhost:4443"},
            WaitingFor: wait.ForAll(
                wait.ForLog("server started at"),
            ),
        },
        Started: true,
    }
    container, err := testcontainers.GenericContainer(ctx, req)
    if err != nil {
        t.Fatal(err)
    }
    // Defer only after the error check: container may be nil on failure.
    defer container.Terminate(ctx)

    assert.True(t, container.IsRunning(), "container is not running")

    mappedPort, err := container.MappedPort(ctx, nat.Port(port))
    if err != nil {
        log.Fatal(err)
    }

    log.Println("gcs container ready and running at port: ", mappedPort.Port())

    // Set STORAGE_EMULATOR_HOST environment variable.
    err = os.Setenv("STORAGE_EMULATOR_HOST", fmt.Sprintf("%s:%s", "http://localhost", mappedPort.Port()))
    if err != nil {
        log.Fatalf("os.Setenv: %v", err)
    }

    endpoint := fmt.Sprintf("http://localhost:%s/storage/v1/", mappedPort.Port())
    log.Println("Endpoint:", endpoint)

    s, err := storage.NewClient(ctx, option.WithEndpoint(endpoint))
    if err != nil {
        // Stop here: the client is unusable if creation failed.
        t.Fatal(err)
    }

    // file to be uploaded
    var buffer bytes.Buffer
    buffer.WriteString("foo bar")
    bucketName := "foo"
    objectName := "bar"

    bucket := s.Bucket(bucketName)
    if err := bucket.Create(ctx, "abc", nil); err != nil {
        log.Printf("Failed to create bucket: %v", err)
    }

    o := s.Bucket(bucketName).Object(objectName)

    wc := o.NewWriter(ctx)

    written, err := io.Copy(wc, &buffer)

    if err != nil {
        log.Printf("failed to upload file, %v", err)
    }

    fmt.Printf("uploader - written: %v\n", written)

    err = wc.Close()

    if err != nil {
        t.Error(err)
    }

    log.Println("Creating reader")

    rd, err := bucket.Object(objectName).NewReader(ctx)

    if err != nil {
        log.Printf("failed to create reader, %v", err)
    }

    println("sleeping..")
    time.Sleep(1000 * time.Second)

    res, err := io.ReadAll(rd)

    if err != nil {
        t.Errorf("failed to read, %v", err)
    }

    fmt.Print(res)
}

When you run it, it prints the base URL, e.g. http://localhost:49405/storage/v1. If I append the bucket and object name and request http://localhost:49405/storage/v1/foo/bar, I get a 404.

Since the script is paused by time.Sleep(1000 * time.Second), I can inspect the container. This is what I see:

invoice-srv $ docker ps
CONTAINER ID   IMAGE                       COMMAND                  CREATED          STATUS          PORTS                                         NAMES
ca2561b006f1   fsouza/fake-gcs-server      "/bin/fake-gcs-serve…"   24 seconds ago   Up 22 seconds   0.0.0.0:49407->4443/tcp, :::49402->4443/tcp   eloquent_cohen
e495202a08a7   testcontainers/ryuk:0.3.4   "/app"                   27 seconds ago   Up 25 seconds   0.0.0.0:49406->8080/tcp, :::49401->8080/tcp   condescending_gould
invoice-srv $ docker exec -it ca2561b006f1 sh
/ # ls /storage/foo/bar
/storage/foo/bar
/ # cat /storage/foo/bar
foo bar/ # exit
invoice-srv $ docker logs ca2561b006f1
time="2022-11-06T13:51:13Z" level=info msg="couldn't load any objects or buckets from \"/data\", starting empty"
time="2022-11-06T13:51:13Z" level=info msg="server started at http://[::]:4443"
time="2022-11-06T13:51:16Z" level=info msg="172.17.0.1 - - [06/Nov/2022:13:51:16 +0000] \"GET /storage/v1/ HTTP/1.1\" 404 10"
time="2022-11-06T13:51:16Z" level=info msg="172.17.0.1 - - [06/Nov/2022:13:51:16 +0000] \"GET /favicon.ico HTTP/1.1\" 404 10"
time="2022-11-06T13:51:19Z" level=info msg="172.17.0.1 - - [06/Nov/2022:13:51:19 +0000] \"POST /storage/v1/b?alt=json&prettyPrint=false&project=abc HTTP/1.1\" 200 131"
time="2022-11-06T13:51:19Z" level=info msg="172.17.0.1 - - [06/Nov/2022:13:51:19 +0000] \"POST /upload/storage/v1/b/foo/o?alt=json&name=bar&prettyPrint=false&projection=full&uploadType=multipart HTTP/1.1\" 200 442"
time="2022-11-06T13:51:19Z" level=info msg="172.17.0.1 - - [06/Nov/2022:13:51:19 +0000] \"GET /foo/bar HTTP/1.1\" 404 10"
time="2022-11-06T13:51:20Z" level=info msg="172.17.0.1 - - [06/Nov/2022:13:51:20 +0000] \"GET /storage/v1/foo/bar HTTP/1.1\" 404 10"

What am I doing wrong? Any help will be appreciated, thank you :)

rtrzebinski-usc commented 1 year ago

I found out that the URL to download the file is http://localhost:49409/download/storage/v1/b/foo/o/bar. But how do I check from code (with the reader) that the file was uploaded, without getting the error "failed to create reader, storage: object doesn't exist"?
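
A small sketch of how that download URL could be assembled in a test, based only on the URL shape reported above. The helper name downloadURL is hypothetical, and escaping the bucket and object names with url.PathEscape is an assumption about what the emulator accepts. The base must use the mapped port, not the container's internal 4443.

```go
package main

import (
	"fmt"
	"net/url"
)

// downloadURL is a hypothetical helper that builds a URL of the shape
// <base>/download/storage/v1/b/<bucket>/o/<object>.
// Escaping the names is an assumption; plain names like "foo" are unchanged.
func downloadURL(base, bucket, object string) string {
	return fmt.Sprintf("%s/download/storage/v1/b/%s/o/%s",
		base, url.PathEscape(bucket), url.PathEscape(object))
}

func main() {
	// Base URL built from the mapped port printed by the test.
	fmt.Println(downloadURL("http://localhost:49409", "foo", "bar"))
}
```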

dezyh commented 1 year ago

I'm also able to reproduce the above (with my own implementation). The issue is related to the -public-host argument not being set.

If I adapt the go example to create an object and then read that object, this only works if -public-host localhost:8080 is set (ci script). Without it, I also get storage: object doesn't exist when executing it.

Solution

- Cmd:          []string{"-scheme", "http", "-port", "4443", "-public-host", "localhost:4443"}
+ Cmd:          []string{"-scheme", "http", "-port", "4443", "-public-host", "0.0.0.0"}

Journey

Exploring the source code, you can see that one of downloadObject's MatcherFuncs uses the publicHostMatcher (which uses the public-host argument and publicHost config). https://github.com/fsouza/fake-gcs-server/blob/071372eda6e3cfd4db175a83ef4fa6cb75f024de/fakestorage/server.go#L274-L277

So taking a look at this MatcherFunc, it will match requests only for a specific port if one is given, but falls back to matching any port if none is given. https://github.com/fsouza/fake-gcs-server/blob/071372eda6e3cfd4db175a83ef4fa6cb75f024de/fakestorage/server.go#L293-L300
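
To make that rule concrete, here is a minimal sketch of the matching behaviour as described above. It is not the library's actual code: hostMatches is a hypothetical name, and the use of net.SplitHostPort to detect a port is my own assumption about how such a check could be written.

```go
package main

import (
	"fmt"
	"net"
)

// hostMatches is a hypothetical helper, not the library's code. It mirrors the
// rule described above: if publicHost includes a port, the request's Host
// header must equal it exactly; otherwise only the hostname is compared.
func hostMatches(publicHost, requestHost string) bool {
	if _, _, err := net.SplitHostPort(publicHost); err == nil {
		// publicHost carries a port: require an exact match.
		return requestHost == publicHost
	}
	// publicHost has no port: ignore any port on the request's Host header.
	host, _, err := net.SplitHostPort(requestHost)
	if err != nil {
		host = requestHost // no port on the request either
	}
	return host == publicHost
}

func main() {
	// -public-host localhost:4443, but the client dials a random mapped port.
	fmt.Println(hostMatches("localhost:4443", "localhost:49407")) // false
	// A public host without a port matches any port on that hostname.
	fmt.Println(hostMatches("localhost", "localhost:49407")) // true
	fmt.Println(hostMatches("0.0.0.0", "0.0.0.0:49407"))     // true
	// The hostname still has to agree with what the client sends.
	fmt.Println(hostMatches("0.0.0.0", "localhost:49407")) // false
}
```

Under this reading, a fixed port in -public-host cannot match when testcontainers maps the container port to a random host port, which is why dropping the port helps.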

dezyh commented 1 year ago

The documentation in the README mentions setting -public-host but doesn't mention that:

  1. It is required to set -public-host for downloadObject (just for pre-signed urls).
  2. It's possible to omit a port when specifying -public-host

It might be nice to document this, because it also cost me about 2 hours to work through everything (especially because I initially thought it was an implementation bug in my code).

fsouza commented 1 year ago

@dezyh thanks for digging into this. Since you did the investigation, do you also want to send a PR with improvements to the docs? I can do it if you prefer!

dezyh commented 1 year ago

I'll make a readme PR, just wanted to check that I wasn't missing something obvious.

fsouza commented 1 year ago

@dezyh oh, that sounds good. I'll dig into that sample code to see if there's a better fix, but yeah sounds like we need to clarify the role of public-host in the docs.

sourabhsparkala commented 1 year ago

@fsouza @dezyh I followed the instructions given here and I just ran the code given above and set the public-host to 0.0.0.0. But I still face the same issue. Am I missing anything?

2022/11/22 12:12:41 failed to create reader, storage: object doesn't exist
sleeping..

@rtrzebinski-usc were you able to run this test case successfully?

dezyh commented 1 year ago

Sorry, I was busy for a while.

I'm personally using dockertest and running containers inside a docker network. This works perfectly both locally and on GitHub Actions with the change from my comment above.

One small subtlety is that connecting through the network's hostname for the container (not sure on terminology) doesn't work in GitHub Actions. While this is partly dockertest specific, it might also apply to testcontainers. This does not work there:

CloudStorageEndpoint = fmt.Sprintf("%s:%s", cloudStorageContainer.Container.NetworkSettings.Networks["test"].IPAddress, CloudStoragePort)

I have to connect through a port that's bound to the default network (again, sorry, I don't know the correct terminology):

CloudStorageEndpoint = fmt.Sprintf("0.0.0.0:%s", cloudStorageContainer.GetPort(fmt.Sprintf("%s/tcp", CloudStoragePort)))