goccy / bigquery-emulator

BigQuery emulator server implemented in Go
MIT License
795 stars 105 forks source link

Deleting dataset doesn't delete tables (and creating a table that exists hangs forever) #341

Open coxley opened 1 month ago

coxley commented 1 month ago

What happened?

It's common in our tests to bootstrap a dataset, let library code create as many tables as needed, then cleanup the dataset after the test finishes. Removing a dataset should remove all tables and other resources inside it.

The emulator doesn't appear to do this. Additionally, the errors returned from the emulator are 500 which the Google SDKs (at least for Go) retry indefinitely on. This is likely the cause of the deadlock: source / source

What did you expect to happen?

Two things:

How can we reproduce it (as minimally and precisely as possible)?

cd $(mktemp -d)
go mod init test
# Copy below test file to main_test.go
go mod tidy
go test .
package main

import (
    "context"
    "sync"
    "testing"

    "cloud.google.com/go/bigquery"
    "github.com/stretchr/testify/require"
    "github.com/testcontainers/testcontainers-go"
    "github.com/testcontainers/testcontainers-go/wait"
    "google.golang.org/api/option"
)

const (
    bqPort      = "9050/tcp"
    testProject = "project"
)

// startBigQuery spins up a test container and blocks until it's ready. Only one
// container can be started per test binary.
//
// Sharing a container across tests improves suite duration significantly
var startBigQuery = sync.OnceValues(func() (testcontainers.Container, error) {
    ctx := context.Background()
    return testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
        ContainerRequest: testcontainers.ContainerRequest{
            Image: "ghcr.io/goccy/bigquery-emulator:latest",
            Cmd: []string{
                "--project=" + testProject,
            },
            ExposedPorts: []string{bqPort},
            WaitingFor:   wait.ForExposedPort(),
        },
        Started: true,
    })
})

func TestBasic(t *testing.T) {
    t.Run("one", func(t *testing.T) {
        testHelper(t)
    })

    t.Run("two", func(t *testing.T) {
        testHelper(t)
    })
}

func testHelper(t *testing.T) {
    ctx := context.Background()

    // Startup emulator and create client
    container, err := startBigQuery()
    require.NoError(t, err)

    addr, err := container.Endpoint(ctx, "")
    require.NoError(t, err)

    endpoint := "http://" + addr
    client, err := bigquery.NewClient(
        ctx,
        testProject,
        option.WithoutAuthentication(),
        option.WithEndpoint(endpoint),
    )
    require.NoError(t, err)

    // Create dataset and table within
    ds := client.Dataset("main")
    err = ds.Create(ctx, nil)
    require.NoError(t, err)

    // Cleanup entire dataset on test finish
    t.Cleanup(func() {
        require.NoError(t, ds.Delete(ctx))
    })

    t.Log("creating table inside dataset")
    table := ds.Table("data")
    err = table.Create(ctx, &bigquery.TableMetadata{
        Schema: bigquery.Schema{
            &bigquery.FieldSchema{Name: "value", Type: bigquery.NumericFieldType},
        },
    })
    require.NoError(t, err)

    // This prevents the test from deadlocking indefinitely, but removing the entire
    // dataset should be enough.
    //
    // Either that, or returning an appropriate error.
    //
    // t.Cleanup(func() {
    //  require.NoError(t, table.Delete(ctx))
    // })
    t.Log("table created")
}

func ignoreErr[T any](v T, err error) T {
    if err != nil {
        panic(err)
    }
    return v
}

Anything else we need to know?

No response

ohaibbq commented 1 month ago

Does it work if you use dataset.DeleteWithContents()? Tables are not cleared from datasets during deletion by default.

I agree that the internalError response status causes lots of headaches with client library retries.

coxley commented 1 month ago

@ohaibbq: Oh dang, I completely missed that in the documentation. Thank you very much.

I guess my real issue is with internalError then.