googleapis / google-cloud-go

Google Cloud Client Libraries for Go.
https://cloud.google.com/go/docs/reference
Apache License 2.0

storage: timeout for /computeMetadata/v1/universe/universe_domain in storage.NewClient #9350

Closed jaimem88 closed 8 months ago

jaimem88 commented 8 months ago

Environment details

Steps to reproduce

The latest update is breaking our pipeline when we try to update the library: https://gitlab.com/gitlab-org/container-registry/-/merge_requests/1569.

One of our integration tests keeps failing as seen in this CI job.

panic: Error creating storage client: dialing: Get "http://169.254.169.254/computeMetadata/v1/universe/universe_domain": dial tcp 169.254.169.254:80: i/o timeout
goroutine 1 [running]:
github.com/docker/distribution/registry/storage/driver/gcs.init.1()
    /builds/gitlab-org/container-registry/registry/storage/driver/gcs/gcs_test.go:87 +0x8c8
FAIL    github.com/docker/distribution/registry/storage/driver/gcs  12.910s
=== Failed
=== FAIL: registry/storage/driver/gcs  (0.00s)
panic: Error creating storage client: dialing: Get "http://169.254.169.254/computeMetadata/v1/universe/universe_domain": dial tcp 169.254.169.254:80: i/o timeout
goroutine 1 [running]:
github.com/docker/distribution/registry/storage/driver/gcs.init.1()
    /builds/gitlab-org/container-registry/registry/storage/driver/gcs/gcs_test.go:87 +0x8c8
FAIL    github.com/docker/distribution/registry/storage/driver/gcs  12.910s
DONE 0 tests, 1 failure in 81.895s
exit status 1

This seems suspiciously related to https://github.com/googleapis/google-api-go-client/commit/b21a1fa29bb072f7063dfe6e2fe81ac7b8bb5932. I am not sure where the hostname is coming from but it seems that the DefaultEndpoint is not being set properly.

quartzmo commented 8 months ago

@jaimem88 Thank you for reporting this bug. I will try to reproduce it today.

quartzmo commented 8 months ago

@jaimem88 I have been unable to reproduce it so far.

The request that times out according to your logs is to the Compute Metadata Server. It should be made only if the default credentials are determined to come from the Compute Metadata Server. Judging from static inspection alone (i.e., reading the code), the bug appears to be in this line.

quartzmo commented 8 months ago

Do you know if your CI runs on GCE and if so, exactly which GCE-based product? (GKE, Cloud Run?). It seems metadata.OnGCE() is returning true but the Metadata Server endpoint is timing out.

quartzmo commented 8 months ago

Can you provide the full go.mod settings?

jaimem88 commented 8 months ago

Thanks for looking into this @quartzmo!

Can you provide the full go.mod settings?

Here are the full go.mod and go.sum files.

module github.com/docker/distribution

go 1.18

require (
    cloud.google.com/go/storage v1.37.0
    github.com/Azure/azure-sdk-for-go v68.0.0+incompatible
    github.com/DATA-DOG/go-sqlmock v1.5.2
    github.com/Shopify/toxiproxy/v2 v2.7.0
    github.com/alicebob/miniredis/v2 v2.31.1
    github.com/aws/aws-sdk-go v1.46.7
    github.com/benbjohnson/clock v1.3.5
    github.com/cenkalti/backoff/v4 v4.2.1
    github.com/denverdino/aliyungo v0.0.0-20230411124812-ab98a9173ace
    github.com/docker/go-metrics v0.0.1
    github.com/docker/libtrust v0.0.0-20160708172513-aabc10ec26b7
    github.com/eko/gocache/lib/v4 v4.1.5
    github.com/eko/gocache/store/redis/v4 v4.2.0
    github.com/getsentry/sentry-go v0.26.0
    github.com/go-redis/redismock/v9 v9.0.3
    github.com/golang/mock v1.6.0
    github.com/gorilla/handlers v1.5.2
    github.com/gorilla/mux v1.8.1
    github.com/hashicorp/go-multierror v1.1.1
    github.com/jackc/pgerrcode v0.0.0-20220416144525-469b46aa5efa
    github.com/jackc/pgx/v5 v5.5.2
    github.com/jszwec/csvutil v1.9.0
    github.com/mitchellh/mapstructure v1.5.0
    github.com/ncw/swift v1.0.53
    github.com/olekukonko/tablewriter v0.0.5
    github.com/opencontainers/go-digest v1.0.0
    github.com/opencontainers/image-spec v1.0.2
    github.com/prometheus/client_golang v1.18.0
    github.com/redis/go-redis/v9 v9.4.0
    github.com/rubenv/sql-migrate v1.5.2
    github.com/schollz/progressbar/v3 v3.14.1
    github.com/sirupsen/logrus v1.9.3
    github.com/spf13/cobra v1.8.0
    github.com/spf13/viper v1.18.2
    github.com/stretchr/testify v1.8.4
    github.com/trim21/go-redis-prometheus v0.0.0
    github.com/vmihailenco/msgpack/v5 v5.4.1
    github.com/xanzy/go-gitlab v0.96.0
    gitlab.com/gitlab-org/labkit v1.21.0
    go.uber.org/automaxprocs v1.5.3
    golang.org/x/crypto v0.18.0
    golang.org/x/oauth2 v0.16.0
    golang.org/x/sync v0.6.0
    golang.org/x/time v0.5.0
    google.golang.org/api v0.161.0
    gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c
    gopkg.in/yaml.v2 v2.4.0
)

require (
    cloud.google.com/go v0.112.0 // indirect
    cloud.google.com/go/compute v1.23.3 // indirect
    cloud.google.com/go/compute/metadata v0.2.3 // indirect
    cloud.google.com/go/iam v1.1.5 // indirect
    cloud.google.com/go/profiler v0.1.0 // indirect
    github.com/Azure/go-autorest v14.2.0+incompatible // indirect
    github.com/Azure/go-autorest/autorest v0.11.29 // indirect
    github.com/Azure/go-autorest/autorest/adal v0.9.23 // indirect
    github.com/Azure/go-autorest/autorest/date v0.3.0 // indirect
    github.com/Azure/go-autorest/autorest/to v0.4.0 // indirect
    github.com/Azure/go-autorest/logger v0.2.1 // indirect
    github.com/Azure/go-autorest/tracing v0.6.0 // indirect
    github.com/alicebob/gopher-json v0.0.0-20200520072559-a9ecdc9d1d3a // indirect
    github.com/beorn7/perks v1.0.1 // indirect
    github.com/cespare/xxhash/v2 v2.2.0 // indirect
    github.com/client9/reopen v1.0.0 // indirect
    github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
    github.com/dgryski/go-rendezvous v0.0.0-20200823014737-9f7001d12a5f // indirect
    github.com/dnaeon/go-vcr v1.0.1 // indirect
    github.com/felixge/httpsnoop v1.0.4 // indirect
    github.com/fsnotify/fsnotify v1.7.0 // indirect
    github.com/go-gorp/gorp/v3 v3.1.0 // indirect
    github.com/go-logr/logr v1.4.1 // indirect
    github.com/go-logr/stdr v1.2.2 // indirect
    github.com/gofrs/uuid v4.4.0+incompatible // indirect
    github.com/golang-jwt/jwt/v4 v4.5.0 // indirect
    github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect
    github.com/golang/protobuf v1.5.3 // indirect
    github.com/google/go-querystring v1.1.0 // indirect
    github.com/google/pprof v0.0.0-20210804190019-f964ff605595 // indirect
    github.com/google/s2a-go v0.1.7 // indirect
    github.com/google/uuid v1.5.0 // indirect
    github.com/googleapis/enterprise-certificate-proxy v0.3.2 // indirect
    github.com/googleapis/gax-go/v2 v2.12.0 // indirect
    github.com/hashicorp/errwrap v1.1.0 // indirect
    github.com/hashicorp/go-cleanhttp v0.5.2 // indirect
    github.com/hashicorp/go-retryablehttp v0.7.2 // indirect
    github.com/hashicorp/hcl v1.0.0 // indirect
    github.com/inconshreveable/mousetrap v1.1.0 // indirect
    github.com/jackc/pgpassfile v1.0.0 // indirect
    github.com/jackc/pgservicefile v0.0.0-20221227161230-091c0ba34f0a // indirect
    github.com/jackc/puddle/v2 v2.2.1 // indirect
    github.com/jmespath/go-jmespath v0.4.0 // indirect
    github.com/kr/pretty v0.3.1 // indirect
    github.com/kr/text v0.2.0 // indirect
    github.com/magiconair/properties v1.8.7 // indirect
    github.com/mattn/go-runewidth v0.0.12 // indirect
    github.com/matttproud/golang_protobuf_extensions/v2 v2.0.0 // indirect
    github.com/mitchellh/colorstring v0.0.0-20190213212951-d06e56a500db // indirect
    github.com/oklog/ulid/v2 v2.0.2 // indirect
    github.com/opentracing/opentracing-go v1.2.0 // indirect
    github.com/pelletier/go-toml/v2 v2.1.0 // indirect
    github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect
    github.com/prometheus/client_model v0.5.0 // indirect
    github.com/prometheus/common v0.45.0 // indirect
    github.com/prometheus/procfs v0.12.0 // indirect
    github.com/rivo/uniseg v0.4.4 // indirect
    github.com/rogpeppe/go-internal v1.10.0 // indirect
    github.com/sagikazarmark/locafero v0.4.0 // indirect
    github.com/sagikazarmark/slog-shim v0.1.0 // indirect
    github.com/sebest/xff v0.0.0-20210106013422-671bd2870b3a // indirect
    github.com/sourcegraph/conc v0.3.0 // indirect
    github.com/spf13/afero v1.11.0 // indirect
    github.com/spf13/cast v1.6.0 // indirect
    github.com/spf13/pflag v1.0.5 // indirect
    github.com/subosito/gotenv v1.6.0 // indirect
    github.com/vmihailenco/tagparser/v2 v2.0.0 // indirect
    github.com/yuin/gopher-lua v1.1.0 // indirect
    go.opencensus.io v0.24.0 // indirect
    go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.47.0 // indirect
    go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.47.0 // indirect
    go.opentelemetry.io/otel v1.22.0 // indirect
    go.opentelemetry.io/otel/metric v1.22.0 // indirect
    go.opentelemetry.io/otel/trace v1.22.0 // indirect
    go.uber.org/atomic v1.9.0 // indirect
    go.uber.org/multierr v1.9.0 // indirect
    golang.org/x/exp v0.0.0-20230905200255-921286631fa9 // indirect
    golang.org/x/net v0.20.0 // indirect
    golang.org/x/sys v0.16.0 // indirect
    golang.org/x/term v0.16.0 // indirect
    golang.org/x/text v0.14.0 // indirect
    google.golang.org/appengine v1.6.8 // indirect
    google.golang.org/genproto v0.0.0-20240116215550-a9fa1716bcac // indirect
    google.golang.org/genproto/googleapis/api v0.0.0-20240122161410-6c6643bf1457 // indirect
    google.golang.org/genproto/googleapis/rpc v0.0.0-20240116215550-a9fa1716bcac // indirect
    google.golang.org/grpc v1.60.1 // indirect
    google.golang.org/protobuf v1.32.0 // indirect
    gopkg.in/ini.v1 v1.67.0 // indirect
    gopkg.in/yaml.v3 v3.0.1 // indirect
)

Do you know if your CI runs on GCE and if so, exactly which GCE-based product? (GKE, Cloud Run?). It seems metadata.OnGCE() is returning true but the Metadata Server endpoint is timing out.

I believe our gitlab-runner runs on GKE, and the job is executed from a golang:1.21 container.

For reference, the same job runs fine on the previous version of the library, v0.157.0: https://gitlab.com/gitlab-org/container-registry/-/jobs/6058207860.

quartzmo commented 8 months ago

@jaimem88 Are you still seeing this issue? If so, would it be possible for you to run the following curl commands in your GKE environment in order to test if the failing request is transient or consistent?

The commands are almost identical; the only difference is IP address versus domain name. As you can see from the IP address, the URL in your logs is correct. The request is not supposed to time out, but the rollout of this feature may be inconsistent. On the other hand, the timeout might have been a rare transient issue, such as a restart. Thank you!

curl -H "Metadata-Flavor:Google" http://169.254.169.254/computeMetadata/v1/universe/universe_domain
curl -H "Metadata-Flavor:Google" http://metadata.google.internal/computeMetadata/v1/universe/universe_domain

quartzmo commented 8 months ago

@jaimem88 Please upgrade to https://github.com/googleapis/google-api-go-client/releases/tag/v0.162.0 if possible, and let us know if the error is resolved.

jaimem88 commented 8 months ago

Thank you @quartzmo! The latest version works as before https://gitlab.com/gitlab-org/container-registry/-/jobs/6111764468 🎉

clerk-zach commented 7 months ago

We are experiencing this issue when upgrading logging from 0.7.0 to 0.9.0.

codyoss commented 7 months ago

@clerk-zach Did you try the above workaround?