googleapis / google-cloud-go

Google Cloud Client Libraries for Go.
https://cloud.google.com/go/docs/reference
Apache License 2.0

pubsub: deadline exceeds when behind VPN #11011

Open boranby opened 1 month ago

boranby commented 1 month ago

Client

PubSub

Environment

Redhat Enterprise Linux (no container)

Code and Dependencies

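package main

import (
	"log"

	"cloud.google.com/go/pubsub"
	"google.golang.org/protobuf/proto"
)

func main() {
	// ... (ctx, projectID, topicID, and message are set up here)
	client, err := pubsub.NewClient(ctx, projectID)
	if err != nil {
		log.Fatalf("pubsub: NewClient: %v", err)
	}
	defer client.Close()
	// ...
	t := client.Topic(topicID)
	// ...
	msg, err := proto.Marshal(message)
	if err != nil {
		log.Fatalf("proto: Marshal: %v", err)
	}
	result := t.Publish(ctx, &pubsub.Message{
		Data: msg,
	})

	// Get blocks until the publish completes or ctx is done.
	id, err := result.Get(ctx)
	if err != nil {
		// main has no return value, so log and exit instead of returning an error.
		log.Fatalf("pubsub: Get: %v", err)
	}
	log.Printf("Published a message; msg ID: %v\n", id)
}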
go.mod

```text
module modname

go 1.23.0

require (
	cloud.google.com/go/pubsub v1.44.0
	google.golang.org/protobuf v1.35.1
)
```

Expected behavior

It shouldn't matter to the Pub/Sub client that the machine it runs on has an open VPN connection. It should publish without any issues.

Actual behavior

If I enable the VPN to access a private network from the machine, Pub/Sub publish returns a context deadline exceeded error. When I close the VPN, it publishes and I get the ID of the published message. With the VPN on, it can create the client, get the Topic, and list the topics. However, it gets stuck at result.Get(ctx).

hongalex commented 1 month ago

Thanks for filing this issue.

  1. Is this something that has always happened, or did it start happening with v1.44.0?
  2. Just to clarify, are you actually calling c.Topics in your code when you say you "list the topics"? I couldn't find it in the snippet you posted above, and I wonder if the first API call being made is actually Publish.
  3. Can you test whether this happens with any other gRPC-based library (e.g. Firestore or any other Go library in this repo)?

boranby commented 1 month ago

Hi @hongalex , thanks for your quick response.

  1. I started development with v1.44.0; I haven't tried previous versions.
  2. You are right. It can't list the topics either. I had commented out the list-topics section and remembered it wrong; that call also gets stuck.
  3. I tried this example: https://github.com/GoogleCloudPlatform/golang-samples/blob/main/firestore/increment.go It also gets stuck, at the dc.Update line.

hongalex commented 1 month ago

Hm, given that this issue occurs with other products like Firestore, this likely isn't an issue with our library specifically. I would recommend checking your VPN settings to narrow down the issue: is it specific to gRPC, or do HTTP calls also time out? Is it something to do with the VPN connection itself?

ekartsev commented 4 weeks ago

I'm seeing something similar.

> Is this something that has always happened, or started happening v1.44.0?

For us, v1.44.0 is when it started happening.

I double-checked v1.43.0 and it works as expected. However, after upgrading to v1.44.0, messages never seem to reach the Receive() callback:

err := sub.Receive(ctx, func(ctx context.Context, message *pubsub.Message) {
    // this callback is never called
})

Another interesting observation is:

exists, err := sub.Exists(ctx)

^ that returns true in v1.43.0 and false in v1.44.0.

hongalex commented 4 weeks ago

@ekartsev just to confirm, you're also encountering issues with a VPN enabled?

ekartsev commented 4 weeks ago

No, in my case the service is running in AWS. I'm not sure about the actual network layout there, but it's probably a VPC with a firewall, etc.

It doesn't explain why 1.43.0 works, though 🤷

hongalex commented 4 weeks ago

Hm, weird. Our integration tests haven't caught anything that broke Receive or Exists in 1.44.

Since it's not clear if the issue is with a firewall, and you're not explicitly seeing deadline exceeded, can you create a new issue for your problem?

On that issue, could you also include:

  1. Whether calling other methods (such as Topic.Config) results in an error
  2. Whether this is happening with other gRPC-based libraries besides Pub/Sub

zachbadgett commented 4 weeks ago

I had a similar issue with another library. In my case, it was caused by internaloption.EnableNewAuthLibrary() being added to the default GRPC options. Setting the ENV GOOGLE_API_GO_EXPERIMENTAL_DISABLE_NEW_AUTH_LIB to true fixed it.

codyoss commented 4 weeks ago

@zachbadgett do you have a minimal reproducer? In your case is there any special networking?

dkstyle0 commented 4 weeks ago

> I had a similar issue with another library. In my case, it was caused by internaloption.EnableNewAuthLibrary() being added to the default GRPC options. Setting the ENV GOOGLE_API_GO_EXPERIMENTAL_DISABLE_NEW_AUTH_LIB to true fixed it.

Adding this environment variable to our service fixed the issue as well. Maybe we are missing some transitive dependency that makes this work, but for us 1.44+ doesn't work without it.

zachbadgett commented 4 weeks ago

@codyoss yes, there are egress restrictions. I figure it's most likely caused by a new URL being hit that isn't allowed yet; I just haven't spent the time to figure it out.

codyoss commented 4 weeks ago

@zachbadgett If you do find out, please share!

Ramsey-B commented 3 weeks ago

I encountered this same issue with the cloud.google.com/go/kms/apiv1 KeyManagementClient. Setting GOOGLE_API_GO_EXPERIMENTAL_DISABLE_NEW_AUTH_LIB=true also resolved the issue.

hongalex commented 2 weeks ago

For those using Pub/Sub, can you try upgrading to the latest version, 1.45.1, and see if the issue still occurs?