googleapis / google-cloud-go

Google Cloud Client Libraries for Go.
https://cloud.google.com/go/docs/reference
Apache License 2.0
3.72k stars 1.27k forks source link

hanging in pubsub.NewClient #856

Closed willie closed 6 years ago

willie commented 6 years ago

Last night, around 11:32 pm Central Standard Time, according to our stackdriver charts, my pubsub subscriber stopped receiving pubsub messages (on Linux). Up until this point, with your teams help on a previous issue, this program processed millions of messages. And then it stopped receiving them after weeks of constant processing. I thought perhaps that the publisher was down until I looked this morning.

The subscriber program was running, but it was just sitting. Thinking it was some kind of weird connection thing, I killed the program and started it back up again. It cranked up and sat, still processing no messages for minutes. I have a context handler that allows me to cancel with sigterm, so I did and this was the error I got back from the pubsub.NewClient call:

constructing pubsub client: context canceled

Thinking it was a credentials issue, I authed with those credentials locally (on macOS) and ran:

gcloud pubsub subscriptions pull _____

That returned an event successfully. So credentials are OK, and the subscription does have messages to process.

Thinking that perhaps the client libraries were out of date (I'm scrambling for anything here), I do this:

go get -u cloud.google.com/go/pubsub go get -u golang.org/x/net/context go get -u google.golang.org/grpc

I rebuild the subscriber program and run it again. Nothing. After a few minutes, I cancel and it returns a less descriptive error:

pubsub: context canceled

Any ideas? I'm dead in the water. I don't know what else to try and do, or why it stopped working when it did.

willie commented 6 years ago

Here's the call where it never returns on it's own:

client, err := pubsub.NewClient(ctx, *projID, option.WithCredentialsFile(*confPath))

where confPath is a *string containing the name of a GCP service account JSON file.

jba commented 6 years ago

@zombiezen @pongad

My first thought is #791. Are you now running in a docker container without system certs?

Try using kill -ABRT to get a stack trace (http://pro-tips-dot-com.tumblr.com/post/47677612115/kill-a-hung-go-process-and-print-stack-traces).

Your gcloud CLI is using your application default credentials, while your program gets creds from a file. I believe if you omit the option from the call to pubsub.NewClient, you'll be doing what gcloud is doing. What happens when you do that?

zombiezen commented 6 years ago

This isn't the problem, but there is a TODO in pubsub.NewClient that adds grpc.WithBlock which likely should be removed. https://github.com/GoogleCloudPlatform/google-cloud-go/blob/cfc64939debeed5225d793a5300a613a55c168c2/pubsub/pubsub.go#L71

willie commented 6 years ago

I'm having the same hang on Linux and macOS, neither in docker containers. The Linux is sort of old. macOS is very new. I'll try using application default credentials.

$ uname -a
Linux ____ 3.13.0-98-generic #145~precise1-Ubuntu SMP Sat Oct 8 20:16:31 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
$ uname -a
Darwin _____ 17.3.0 Darwin Kernel Version 17.3.0: Thu Nov  9 18:09:22 PST 2017; root:xnu-4570.31.3~1/RELEASE_X86_64 x86_64

kill -ABRT provides a lot of info. Here is a selected portion with my old code on Linux:

google.golang.org/grpc.(*ClientConn).WaitForStateChange(0xc4202e2000, 0xd22bc0, 0xc4202961c0, 0x3, 0x1)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:637 +0x13f
google.golang.org/grpc.DialContext(0xd22bc0, 0xc4202961c0, 0xa44fb9, 0x19, 0xc42029a6c0, 0x5, 0x6, 0x0, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:549 +0x6ed
google.golang.org/api/transport/grpc.dial(0xd22bc0, 0xc4202961c0, 0x411b00, 0xc420262180, 0x6, 0x6, 0xc420059b50, 0x441564, 0xc420262180)
    /Users/willie/src/google.golang.org/api/transport/grpc/dial.go:82 +0x269
google.golang.org/api/transport/grpc.Dial(0xd22bc0, 0xc4202961c0, 0xc420262180, 0x6, 0x6, 0xc420294440, 0xc420262180, 0x19)
    /Users/willie/src/google.golang.org/api/transport/grpc/dial.go:37 +0x58
google.golang.org/api/transport.DialGRPC(0xd22bc0, 0xc4202961c0, 0xc420262180, 0x6, 0x6, 0x4, 0x6, 0x4)
    /Users/willie/src/google.golang.org/api/transport/dial.go:43 +0x53
cloud.google.com/go/pubsub/apiv1.NewPublisherClient(0xd22bc0, 0xc4202961c0, 0xc420262120, 0x4, 0x6, 0x1, 0x1, 0x1)
    /Users/willie/src/cloud.google.com/go/pubsub/apiv1/publisher_client.go:117 +0xfe
cloud.google.com/go/pubsub.NewClient(0xd22bc0, 0xc4202961c0, 0x7ffe3c3fe71a, 0xd, 0xc420059eb0, 0x1, 0x1, 0x4105e7, 0x7fd7e0ef3300, 0xc420059e58)
    /Users/willie/src/cloud.google.com/go/pubsub/pubsub.go:79 +0x2b2
main.main()
    /Users/willie/src/github.com/vitalsource/chump/chump.go:213 +0x83d

goroutine 50 [select]:
google.golang.org/grpc.(*ccResolverWrapper).watcher(0xc42029a750)
    /Users/willie/src/google.golang.org/grpc/resolver_conn_wrapper.go:110 +0x1ac
created by google.golang.org/grpc.(*ccResolverWrapper).start
    /Users/willie/src/google.golang.org/grpc/resolver_conn_wrapper.go:96 +0x3f

goroutine 51 [select]:
google.golang.org/grpc.(*ccBalancerWrapper).watcher(0xc4202963c0)
    /Users/willie/src/google.golang.org/grpc/balancer_conn_wrappers.go:122 +0x156
created by google.golang.org/grpc.newCCBalancerWrapper
    /Users/willie/src/google.golang.org/grpc/balancer_conn_wrappers.go:113 +0x197

goroutine 52 [chan receive]:
google.golang.org/api/internal.(*PoolResolver).Next(0xc4202944c0, 0xc4202f6f68, 0x0, 0xc42018a300, 0x18, 0x18)
    /Users/willie/src/google.golang.org/api/internal/pool.go:54 +0x54
google.golang.org/grpc.(*roundRobin).watchAddrUpdates(0xc4202621e0, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/balancer.go:159 +0x64
google.golang.org/grpc.(*roundRobin).Start.func1(0xc4202621e0)
    /Users/willie/src/google.golang.org/grpc/balancer.go:234 +0x2b
created by google.golang.org/grpc.(*roundRobin).Start
    /Users/willie/src/google.golang.org/grpc/balancer.go:232 +0x14f

goroutine 53 [chan receive]:
google.golang.org/grpc.(*balancerWrapper).lbWatcher(0xc4202b2310)
    /Users/willie/src/google.golang.org/grpc/balancer_v1_wrapper.go:124 +0xf9
created by google.golang.org/grpc.(*balancerWrapperBuilder).Build
    /Users/willie/src/google.golang.org/grpc/balancer_v1_wrapper.go:62 +0x2f3

goroutine 40 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc4200ad680, 0x5, 0x0, 0xbe8e12f155f09350, 0x6415923ce, 0xd69420, 0xbe8e12f39a98a816, 0x85e725294, 0xd69420, 0xc420266b00, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc4200ad680, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc4200ad680)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 41 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc4200adb80, 0x5, 0x0, 0xbe8e12f0ee407537, 0x5e27371b4, 0xd69420, 0xbe8e12f36191f33f, 0x829d0d3bc, 0xd69420, 0xc42030a700, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc4200adb80, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc4200adb80)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 42 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420308000, 0x5, 0x0, 0xbe8e12f0bab10c96, 0x5b3493f43, 0xd69420, 0xbe8e12f3a30d832e, 0x866e72ddb, 0xd69420, 0xc42030abc0, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420308000, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420308000)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 43 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420308280, 0x5, 0x0, 0xbe8e12f155f9733f, 0x6416203e2, 0xd69420, 0xbe8e12f376ef5a39, 0x83f2e3adc, 0xd69420, 0xc420266580, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420308280, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420308280)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 44 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420308500, 0x5, 0x0, 0xbe8e12f1763debc4, 0x661a67c5b, 0xd69420, 0xbe8e12f3c2d46c7b, 0x88248e112, 0xd69420, 0xc420296e80, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420308500, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420308500)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 45 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420308780, 0x5, 0x0, 0xbe8e12f0d4ee16d6, 0x5c921133d, 0xd69420, 0xbe8e12f34e65515b, 0x816a431c2, 0xd69420, 0xc420266400, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420308780, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420308780)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 46 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420308a00, 0x5, 0x0, 0xbe8e12f11246464f, 0x602140cf7, 0xd69420, 0xbe8e12f38e533457, 0x8522cdeff, 0xd69420, 0xc420296a40, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420308a00, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420308a00)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 47 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420308c80, 0x5, 0x0, 0xbe8e12f11a47237b, 0x60a14ea00, 0xd69420, 0xbe8e12f36c646a56, 0x834a34adb, 0xd69420, 0xc42030a800, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420308c80, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420308c80)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 48 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420308f00, 0x5, 0x0, 0xbe8e12f1deae84ba, 0x6c14ca956, 0xd69420, 0xbe8e12f3da6f3c13, 0x899e3b0af, 0xd69420, 0xc420266e80, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420308f00, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420308f00)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 65 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420309180, 0x5, 0x0, 0xbe8e12f1b79d1709, 0x69ea071b3, 0xd69420, 0xbe8e12f3c769d6b7, 0x886de4b61, 0xd69420, 0xc420297000, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420309180, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420309180)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 66 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420309400, 0x5, 0x0, 0xbe8e12f125f63c79, 0x615c402f7, 0xd69420, 0xbe8e12f36556d588, 0x82d95b606, 0xd69420, 0xc4202664c0, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420309400, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420309400)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 67 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420309680, 0x5, 0x0, 0xbe8e12f1ce4d3672, 0x6b0eb5ae5, 0xd69420, 0xbe8e12f3c09a1e07, 0x8800e927a, 0xd69420, 0xc420266cc0, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420309680, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420309680)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 68 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420309900, 0x5, 0x0, 0xbe8e12f170b0daf0, 0x65c196b91, 0xd69420, 0xbe8e12f378502736, 0x8408f07d7, 0xd69420, 0xc420296880, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420309900, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420309900)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 69 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420309b80, 0x5, 0x0, 0xbe8e12f0c1bb7799, 0x5b5ee7428, 0xd69420, 0xbe8e12f39ec19b54, 0x8629b45e3, 0xd69420, 0xc42030aac0, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420309b80, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420309b80)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 70 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420318000, 0x5, 0x0, 0xbe8e12f1a5030a42, 0x68c0664dd, 0xd69420, 0xbe8e12f3ea9f1f25, 0x8aa1393c0, 0xd69420, 0xc420297140, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420318000, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420318000)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 71 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420318280, 0x5, 0x0, 0xbe8e12f1646862e3, 0x64fd0f3ae, 0xd69420, 0xbe8e12f3880aacec, 0x84be457b7, 0xd69420, 0xc420296940, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420318280, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420318280)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 72 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420318500, 0x6, 0x0, 0xbe8e12f3f42d35e4, 0x8b3a1aa82, 0xd69420, 0xbe8e12f59087eaba, 0xa3137e558, 0xd69420, 0xc42030b000, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420318500, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420318500)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 73 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420318780, 0x5, 0x0, 0xbe8e12f0d7afb89e, 0x5cbe2b53e, 0xd69420, 0xbe8e12f3bb6ade3c, 0x87f4488dc, 0xd69420, 0xc42030acc0, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420318780, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420318780)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 74 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420318a00, 0x5, 0x0, 0xbe8e12f173ec21c9, 0x65f54b2ad, 0xd69420, 0xbe8e12f3e15bb438, 0x8a0d0291c, 0xd69420, 0xc420266f40, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420318a00, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420318a00)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 75 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420318c80, 0x6, 0x0, 0xbe8e12f55bd40114, 0xa00e931ad, 0xd69420, 0xbe8e12f595fbb810, 0xa36abb2a9, 0xd69420, 0xc42030b0c0, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420318c80, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420318c80)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 76 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420318f00, 0x5, 0x0, 0xbe8e12f12d6a0f5b, 0x61d37d5f6, 0xd69420, 0xbe8e12f36d29c658, 0x83568a6f3, 0xd69420, 0xc42030a8c0, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420318f00, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420318f00)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 77 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420319180, 0x5, 0x0, 0xbe8e12f1708f52d1, 0x65bf7e3b0, 0xd69420, 0xbe8e12f3c4c218ef, 0x884368dce, 0xd69420, 0xc420296f40, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420319180, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420319180)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 78 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420319400, 0x5, 0x0, 0xbe8e12f12e99a102, 0x61e6767a0, 0xd69420, 0xbe8e12f3922df5d7, 0x85607a075, 0xd69420, 0xc420296b00, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420319400, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420319400)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 79 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc420319680, 0x5, 0x0, 0xbe8e12f121df8341, 0x611ad49f8, 0xd69420, 0xbe8e12f39b83d564, 0x85f5d801b, 0xd69420, 0xc420296bc0, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc420319680, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc420319680)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb
willie commented 6 years ago

Still hanging with Application Default Credentials. (This is a run on macOS.)

goroutine 1 [select]:
google.golang.org/grpc.(*ClientConn).WaitForStateChange(0xc4202bc000, 0x1935040, 0xc420274180, 0x3, 0x1)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:637 +0x13f
google.golang.org/grpc.DialContext(0x1935040, 0xc420274180, 0x1650a63, 0x19, 0xc42026e480, 0x5, 0x6, 0x0, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:549 +0x6ed
google.golang.org/api/transport/grpc.dial(0x1935040, 0xc420274180, 0x1013a00, 0xc42027e0f0, 0x5, 0x5, 0xc420053b50, 0x1043024, 0xc42027e0f0)
    /Users/willie/src/google.golang.org/api/transport/grpc/dial.go:82 +0x269
google.golang.org/api/transport/grpc.Dial(0x1935040, 0xc420274180, 0xc42027e0f0, 0x5, 0x5, 0xc4202763c0, 0xc42027e0f0, 0x19)
    /Users/willie/src/google.golang.org/api/transport/grpc/dial.go:37 +0x58
google.golang.org/api/transport.DialGRPC(0x1935040, 0xc420274180, 0xc42027e0f0, 0x5, 0x5, 0x3, 0x3, 0x3)
    /Users/willie/src/google.golang.org/api/transport/dial.go:43 +0x53
cloud.google.com/go/pubsub/apiv1.NewPublisherClient(0x1935040, 0xc420274180, 0xc420053d28, 0x3, 0x3, 0x0, 0x0, 0x0)
    /Users/willie/src/cloud.google.com/go/pubsub/apiv1/publisher_client.go:117 +0xfe
cloud.google.com/go/pubsub.NewClient(0x1935040, 0xc420274180, 0x7ffeefbffac1, 0xd, 0x0, 0x0, 0x0, 0x10124b7, 0x1e04688, 0xc420053e58)
    /Users/willie/src/cloud.google.com/go/pubsub/pubsub.go:79 +0x2b2
main.main()
    /Users/willie/src/github.com/vitalsource/chump/chump.go:215 +0x824

goroutine 51 [select]:
google.golang.org/grpc.(*ccResolverWrapper).watcher(0xc42026e510)
    /Users/willie/src/google.golang.org/grpc/resolver_conn_wrapper.go:110 +0x1ac
created by google.golang.org/grpc.(*ccResolverWrapper).start
    /Users/willie/src/google.golang.org/grpc/resolver_conn_wrapper.go:96 +0x3f

goroutine 52 [select]:
google.golang.org/grpc.(*ccBalancerWrapper).watcher(0xc420274340)
    /Users/willie/src/google.golang.org/grpc/balancer_conn_wrappers.go:122 +0x156
created by google.golang.org/grpc.newCCBalancerWrapper
    /Users/willie/src/google.golang.org/grpc/balancer_conn_wrappers.go:113 +0x197

goroutine 53 [chan receive]:
google.golang.org/api/internal.(*PoolResolver).Next(0xc420276440, 0xc4202c2f68, 0x0, 0xc4202d8000, 0x8, 0x8)
    /Users/willie/src/google.golang.org/api/internal/pool.go:54 +0x54
google.golang.org/grpc.(*roundRobin).watchAddrUpdates(0xc42025a1e0, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/balancer.go:159 +0x64
google.golang.org/grpc.(*roundRobin).Start.func1(0xc42025a1e0)
    /Users/willie/src/google.golang.org/grpc/balancer.go:234 +0x2b
created by google.golang.org/grpc.(*roundRobin).Start
    /Users/willie/src/google.golang.org/grpc/balancer.go:232 +0x14f

goroutine 54 [chan receive]:
google.golang.org/grpc.(*balancerWrapper).lbWatcher(0xc42026c310)
    /Users/willie/src/google.golang.org/grpc/balancer_v1_wrapper.go:124 +0xf9
created by google.golang.org/grpc.(*balancerWrapperBuilder).Build
    /Users/willie/src/google.golang.org/grpc/balancer_v1_wrapper.go:62 +0x2f3

goroutine 19 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc4202ee000, 0x8, 0x0, 0xbe8e13d63061a086, 0x1bced3a320, 0x197b440, 0xbe8e13d63061a086, 0x1bced3a320, 0x197b440, 0xc42046a2c0, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc4202ee000, 0x0, 0xc42001c360)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc4202ee000)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 20 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc4202ee500, 0x8, 0x0, 0xbe8e13d6b3be7691, 0x1c496612ea, 0x197b440, 0xbe8e13d6b3be7691, 0x1c496612ea, 0x197b440, 0xc4203e8880, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc4202ee500, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc4202ee500)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 21 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc4202ee780, 0x8, 0x0, 0xbe8e13d228fec601, 0x180dc3c629, 0x197b440, 0xbe8e13d228fec601, 0x180dc3c629, 0x197b440, 0xc42046a000, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc4202ee780, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc4202ee780)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 22 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc4202eea00, 0x8, 0x0, 0xbe8e13d6d9353dc5, 0x1c6a77bdc0, 0x197b440, 0xbe8e13d6d9353dc5, 0x1c6a77bdc0, 0x197b440, 0xc42032bcc0, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc4202eea00, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc4202eea00)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 23 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc4202eec80, 0x8, 0x0, 0xbe8e13d5ebc473c2, 0x1b8e9b73f5, 0x197b440, 0xbe8e13d5ebc473c2, 0x1b8e9b73f5, 0x197b440, 0xc42032bc00, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc4202eec80, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc4202eec80)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 24 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc4202eef00, 0x8, 0x0, 0xbe8e13d28a0045ab, 0x1865faf104, 0x197b440, 0xbe8e13d28a0045ab, 0x1865faf104, 0x197b440, 0xc4203e8740, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc4202eef00, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc4202eef00)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 25 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc4202ef180, 0x8, 0x0, 0xbe8e13d46924c55e, 0x1a265af243, 0x197b440, 0xbe8e13d46924c55e, 0x1a265af243, 0x197b440, 0xc42032ba00, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc4202ef180, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc4202ef180)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb

goroutine 26 [select]:
google.golang.org/grpc.(*addrConn).createTransport(0xc4202ef400, 0x8, 0x0, 0xbe8e13d2f439c111, 0x18cbcf36d0, 0x197b440, 0xbe8e13d2f439c111, 0x18cbcf36d0, 0x197b440, 0xc42032bb40, ...)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1211 +0xeaa
google.golang.org/grpc.(*addrConn).resetTransport(0xc4202ef400, 0x0, 0x0)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:1100 +0x540
google.golang.org/grpc.(*addrConn).connect.func1(0xc4202ef400)
    /Users/willie/src/google.golang.org/grpc/clientconn.go:829 +0x2f
created by google.golang.org/grpc.(*addrConn).connect
    /Users/willie/src/google.golang.org/grpc/clientconn.go:828 +0xeb
zombiezen commented 6 years ago

@dfawley Any ideas?

jba commented 6 years ago

@zombiezen Removed the WithBlock option.

@willie The first line of the first stack trace shows that the hang is in gRPC. Hence the shout-out to dfawley. I'd like to file an issue against them -- can you tell me the commit you're at?

You might also try using a numbered version (git tag) for gRPC, instead of HEAD. Looks like v1.9.x is the latest.

willie commented 6 years ago

@jba I updated grpc this morning when I was frustrated. I don't know what commit I had been being when it all stopped last night at 11:32 pm, but today, I am building against https://github.com/grpc/grpc-go/commit/1cd234627e6f392ade0527d593eb3fe53e832d4a

I'll try the tagged release.

dfawley commented 6 years ago

If the grpc goroutines are stuck at this point, this is where we backoff between connection attempts, which means we're unable to dial or auth with the remote server. Can you do export GRPC_GO_LOG_SEVERITY_LEVEL=INFO, retry, and let us know if you see any warning messages?

willie commented 6 years ago

@dfawley This is weird.

INFO: 2018/01/11 16:14:55 dialing to target with scheme: ""
INFO: 2018/01/11 16:14:55 ccResolverWrapper: sending new addresses to cc: [{pubsub.googleapis.com:443 0  <nil>}]
INFO: 2018/01/11 16:14:55 balancerWrapper: got update addr from Notify: [{pubsub.googleapis.com:443 0} {pubsub.googleapis.com:443 1} {pubsub.googleapis.com:443 2} {pubsub.googleapis.com:443 3} {pubsub.googleapis.com:443 4} {pubsub.googleapis.com:443 5} {pubsub.googleapis.com:443 6} {pubsub.googleapis.com:443 7}]
WARNING: 2018/01/11 16:14:55 grpc: addrConn.createTransport failed to connect to {pubsub.googleapis.com:443 0  4}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:3128: getsockopt: connection refused". Reconnecting...
WARNING: 2018/01/11 16:14:55 grpc: addrConn.createTransport failed to connect to {pubsub.googleapis.com:443 0  2}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:3128: getsockopt: connection refused". Reconnecting...
WARNING: 2018/01/11 16:14:55 grpc: addrConn.createTransport failed to connect to {pubsub.googleapis.com:443 0  6}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:3128: getsockopt: connection refused". Reconnecting...
WARNING: 2018/01/11 16:14:55 grpc: addrConn.createTransport failed to connect to {pubsub.googleapis.com:443 0  3}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:3128: getsockopt: connection refused". Reconnecting...
WARNING: 2018/01/11 16:14:55 grpc: addrConn.createTransport failed to connect to {pubsub.googleapis.com:443 0  1}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:3128: getsockopt: connection refused". Reconnecting...
WARNING: 2018/01/11 16:14:55 grpc: addrConn.createTransport failed to connect to {pubsub.googleapis.com:443 0  5}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:3128: getsockopt: connection refused". Reconnecting...
WARNING: 2018/01/11 16:14:55 grpc: addrConn.createTransport failed to connect to {pubsub.googleapis.com:443 0  0}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:3128: getsockopt: connection refused". Reconnecting...
WARNING: 2018/01/11 16:14:55 grpc: addrConn.createTransport failed to connect to {pubsub.googleapis.com:443 0  7}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:3128: getsockopt: connection refused". Reconnecting...
WARNING: 2018/01/11 16:14:56 grpc: addrConn.createTransport failed to connect to {pubsub.googleapis.com:443 0  4}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:3128: getsockopt: connection refused". Reconnecting...
WARNING: 2018/01/11 16:14:56 grpc: addrConn.createTransport failed to connect to {pubsub.googleapis.com:443 0  7}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:3128: getsockopt: connection refused". Reconnecting...
WARNING: 2018/01/11 16:14:56 grpc: addrConn.createTransport failed to connect to {pubsub.googleapis.com:443 0  1}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:3128: getsockopt: connection refused". Reconnecting...
WARNING: 2018/01/11 16:14:56 grpc: addrConn.createTransport failed to connect to {pubsub.googleapis.com:443 0  6}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:3128: getsockopt: connection refused". Reconnecting...
WARNING: 2018/01/11 16:14:56 grpc: addrConn.createTransport failed to connect to {pubsub.googleapis.com:443 0  2}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:3128: getsockopt: connection refused". Reconnecting...
WARNING: 2018/01/11 16:14:56 grpc: addrConn.createTransport failed to connect to {pubsub.googleapis.com:443 0  5}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:3128: getsockopt: connection refused". Reconnecting...
WARNING: 2018/01/11 16:14:56 grpc: addrConn.createTransport failed to connect to {pubsub.googleapis.com:443 0  3}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:3128: getsockopt: connection refused". Reconnecting...

To see if I can resolve pubsub.googleapis.com, I tried this:

$ curl -i https://pubsub.googleapis.com:443
HTTP/1.1 404 Not Found
Date: Thu, 11 Jan 2018 22:17:26 GMT
Content-Type: text/html; charset=UTF-8
Server: ESF
Content-Length: 1561
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
X-Content-Type-Options: nosniff
Alt-Svc: hq=":443"; ma=2592000; quic=51303431; quic=51303339; quic=51303338; quic=51303337; quic=51303335,quic=":443"; ma=2592000; v="41,39,38,37,35"
willie commented 6 years ago

UGH! @jba @dfawley, My apologies. I figured it out.

Apparently, as part of testing and rerouting traffic, we put proxy server values in the ENV. Then, the proxy server died. When I ran things locally, the setup script was also defining those proxy servers. I took the proxy server ENV values out and things work again.

My apologies for wasting your time. Thanks for the help.

jba commented 6 years ago

@willie No problem, glad everything is OK.

willie commented 6 years ago

@jba Thanks for understanding. @dfawley Thanks for the ENV for the debug info. Looking at that data triggered my memory.

Longer version: As we are moving hot data from our corporate datacenter out to GCP (as groundwork for a complete move to GCP, thank god), we put a tunnel/proxy server in between my program and pubsub (and cloud storage) so that we could take a less congested network path out of our data center. Apparently that tunnel went down and never came back up. smh. I probably need to put some kind of test into my programs that checks the health of the proxy server on startup just so I fail quickly.