couchbaselabs / cbft

*THIS PROJECT HAS MOVED* from couchbaselabs TO: https://github.com/couchbase/cbft -- no further development will be done here on couchbaselabs/cbft
Other
27 stars 5 forks source link

unusual continuous polling of pools and related URLs #160

Open mschoch opened 9 years ago

mschoch commented 9 years ago

I notice this sometimes when cbft is started from ns_server. It seems to NOT happen when I first create an index. But rather, after I kill ns_server (and thus everything else) and later restart it. It seems like CBFT isn't happy with something it sees, so it keeps retrying. It burns a lot of cpu, as I eventually hear the fan.

On the console I see:

 [29/Jul/2015:09:32:51 -0400] "GET /pools HTTP/1.1" 200 794 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default?uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 3543 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets?v=6308748&uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 9703 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets/beer-sample?bucket_uuid=0d364688bea961ffda58dc7aee9ad648 HTTP/1.1" 200 9701 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools HTTP/1.1" 200 794 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default?uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 3543 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets?v=6308748&uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 9703 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets/beer-sample?bucket_uuid=0d364688bea961ffda58dc7aee9ad648 HTTP/1.1" 200 9701 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools HTTP/1.1" 200 794 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default?uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 3543 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets?v=6308748&uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 9703 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets/beer-sample?bucket_uuid=0d364688bea961ffda58dc7aee9ad648 HTTP/1.1" 200 9701 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools HTTP/1.1" 200 794 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default?uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 3543 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets?v=6308748&uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 9703 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets/beer-sample?bucket_uuid=0d364688bea961ffda58dc7aee9ad648 HTTP/1.1" 200 9701 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools HTTP/1.1" 200 794 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default?uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 3543 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets?v=6308748&uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 9703 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets/beer-sample?bucket_uuid=0d364688bea961ffda58dc7aee9ad648 HTTP/1.1" 200 9701 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools HTTP/1.1" 200 794 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default?uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 3543 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets?v=6308748&uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 9703 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets/beer-sample?bucket_uuid=0d364688bea961ffda58dc7aee9ad648 HTTP/1.1" 200 9701 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools HTTP/1.1" 200 794 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default?uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 3543 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets?v=6308748&uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 9703 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets/beer-sample?bucket_uuid=0d364688bea961ffda58dc7aee9ad648 HTTP/1.1" 200 9701 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools HTTP/1.1" 200 794 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default?uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 3543 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets?v=6308748&uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 9703 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets/beer-sample?bucket_uuid=0d364688bea961ffda58dc7aee9ad648 HTTP/1.1" 200 9701 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools HTTP/1.1" 200 794 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default?uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 3543 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets?v=6308748&uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 9703 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets/beer-sample?bucket_uuid=0d364688bea961ffda58dc7aee9ad648 HTTP/1.1" 200 9701 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools HTTP/1.1" 200 794 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default?uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 3543 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets?v=6308748&uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 9703 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets/beer-sample?bucket_uuid=0d364688bea961ffda58dc7aee9ad648 HTTP/1.1" 200 9701 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools HTTP/1.1" 200 794 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default?uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 3543 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets?v=6308748&uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 9703 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets/beer-sample?bucket_uuid=0d364688bea961ffda58dc7aee9ad648 HTTP/1.1" 200 9701 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools HTTP/1.1" 200 794 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default?uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 3543 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets?v=6308748&uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 9703 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets/beer-sample?bucket_uuid=0d364688bea961ffda58dc7aee9ad648 HTTP/1.1" 200 9701 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools HTTP/1.1" 200 794 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default?uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 3543 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets?v=6308748&uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 9703 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets/beer-sample?bucket_uuid=0d364688bea961ffda58dc7aee9ad648 HTTP/1.1" 200 9701 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools HTTP/1.1" 200 794 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default?uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 3543 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets?v=6308748&uuid=121437bb7291265fab05a67e02271b66 HTTP/1.1" 200 9703 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools/default/buckets/beer-sample?bucket_uuid=0d364688bea961ffda58dc7aee9ad648 HTTP/1.1" 200 9701 - Go 1.1 package http
127.0.0.1 - beer-sample [29/Jul/2015:09:32:51 -0400] "GET /pools HTTP/1.1" 200 794 - Go 1.1 package http
mschoch commented 9 years ago

Seems like the exponential backoff function in cbdatasource is the most likely place. Still digging, any thoughts @steveyen ?

mschoch commented 9 years ago

While it was repeatable for a while... as I added code to try and track it down, it has now stopped happening...

mschoch commented 9 years ago

Also, I'm not entirely sure cbft is to blame. It seems like it could be anything using go-couchbase (or a fork of go-couchbase)

steveyen commented 9 years ago

Yeah, that logging message text of "Go 1.1 package http" is throwing me off a bit.

When I try it on my standalone dev build of cbft (where I point it at couchbase, create a bleve index, kill ns-server), it works ok without spamming cbft's log output. But that's just using the simple Cfg provider.

Perhaps metakv? (Still, why would it be go 1.1?)

btw, one way to find out your go version from cbft is via...

curl http://localhost:8095/api/runtime

Some limited docs on those diagnostic REST endpoints in cbft are here: http://labs.couchbase.com/cbft/api-ref/#node-diagnostics

One final clue is to figure out who logs/printf's in that kind of format. cbft's logs currently normaly look like...

2015/07/29 07:57:13 janitor: adding pindex: bs_2dd16fba4b47b833_ecb637fa
mschoch commented 9 years ago

I have looked into the odd User-Agent. That is what Go sends by default. It does NOT correspond to the Go version.

Regarding these messages, this is ns_server logging the requests coming in.

mschoch commented 9 years ago

I have separately opened up #161 because cbft messages don't end up anywhere.

mschoch commented 9 years ago

OK, I've added a line "refreshing cluster" and confirmed this is where this happening.

Its easiest to repeat when I create index, let it finish, kill all servers. Wait 1 minute, then bring everything backup.

It seems like it might get unstuck eventually.

NOTE: it seems like about 50 or so of these messages in a burst is NORMAL. But, what I'm reporting is continuous stream of them for > 1 minute.

mschoch commented 9 years ago

I just checked, in the most recent version of this, the cluster was refreshed 1927 times before it stopped.

steveyen commented 9 years ago

Taking a look at the cbdatasource codepaths around refreshCluster, there's no obvious smoking sore thumb sticking out.

One thought, though, there's no "de-bounce"... that is if there's a storm of refreshCluster requests, they all get run. Instead, concurrent refreshCluster requests that are all piled up on the channel should be all coallesced and cleared by the next actual refreshCluster.

Another thought, was metakv being used? If so, perhaps the brand new CfgMetaKV codepaths need some shake out. In particular, perhaps the cfg change subscription codepaths might be fruitful.

Finally, the next time this happens, grabbing some of the stats counters would be helpful, which can be done via...

curl http://localhost:9200/api/stats

...where 9200 is the cbft PORT number.