cargomedia / puppet-packages

UNMAINTAINED. Reusable puppet modules for Debian
MIT License

Still old copperegg collectors around #918

Closed by ppp0 9 years ago

ppp0 commented 9 years ago

Update was done in #911

Copperegg complains about outdated collectors for:

ppp0 commented 9 years ago

Must be Copperegg again...

I ran this on the pulsar sk production shell:

cap> /usr/local/revealcloud/revealcloud -V 2>&1 | grep Version
 ** [out :: www3.example.com (10.55.40.166)] Version: v3.3-116-g92fd794
 ** [out :: www5.example.com (10.55.40.138)] Version: v3.3-116-g92fd794
 ** [out :: ads2.example.com (10.55.40.152)] Version: v3.3-116-g92fd794
 ** [out :: search2.example.com (10.55.40.156)] Version: v3.3-116-g92fd794
 ** [out :: wowza2.example.com (10.55.40.154)] Version: v3.3-116-g92fd794
 ** [out :: rep1-db3.mongodb.example.com (10.55.40.150)] Version: v3.3-116-g92fd794
 ** [out :: db3.example.com (10.55.40.164)] Version: v3.3-116-g92fd794
 ** [out :: stream2.example.com (10.55.40.158)] Version: v3.3-116-g92fd794
 ** [out :: db4.example.com (10.55.40.162)] Version: v3.3-116-g92fd794
 ** [out :: db5.example.com (10.55.40.130)] Version: v3.3-116-g92fd794
 ** [out :: www4.example.com (10.55.40.160)] Version: v3.3-116-g92fd794
 ** [out :: mms2.mongodb.example.com (10.55.40.137)] Version: v3.3-116-g92fd794
 ** [out :: rep1-db4.mongodb.example.com (10.55.40.142)] Version: v3.3-116-g92fd794
 ** [out :: rep2-db3.mongodb.example.com (10.55.40.148)] Version: v3.3-116-g92fd794
 ** [out :: rep2-db4.mongodb.example.com (10.55.40.146)] Version: v3.3-116-g92fd794
 ** [out :: config1.mongodb.example.com (10.55.40.136)] Version: v3.3-116-g92fd794
 ** [out :: rep1-arbiter1.mongodb.example.com (10.55.40.133)] Version: v3.3-116-g92fd794
 ** [out :: rep2-arbiter1.mongodb.example.com (10.55.40.134)] Version: v3.3-116-g92fd794
 ** [out :: pulsar1.example.com (10.55.40.140)] Version: v3.3-116-g92fd794
 ** [out :: config2.mongodb.example.com (10.55.40.132)] Version: v3.3-116-g92fd794
 ** [out :: config3.mongodb.example.com (10.55.40.135)] Version: v3.3-116-g92fd794
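
Since every host reports the same version string, a quick tally makes that easier to confirm than scanning the list. A minimal sketch, assuming the cap output above has been saved to a file; the file name `versions.txt` and the function name `summarize_versions` are hypothetical, not part of this repo:

```shell
# Hypothetical helper: tally the distinct "Version:" strings found in
# capistrano-style output read from stdin.
# Prints one line per distinct version: "<number of hosts> <version>".
summarize_versions() {
  awk '/Version:/ { seen[$NF]++ }
       END { for (v in seen) print seen[v], v }' | sort -k2
}

# Usage (versions.txt is an assumed file holding the output above):
#   summarize_versions < versions.txt
```

A single output line means all hosts agree on the installed version, which would point away from the packages and towards either stale processes or the Copperegg side.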

ppp0 commented 9 years ago

sudo monit restart revealcloud everywhere... maybe?

ppp0 commented 9 years ago

That did not help.

Here is org production, for completeness' sake:

 ** [out :: ci3.cargomedia.ch (148.251.139.146)] Version: v3.3-116-g92fd794
 ** [out :: ci2.cargomedia.ch (148.251.68.165)] Version: v3.3-116-g92fd794
 ** [out :: backup1.example.com (144.76.220.5)] Version: v3.3-116-g92fd794
 ** [out :: puppet1.cargomedia.ch (178.63.97.21)] Version: v3.3-116-g92fd794

Closing as it seems to be some bug with CE

Reopen if you disagree

njam commented 9 years ago

Do we have some old agents running?

cap> pgrep -f 'revealcloud' |wc -l
 ** [out :: wowza2.example.com (10.55.40.154)] 7
 ** [out :: search2.example.com (10.55.40.156)] 3
 ** [out :: ads2.example.com (10.55.40.152)] 9
 ** [out :: config2.mongodb.example.com (10.55.40.132)] 3
 ** [out :: rep1-arbiter1.mongodb.example.com (10.55.40.133)] 5
 ** [out :: stream2.example.com (10.55.40.158)] 5
 ** [out :: db5.example.com (10.55.40.130)] 3
 ** [out :: db4.example.com (10.55.40.162)] 7
 ** [out :: config3.mongodb.example.com (10.55.40.135)] 3
 ** [out :: rep1-db3.mongodb.example.com (10.55.40.150)] 5
 ** [out :: rep2-db3.mongodb.example.com (10.55.40.148)] 9
 ** [out :: www4.example.com (10.55.40.160)] 11
 ** [out :: rep2-db4.mongodb.example.com (10.55.40.146)] 5
 ** [out :: www3.example.com (10.55.40.166)] 3
 ** [out :: rep2-arbiter1.mongodb.example.com (10.55.40.134)] 3
 ** [out :: db3.example.com (10.55.40.164)] 13
 ** [out :: www5.example.com (10.55.40.138)] 3
 ** [out :: pulsar1.example.com (10.55.40.140)] 3
 ** [out :: mms2.mongodb.example.com (10.55.40.137)] 3
 ** [out :: rep1-db4.mongodb.example.com (10.55.40.142)] 3
 ** [out :: config1.mongodb.example.com (10.55.40.136)] 5

ppp0 commented 9 years ago

Indeed, I had to run:

cap> sudo pkill -9 -e reveal

Now it looks better

cap> pgrep -c revealcloud
 ** [out :: db4.example.com (10.55.40.162)] 2
 ** [out :: www5.example.com (10.55.40.138)] 2
 ** [out :: www3.example.com (10.55.40.166)] 2
 ** [out :: www4.example.com (10.55.40.160)] 2
 ** [out :: rep1-db3.mongodb.example.com (10.55.40.150)] 2
 ** [out :: stream2.example.com (10.55.40.158)] 2
 ** [out :: mms2.mongodb.example.com (10.55.40.137)] 2
 ** [out :: rep1-arbiter1.mongodb.example.com (10.55.40.133)] 2
 ** [out :: config3.mongodb.example.com (10.55.40.135)] 2
 ** [out :: wowza2.example.com (10.55.40.154)] 2
 ** [out :: ads2.example.com (10.55.40.152)] 2
 ** [out :: config1.mongodb.example.com (10.55.40.136)] 2
 ** [out :: search2.example.com (10.55.40.156)] 2
 ** [out :: rep1-db4.mongodb.example.com (10.55.40.142)] 2
 ** [out :: rep2-db3.mongodb.example.com (10.55.40.148)] 2
 ** [out :: rep2-db4.mongodb.example.com (10.55.40.146)] 2
 ** [out :: rep2-arbiter1.mongodb.example.com (10.55.40.134)] 2
 ** [out :: db5.example.com (10.55.40.130)] 2
 ** [out :: db3.example.com (10.55.40.164)] 2
 ** [out :: config2.mongodb.example.com (10.55.40.132)] 2
 ** [out :: pulsar1.example.com (10.55.40.140)] 2
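
To spot hosts with leftover collectors without eyeballing the list, the count output can be filtered against a baseline. A rough sketch, assuming the cap output is piped in or saved to a file; the function name `flag_extra_collectors` and the file name `counts.txt` are made up for illustration, and the baseline is passed as an argument rather than assumed. Note that `pgrep -f` matches the full command line, so the earlier `pgrep -f 'revealcloud' | wc -l` numbers may include the wrapping shell itself. That would be consistent with a minimum of 3 there versus 2 with `pgrep -c revealcloud`, but it is a guess from the numbers, not something verified on these hosts.

```shell
# Hypothetical helper: read capistrano-style count lines from stdin and
# print "<host> <count>" for every host whose count exceeds the baseline.
# $1 = expected number of revealcloud processes per host.
flag_extra_collectors() {
  awk -v expected="$1" '
    /\[out ::/ {
      # The hostname is the field that follows the "::" separator.
      for (i = 1; i < NF; i++) if ($i == "::") host = $(i + 1)
      # The process count is the last field on the line.
      if ($NF + 0 > expected + 0) print host, $NF
    }'
}

# Usage (counts.txt is an assumed file holding pgrep output as above):
#   flag_extra_collectors 2 < counts.txt
```

Empty output means no host is above the baseline.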

ppp0 commented 9 years ago

Reopen if you disagree.

njam commented 9 years ago

Let's hope we don't pay for multiple collectors ;)

ppp0 commented 9 years ago

@njam if this is the case, one could suspect CE of having an interest in running as many collectors as possible, i.e. making them non-killable.

I reopened because I noticed our Hetzner nodes (ci2, ci3, backup1 - but not puppet1?) are again running old versions. Investigating/fixing.

ppp0 commented 9 years ago

cap> sudo pkill -9 -e reveal
cap> pgrep -c revealcloud
 ** [out :: ci3.cargomedia.ch (148.251.139.146)] 2
 ** [out :: staging1.cargomedia.ch (mandrill.cargomedia)] 0
 ** [out :: ci2.cargomedia.ch (148.251.68.165)] 2
 ** [out :: puppet1.cargomedia.ch (178.63.97.21)] 2
 ** [out :: backup1.example.com (144.76.220.5)] 2

Looks better now.