strongloop / strong-supervisor

Application supervisor that automatically adds cluster control and performance monitoring with StrongOps
Other
66 stars 20 forks source link

Watcher/general metrics #229

Closed kjdelisle closed 7 years ago

kjdelisle commented 7 years ago

Cherry-picked from:

Additional linting fixes included.

sam-github commented 7 years ago

the problem is that this watcher assumed appmetrics was always there, and it was the only user. this isn't necessarily true. I think that the watcher will somehow have to ask the adapter if it is started, and if it is, then send the metrics up, or that the the metrics emission will have to occur from lib/adatper.js, and only the relaying up will happen in the watcher. This is the kind of complexity that made me hope that it could be done in wlpn-cli-server, but you found it can't.

kjdelisle commented 7 years ago

@slnode test please getting errors about not being able to find stats-lite (but it passes on Travis CI...)

sam-github commented 7 years ago

What is stats-lite? Its not a dep of strong-supervisor, even indirectly.

Can you test manually with sl-run --no-profile .., and if that passes, this LGTM (but tests do need to pass).

kjdelisle commented 7 years ago

@sam-github stats-lite is a devDependency of stats-incremental, which is a dependency of strong-trace.

kjdelisle commented 7 years ago

I can confirm that it does fix the issue, and that metrics are reaching the collective controller with this patch in place.

rmg commented 7 years ago

It is passing on Travis because there is no license key in the environment so the tests that are failing on cis-jenkins are being skipped there. That is also probably how it is passing for you locally.

rmg commented 7 years ago

The failure appears to have been introduced very recently. PR submitted upstream: brycebaril/stats-incremental#1

rmg commented 7 years ago

@slnode test please

rmg commented 7 years ago

stats-incremental was fixed upstream, the remaining failure looks like it might be node 8 specific.

kjdelisle commented 7 years ago

The failures I'm seeing are somewhat racy. Travis CI is getting intermittent timeouts at various parts of the tests. Case-in-point, re-running the suite has made it go green as of this posting. I'll give it another spin. @slnode test please

kjdelisle commented 7 years ago

@slnode test please

kjdelisle commented 7 years ago

Tinkering with the timeout; it might need more time to finish. For posterity, I'll mention that I'm not a fan of messing with test timeouts as it's usually a band-aid for a bigger problem, but we'll see.

kjdelisle commented 7 years ago

@sam-github @rmg The test passes with the timeout removed completely, but it won't work with a timeout of 360 seconds. Is 300 seconds a hard limit, or do you have a different discrete timeout you'd be comfortable with?

kjdelisle commented 7 years ago

@slnode test please

kjdelisle commented 7 years ago

One more try, just to make sure it's not a one-off fluke.

kjdelisle commented 7 years ago

@slnode test please