ros-infrastructure / buildfarm_deployment

Apache License 2.0

Drop support for New Relic Server #163

Open tfoote opened 6 years ago

tfoote commented 6 years ago

We've been using the New Relic Server integration to monitor the buildfarm computers. With the coming EOL of the New Relic Server integration, we no longer have a free service to monitor the machines' status. https://discuss.newrelic.com/t/important-upcoming-changes-for-new-relic-servers-and-legacy-alerting-features/49474

We should generally disable our configs so that the New Relic Server code is no longer deployed. But since the service is EOL, I think we should actually remove the Puppet modules from our deployment config, as they won't be useful anymore.

If someone wants to add support for the replacement service New Relic Infrastructure that would be fine, but I'd suggest that we not actively pursue that.

nuclearsandwich commented 6 years ago

"If someone wants to add support for the replacement service New Relic Infrastructure that would be fine, but I'd suggest that we not actively pursue that."

Knowing this was coming, I had started working on adding https://my-netdata.io/ for real-time monitoring but couldn't stabilize it before the migration. It's not designed to store data beyond the immediate history. I'm not overly concerned with storing basic monitoring data long term and would instead like to focus future efforts on application-level instrumentation.

The EOL is in May and I would like to have netdata up and running before then.

tfoote commented 6 years ago

Great. Having the recent history/monitoring is by far the most valuable.

Unfortunately, we're not paying for the service, so our EOL is next month: "For customers who do not have a paid subscription to one or more of our products, the EOL date will be November 14, 2017."

nuclearsandwich commented 6 years ago

"Unfortunately we're not paying for the service so our EOL is next month"

Welp, maybe I can get it up and running as I work on the single-host changes. On the Jenkins master host it will need to be installed manually anyway, since Puppet cannot re-run there (#160), so that can probably be done at any time.

nuclearsandwich commented 6 years ago

New Relic disabled the apt repo this module used in a way that causes failures when running apt update. So this is now a bug as well as a lack of observability on hosts.
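For anyone checking whether a host is affected, here is a minimal sketch that lists active apt source entries mentioning a given string. It assumes the stale entry sits in a `*.list` file under the default `/etc/apt/sources.list.d` directory and contains "newrelic" somewhere on the line. Neither detail is confirmed in this thread, and `find_apt_entries` is a hypothetical helper name.

```python
from pathlib import Path


def find_apt_entries(sources_dir, needle):
    """Return (file name, line number, line) for active apt entries containing needle."""
    matches = []
    for path in sorted(Path(sources_dir).glob("*.list")):
        for number, line in enumerate(path.read_text().splitlines(), start=1):
            stripped = line.strip()
            # Skip blank lines and entries that are already commented out.
            if not stripped or stripped.startswith("#"):
                continue
            if needle in stripped:
                matches.append((path.name, number, stripped))
    return matches


if __name__ == "__main__":
    # Assumption: default Debian/Ubuntu location, and the entry mentions "newrelic".
    for name, number, line in find_apt_entries("/etc/apt/sources.list.d", "newrelic"):
        print(f"{name}:{number}: {line}")
```

This only reports matching entries and changes nothing. Actually removing them is probably best left to the Puppet change discussed above, so that the hosts and the deployment config stay in agreement.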