Open farley13 opened 8 years ago
Update: It looks like Prometheus is not monitoring the actual instances right now, only the load balancer. We should be monitoring all the applications/VMs deployed.
Do we need to monitor all 4 VM instances (2 from the TEST cluster, 2 from the PROD cluster)? One instance in each cluster is always down while the other is running. This is a restriction of ECS:
If your task uses fixed host port mapping (for example, your task uses port 80 on the host for a web server), you must have at least one container instance per task, because only one container can use a single host port at a time. You should add container instances to your cluster or reduce your number of desired tasks.
Also, can Prometheus monitor the remaining bits of our deployment?
I think we should monitor all the available instances - we can create a rule that at least one is up in the prometheus config afterwards (with a longer duration as well).
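To make the "at least one is up" idea concrete, here is a minimal sketch of the check in Python. It assumes each target carries a `cluster` label (hypothetical, nothing in our config sets it yet) and that we look at the standard `up` metric. In Prometheus itself this would be roughly the expression `sum(up) by (cluster) < 1` held for a longer duration. The instance names are made up for illustration.

```python
from collections import defaultdict


def clusters_without_live_instance(up_samples):
    """Return the sorted names of clusters where no instance reports up == 1.

    up_samples maps (cluster, instance) -> value of the `up` metric (1 or 0).
    """
    live = defaultdict(int)
    for (cluster, _instance), value in up_samples.items():
        live[cluster] += int(value)
    # A cluster is unhealthy only when none of its instances are up.
    return sorted(cluster for cluster, count in live.items() if count < 1)


# Hypothetical snapshot: TEST has one live instance, PROD has none.
samples = {
    ("test", "test-vm-1"): 1,
    ("test", "test-vm-2"): 0,
    ("prod", "prod-vm-1"): 0,
    ("prod", "prod-vm-2"): 0,
}
print(clusters_without_live_instance(samples))
```

With this shape, the one instance per cluster that ECS keeps stopped does not fire an alert by itself; only a cluster with zero live instances does.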
In theory Prometheus can read a number of standard performance counters through exporters:
https://prometheus.io/docs/instrumenting/exporters/
I don't have any personal experience with these, so if it's easy we can use them for the DB, Redis, Solr and the load balancer. If they turn out to be a pain, we can push the integrations to later. I think getting all the machines into Prometheus would be a requirement though, as we spin them up and shut them down.
https://prometheus.io/docs/operating/configuration/#
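As a rough sketch of what "all the machines in Prometheus" could look like, the Python below builds a static `scrape_configs` section, one job per exporter. The host names are hypothetical, and the ports are what I believe to be each exporter's default listen port, so verify them against what is actually deployed. Since our instances come and go, a service discovery mechanism would probably fit better than static targets in the long run; this only illustrates the shape of the config.

```python
import json

# Assumed default listen ports of the exporters (verify before relying on them).
EXPORTER_PORTS = {
    "node": 9100,        # node_exporter (host-level metrics)
    "postgres": 9187,    # postgres_exporter
    "redis": 9121,       # redis_exporter
    "cloudwatch": 9106,  # cloudwatch_exporter
}


def build_scrape_configs(hosts_by_job):
    """Build a list of static scrape config entries, one per job."""
    configs = []
    for job, hosts in sorted(hosts_by_job.items()):
        port = EXPORTER_PORTS[job]
        configs.append({
            "job_name": job,
            "static_configs": [
                {"targets": ["{}:{}".format(host, port) for host in hosts]}
            ],
        })
    return configs


# Hypothetical host names, for illustration only.
hosts = {
    "node": ["test-vm-1", "test-vm-2", "prod-vm-1", "prod-vm-2"],
    "postgres": ["db-1"],
    "redis": ["cache-1"],
}
print(json.dumps({"scrape_configs": build_scrape_configs(hosts)}, indent=2))
```

The output is JSON, which for a structure this simple should also load as YAML, but it is worth converting it to plain YAML before putting it in the real config file.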
I've added monitoring for PostgreSQL and Redis. It's weird, but it seems there is no Solr exporter.
Also, how do you think we should monitor the load balancer?
Interesting - I think https://github.com/prometheus/cloudwatch_exporter is probably what we'd use for the AWS load balancer. For Solr, I think using the built-in JMX is the way to go: https://wiki.apache.org/solr/SolrJmx https://github.com/prometheus/jmx_exporter
This nearly looks good - will take a look this week at https://prometheus.io/docs/operating/configuration/#
It's great that we have prometheus monitoring the main app - we should extend the config to monitor the remaining bits of our deployment.