Closed mattcamp closed 2 months ago
Looks good overall. What happens if Grafana or InfluxDB is not running while the feature is enabled? Would it be worth making Grafana a prerequisite for training, or ensuring that it is started?
The initial metric sent to Telegraf goes over UDP, so it is effectively fired blindly at a UDP port. If the telegraf container isn't running there are no errors; it simply fails to work. Such are the joys of UDP. This can make it a pain to debug if things aren't set up correctly, but it also means near-zero risk of breaking Robomaker, even with a totally misconfigured setup. The worst you should get is a DNS lookup error if you put something strange in for DR_TELEGRAF_HOST. In nearly 100% of cases just setting it to `telegraf` should work fine, as long as the telegraf container is in the same docker network as robomaker.
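To illustrate the fire-and-forget behaviour described above, here is a minimal Python sketch, not Robomaker's actual code. The helper name `send_metric`, the sample payload, and port 8092 are assumptions made for the example. It shows that a UDP send reports success even when nothing is listening, and that a host name which cannot be resolved is the one failure likely to surface.

```python
import socket


def send_metric(host: str, port: int, payload: bytes) -> bool:
    """Fire one UDP datagram at host:port.

    Returns False only when the host name cannot be resolved; a missing
    listener on the other end is not detectable from the sender's side.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.sendto(payload, (host, port))
        except socket.gaierror:
            # Name resolution failed, e.g. a typo in DR_TELEGRAF_HOST.
            return False
    return True


# Hypothetical payload; the real wire format depends on the Telegraf input in use.
sample = b"example_metric,source=sketch value=1"

# Port 8092 is an assumed example value. Nothing has to be listening there
# for the send to be reported as successful.
print(send_metric("127.0.0.1", 8092, sample))
```

Because the sender gets no feedback, checking whether metrics actually arrive has to happen on the Telegraf or Influx side.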
If telegraf is running but Influx isn't then telegraf will error on container start as it verifies the Influx connection.
Grafana is just a presentation layer above influx for dashboards. If Influx isn't running then it will report a datasource error.
Only telegraf+influx are required to collect and store metrics.
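Since a missing service tends to fail quietly, a quick reachability check can help when debugging. The following is a rough sketch under stated assumptions: the helper `tcp_port_open` is hypothetical, Grafana is assumed to be on port 3000 as mentioned in the PR description, and InfluxDB is assumed to be published on its default port 8086, which this stack may or may not expose to the host.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Assumed ports: 8086 is InfluxDB's default, 3000 is the Grafana web UI.
for name, port in (("influxdb", 8086), ("grafana", 3000)):
    state = "reachable" if tcp_port_open("localhost", port) else "not reachable"
    print(f"{name} on port {port}: {state}")
```

Telegraf's UDP listener cannot be probed this way, so for that container checking `docker ps` or the container logs is the more reliable option.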
This PR adds a docker-compose stack which launches three additional services: Telegraf, InfluxDB and Grafana.
The feature is enabled by uncommenting `DR_TELEGRAF_HOST` and `DR_TELEGRAF_PORT` in system.env, which will be passed to Robomaker.
The Telegraf/InfluxDB/Grafana stack can be started using `dr-start-influxdb`, after which the Grafana web UI can be accessed on port 3000.
Inherently this PR won't enable any additional metrics but is a pre-requisite to receive metrics from the updated robomaker via this PR
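As an illustration of how the two variables act as an on/off switch, here is a small Python sketch of a consumer that treats metrics as disabled unless both are set. This is an assumption about how such a check could look, not the code Robomaker actually runs; the function name `telegraf_target` and the port value 8092 are hypothetical.

```python
import os
from typing import Mapping, Optional, Tuple


def telegraf_target(env: Mapping[str, str] = os.environ) -> Optional[Tuple[str, int]]:
    """Return (host, port) when both variables are set, otherwise None."""
    host = env.get("DR_TELEGRAF_HOST", "").strip()
    port = env.get("DR_TELEGRAF_PORT", "").strip()
    if not host or not port:
        # Variables left commented out in system.env: metrics stay disabled.
        return None
    return host, int(port)


# Both variables commented out.
print(telegraf_target({}))

# Both variables set; 8092 is an assumed example value.
print(telegraf_target({"DR_TELEGRAF_HOST": "telegraf", "DR_TELEGRAF_PORT": "8092"}))
```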