eBay / nvidiagpubeat

nvidiagpubeat is an Elastic Beat that uses the NVIDIA System Management Interface (nvidia-smi) to monitor NVIDIA GPU devices and can ingest metrics into an Elasticsearch cluster, with support for both the 6.x and 7.x versions of Beats. nvidia-smi is a command-line utility, built on top of the NVIDIA Management Library (NVML), intended to aid in the management and monitoring of NVIDIA GPU devices.
https://github.com/eBay/nvidiagpubeat
Apache License 2.0

Exit out if nvidia-smi is not present in PATH for production mode. #26

Open deepujain opened 5 years ago

deepujain commented 5 years ago
  1. nvidiagpubeat can be run in production mode.
  2. If nvidia-smi is not in PATH, nvidiagpubeat should log an appropriate error message and exit.

Possible solution

  1. command() in gpu.go must return an err object for the above scenario.
  2. metrics.go must check the error and act accordingly:

    if err != nil {
        return err
    }
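
The two steps above could be sketched roughly as follows. This is a minimal sketch, not the actual nvidiagpubeat code: `ErrNotInPath`, `lookupBinary`, and `buildCommand` are hypothetical names, and `buildCommand` only stands in for `command()` in gpu.go, whose real signature may differ. The one library call it relies on is `exec.LookPath` from the Go standard library, which searches PATH for an executable.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// ErrNotInPath is a hypothetical sentinel error (not from the nvidiagpubeat
// source) that callers can match with errors.Is.
var ErrNotInPath = errors.New("binary is not in PATH")

// lookupBinary resolves name against PATH using exec.LookPath and wraps any
// failure in ErrNotInPath so the caller can tell it apart from other errors.
func lookupBinary(name string) (string, error) {
	path, err := exec.LookPath(name)
	if err != nil {
		return "", fmt.Errorf("%q: %w (%v)", name, ErrNotInPath, err)
	}
	return path, nil
}

// buildCommand is a hypothetical stand-in for command() in gpu.go. Instead of
// returning only a command, it also returns an error when the binary is
// missing, which is the change proposed in step 1.
func buildCommand(binary string, args ...string) (*exec.Cmd, error) {
	path, err := lookupBinary(binary)
	if err != nil {
		return nil, err
	}
	return exec.Command(path, args...), nil
}
```

A caller such as the code in metrics.go would then invoke `buildCommand("nvidia-smi")`, check `err != nil`, and return the error upward so that the beat can log it and stop in production mode.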

Example log output:

2019-09-04T12:32:57.008-0700    INFO    instance/beat.go:400    nvidiagpubeat start running.
2019-09-04T12:32:57.008-0700    INFO    beater/nvidiagpubeat.go:57  nvidiagpubeat is running for ** production ** environment. ! Hit CTRL-C to stop it.
2019-09-04T12:32:58.038-0700    ERROR   nvidia/gpu.go:52    E! nvidia-smi is not in PATH. Exit !!!