WIPACrepo / pyglidein

Some python scripts to launch HTCondor glideins
MIT License

Monitoring from client.py #110

Closed gonzalomerino closed 6 years ago

gonzalomerino commented 6 years ago

The scope of this ticket is to add code that lets us gather monitoring metrics before glideins start. These metrics can be gathered by client.py and sent back to server.py via a socket (or some other mechanism, TBD). We then want to collect all the information centrally, and probably insert it into Elasticsearch or Graphite.

Examples of metrics we can monitor here:

dsschult commented 6 years ago

I'd send this back to the server over the same jsonrpc channel we use for all other requests, then let the server talk to graphite/ES. We already send some really basic queue information at the end of client.py.
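
As a rough illustration of sending metrics over a JSON-RPC channel, here is a minimal sketch of a JSON-RPC 2.0 request body carrying a metrics dictionary. The method name `monitoring`, the parameter layout, and the helper `build_metrics_request` are hypothetical; they are not taken from pyglidein and would need to match whatever server.py actually exposes.

```python
import json
import time


def build_metrics_request(metrics, request_id=1, timestamp=None):
    """Build a JSON-RPC 2.0 request dict carrying client-side metrics.

    The method name "monitoring" and the params layout are assumptions
    made for illustration only.
    """
    if timestamp is None:
        timestamp = int(time.time())
    return {
        "jsonrpc": "2.0",
        "method": "monitoring",  # hypothetical method name
        "params": {"timestamp": timestamp, "metrics": metrics},
        "id": request_id,
    }


# Serialize the request; the client would POST this string to the server.
payload = json.dumps(build_metrics_request({"jobs_idle": 3, "jobs_running": 5}))
```

The server could then unpack `params["metrics"]` and forward each entry to graphite/ES, so the clients never need direct access to the monitoring backend.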

hskarlupka commented 6 years ago

@gonzalomerino

For "resources required by submitted glideins (mem, cpu, gpu ...)": does this mean you want to know the amount of memory, cpu, and gpu resources requested by each glidein job at the time of submission on the client?

hskarlupka commented 6 years ago
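
import socket
import time


def collect_metric(name, value, timestamp):
    # Send one metric to Graphite's plaintext listener (carbon, port 2003).
    # Line format: "<metric path> <value> <unix timestamp>\n"
    line = "%s %d %d\n" % (name, value, timestamp)
    sock = socket.create_connection(("localhost", 2003))
    try:
        # sendall() needs bytes on Python 3 and retries partial writes.
        sock.sendall(line.encode("ascii"))
    finally:
        sock.close()


def now():
    # Current Unix time in whole seconds.
    return int(time.time())


collect_metric("metric.name", 42, now())
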
hskarlupka commented 6 years ago

@gonzalomerino and I talked some more about this ticket yesterday. These are the conclusions we came to:

  1. Send metrics to graphite instead of ElasticSearch
  2. Calculate values on the client
  3. Send these metrics:
     - Number of jobs submitted
     - Number of jobs running
     - Number of jobs idle
     - Max time of idle job in queue
     - Avg time of idle job in queue
     - Min time of idle job in queue
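
A minimal sketch of how the values listed above might be calculated on the client. It assumes the queue has already been parsed into a list of dicts with a `status` field (`"idle"` or `"running"`) and a `queued_at` Unix timestamp; that record layout and the function name `summarize_queue` are hypothetical. It also assumes "jobs submitted" means the total number of jobs in the snapshot, which may not be the intended definition.

```python
import time


def summarize_queue(jobs, now=None):
    """Compute queue metrics from a snapshot of job records.

    jobs: list of dicts with "status" and "queued_at" keys (assumed layout).
    """
    if now is None:
        now = time.time()
    # Time each idle job has spent waiting in the queue, in seconds.
    idle_waits = [now - job["queued_at"] for job in jobs if job["status"] == "idle"]
    summary = {
        "jobs_submitted": len(jobs),
        "jobs_running": sum(1 for job in jobs if job["status"] == "running"),
        "jobs_idle": len(idle_waits),
    }
    # Idle-time statistics are only defined when at least one job is idle.
    if idle_waits:
        summary["idle_time_max"] = max(idle_waits)
        summary["idle_time_avg"] = sum(idle_waits) / len(idle_waits)
        summary["idle_time_min"] = min(idle_waits)
    return summary
```

Omitting the idle-time keys when nothing is idle avoids sending a misleading zero; whether a gap or a zero is preferable in graphite is a separate choice.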
dsschult commented 6 years ago

Note that if you're sending metrics to graphite, you need to be aware of the binning and how that gets handled. I've seen some really ugly corner cases happen.
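
To illustrate one such corner case: as far as I understand, graphite's whisper storage keeps a single value per retention interval, so two samples that land in the same interval do not add up; the later write replaces the earlier one. The sketch below only simulates that behaviour in plain Python. The 60-second step and both function names are assumptions for illustration, not graphite APIs.

```python
def bin_start(timestamp, step=60):
    """Floor a Unix timestamp to the start of its retention interval."""
    return timestamp - (timestamp % step)


def last_write_wins(samples, step=60):
    """Simulate storing (timestamp, value) samples one value per interval."""
    bins = {}
    for timestamp, value in samples:
        # A later sample in the same interval overwrites the earlier one.
        bins[bin_start(timestamp, step)] = value
    return bins


# Two clients report 20 seconds apart, inside the same 60-second interval.
samples = [(1200, 5), (1220, 9), (1260, 2)]
stored = last_write_wins(samples)
```

Here the value 5 reported at 1200 is lost, which matters if several clients write to the same metric path more often than the retention step.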

Also, I'd point to statsd as a nice intermediary, especially for its gauge metric type.
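
For reference, a minimal sketch of sending a gauge to statsd over UDP using its plain-text line format `<name>:<value>|g`. The host, the default port 8125, the metric name, and the helper names are assumptions; a real deployment would more likely use an existing statsd client library.

```python
import socket


def format_gauge(name, value):
    """Format a statsd gauge line: "<name>:<value>|g"."""
    return "%s:%s|g" % (name, value)


def send_gauge(name, value, host="localhost", port=8125):
    """Send one gauge to a statsd daemon over UDP (fire and forget)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_gauge(name, value).encode("ascii"), (host, port))
    finally:
        sock.close()


# Example call (hypothetical metric name):
# send_gauge("glidein.jobs_idle", 3)
```

A gauge holds the last value it was given until it is set again, which suits queue counts that are sampled rather than accumulated.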