
CGRU - AFANASY
http://cgru.info/
GNU Lesser General Public License v3.0

Track peak memory usage and average CPU usage #400

Closed lithorus closed 3 years ago

lithorus commented 6 years ago

I thought about adding a new data field to tasks for collecting the average CPU usage and peak memory usage. When running multiple jobs on a single node, it can help calculate the efficiency and relative render time. Generally really good for debugging.

I was thinking about getting the data within the task log parser, which makes it more flexible. The data can be gathered by e.g. prepending /usr/bin/time -f "%P %M" to the command on Linux. Not sure about other OSes, but I would imagine they have similar tools.
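
For illustration, a minimal Python sketch of the /usr/bin/time approach (the render command is a placeholder; assumes GNU time, whose -f "%P %M" prints the average CPU percentage and peak RSS in kilobytes to stderr):

import subprocess

# Wrap the task command with GNU time to capture average CPU (%P)
# and peak resident set size in KB (%M). "render" is a placeholder command.
cmd = ["render", "--frame", "1"]
wrapped = ["/usr/bin/time", "-f", "%P %M"] + cmd
result = subprocess.run(wrapped, capture_output=True, text=True)

# GNU time appends its format string to stderr after the command's own output.
cpu_percent, peak_kb = result.stderr.strip().splitlines()[-1].split()
print("avg CPU:", cpu_percent, "peak RSS (KB):", peak_kb)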

timurhai commented 6 years ago

Hi. I have been thinking about resources gathering and storing. Gathering should be flexible and can vary. Storing is more concrete, but it should be flexible too. There should be a way to store anything, not only MEM and CPU. So the consumed resources parameter should be a string, I think. The user will gather resources in their own way and fill some string fields in their own way. For example: "CPU=99;MEM=55", or maybe JSON: {"cpu":99,"mem":55}. And the user should decide what to do with their own formatted string.
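
A minimal Python sketch of that free-form string idea (the keys are examples, not a fixed schema):

import json

# The gatherer fills whatever keys it wants and serializes them...
resources = {"cpu": 99, "mem": 55}
resources_str = json.dumps(resources)  # '{"cpu": 99, "mem": 55}'

# ...and a consumer (GUI, statistics job) decodes it in its own way.
parsed = json.loads(resources_str)
peak_mem = parsed.get("mem")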

lithorus commented 6 years ago

If it's JSON, I think Postgres has some features to use the data as rows.
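
For example (a hedged sketch; the table and column names are hypothetical), Postgres can extract JSON fields as regular columns with the ->> operator:

import psycopg2

# Hypothetical table "task_statistics" with a jsonb column "resources".
conn = psycopg2.connect("dbname=afanasy")  # placeholder connection string
cur = conn.cursor()
cur.execute("""
    SELECT task_id,
           (resources->>'cpu')::float AS cpu,
           (resources->>'mem')::float AS mem
    FROM task_statistics
""")
for task_id, cpu, mem in cur.fetchall():
    print(task_id, cpu, mem)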

sebastianelsner commented 6 years ago

Just to throw the idea out there: we are internally using "InfluxDB" and a custom parser.py that uses "psutil" to store this info per task in the Influx server. It works pretty well, but it is heavily experimental at the moment.
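
Not the actual parser.py, but a rough sketch of that combination (host, database, and measurement names are made up; uses psutil and the influxdb-python client):

import psutil
from influxdb import InfluxDBClient

client = InfluxDBClient(host="influx.example.com", port=8086, database="farm")  # hypothetical

def sample_task(pid, task_name):
    # One memory sample for the task's process, pushed as a time-series point.
    rss = psutil.Process(pid).memory_info().rss
    client.write_points([{
        "measurement": "task_memory",
        "tags": {"task": task_name},
        "fields": {"rss_mb": rss / (1024 * 1024)},
    }])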

lithorus commented 6 years ago

Yes, InfluxDB is perfect for this, but it is also a time-series database, which is not ideal for general job statistics.

If the parser class has info about the pid of the task, the total memory usage can be collected/calculated while the process is running and stored in an InfluxDB instance.

lithorus commented 6 years ago

Also, if it's done by the parser, it can prepend the stats to each line, which would be REALLY useful.

lithorus commented 6 years ago

I wrote a small example parser:

# -*- coding: utf-8 -*-

from parsers import parser
import psutil
from datetime import datetime, timedelta

class memtest(parser.parser):
    def __init__(self):
        parser.parser.__init__(self)
        self.lastMemUpdate = None  # time of the last psutil sampling
        self.totalMem = 0   # RSS of the task process and its children, bytes
        self.totalSwap = 0  # swap of the task process and its children, bytes

    def do(self, data, mode):
        # Re-sample memory at most every 5 seconds to keep the parser cheap.
        if self.pid != 0:
            if self.lastMemUpdate is None or datetime.now() - self.lastMemUpdate > timedelta(seconds=5):
                self.lastMemUpdate = datetime.now()
                try:
                    p = psutil.Process(self.pid)
                    # Sum memory over the parent and all of its descendants.
                    parentMemInfo = p.memory_full_info()
                    self.totalMem = parentMemInfo.rss
                    self.totalSwap = parentMemInfo.swap
                    for child in p.children(recursive=True):
                        childMemInfo = child.memory_full_info()
                        self.totalMem += childMemInfo.rss
                        self.totalSwap += childMemInfo.swap
                except psutil.NoSuchProcess:
                    # The task (or a child) exited between samples; keep the last values.
                    pass
        # Prepend the current numbers to every non-empty output line.
        output = []
        for line in data.replace("\r", "\n").split("\n"):
            if len(line):
                output.append("[MEM:{:_>7.0f}MB,SWAP:{:_>7.0f}MB]{}".format(
                    self.totalMem / (1024 * 1024), self.totalSwap / (1024 * 1024), line))
        return "\n".join(output)

Tested it on the V-Ray benchmark and got the following result:

[MEM:____349MB,SWAP:______0MB]Starting V-Ray Benchmark...
[MEM:____349MB,SWAP:______0MB]AMD Ryzen 5 1600 Six-Core Processor, # of logical cores: 12
[MEM:____349MB,SWAP:______0MB]NVIDIA driver version: 384.111
[MEM:____349MB,SWAP:______0MB]Ubuntu 17.10
[MEM:____349MB,SWAP:______0MB]V-Ray 3.57.01
[MEM:____349MB,SWAP:______0MB]Preparing to render on CPU...
[MEM:____349MB,SWAP:______0MB]Now rendering...
[MEM:___1075MB,SWAP:______0MB]Rendered 0%
[MEM:___1075MB,SWAP:______0MB]Rendered 0%
[MEM:___1075MB,SWAP:______0MB]Rendered 0%
[MEM:___1075MB,SWAP:______0MB]Rendered 0%
[MEM:___1075MB,SWAP:______0MB]Rendered 1%
[MEM:___1075MB,SWAP:______0MB]Rendered 1%
[MEM:___1075MB,SWAP:______0MB]Rendered 1%
[MEM:___1075MB,SWAP:______0MB]Rendered 1%
[MEM:___1075MB,SWAP:______0MB]Rendered 2%
[MEM:___1075MB,SWAP:______0MB]Rendered 2%
[MEM:___1075MB,SWAP:______0MB]Rendered 2%
[MEM:___1075MB,SWAP:______0MB]Rendered 2%
[MEM:___1075MB,SWAP:______0MB]Rendered 2%
[MEM:___1086MB,SWAP:______0MB]Rendered 3%
[MEM:___1086MB,SWAP:______0MB]Rendered 3%
[MEM:___1086MB,SWAP:______0MB]Rendered 3%
[MEM:___1086MB,SWAP:______0MB]Rendered 3%
[MEM:___1086MB,SWAP:______0MB]Rendered 4%
[MEM:___1086MB,SWAP:______0MB]Rendered 4%
[MEM:___1086MB,SWAP:______0MB]Rendered 4%
[MEM:___1086MB,SWAP:______0MB]Rendered 5%
[MEM:___1086MB,SWAP:______0MB]Rendered 5%
[MEM:___1086MB,SWAP:______0MB]Rendered 6%
[MEM:___1086MB,SWAP:______0MB]Rendered 6%
[MEM:___1086MB,SWAP:______0MB]Rendered 6%
[MEM:___1091MB,SWAP:______0MB]Rendered 6%
[MEM:___1091MB,SWAP:______0MB]Rendered 7%
[MEM:___1091MB,SWAP:______0MB]Rendered 7%

The reason for the underscores was to make it align better in e.g. afwatch.

timurhai commented 3 years ago

Is it really necessary to track mem and CPU usage occupied by the task only? Afrender already knows the entire machine's resources. Machine resources ~ task resources in most cases, which covers most of what is needed on a render farm. Machine resources != task resources when the machine is used by some other process or user, or when afrender runs several tasks at once. But when we want to measure resources, it is usually for a heavy task: no one is logged in, and it is the only task on the machine.

lithorus commented 3 years ago

Yes, it is to help debug why e.g. a particular frame is heavy. Sometimes the mem usage can spike in the middle of the render (e.g. translating geometry). Also, when rendering Nuke comps we're running multiple tasks at the same time, because Nuke is so bad at threading.

timurhai commented 3 years ago

For now, the resources string can be collected somehow. It will be passed to afserver with the task update. The server will dispatch it to GUIs with the task progress, and push it to the DB tasks statistics table.

I think that render should collect some resources and then pass them to the parser. The parser can add something to them that render can't, for example a total triangle count or so. Later it goes to statistics and GUIs. GUIs should display it somehow (we should think that up). There should be a fast way to find out peak memory (max over all job tasks and for each task) and CPU utilization (average over job tasks and for each task); see the sketch below.
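
A minimal sketch of those two aggregations, assuming each task's resources string is stored as JSON like {"cpu": 87, "mem": 1091} (keys and values are examples):

import json

# Resources strings as stored per task (example data).
task_resources = ['{"cpu": 99, "mem": 1075}', '{"cpu": 87, "mem": 1091}']
parsed = [json.loads(s) for s in task_resources]

job_peak_mem = max(t["mem"] for t in parsed)               # max over all job tasks
job_avg_cpu = sum(t["cpu"] for t in parsed) / len(parsed)  # average over job tasks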

timurhai commented 3 years ago

I think that a dict() should be passed to the parse function. That way, we would not have to fix all parser classes on each parameter addition.
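
A hypothetical sketch of that dict idea (the function and key names are illustrative, not the actual afanasy API):

# afrender fills the keys it knows; the parser adds keys only it can see.
# Adding a new parameter later changes no function signatures.
def parse(data, resources):
    # resources arrives pre-filled by the render,
    # e.g. {"cpu_avg": 87.5, "mem_peak_mb": 1091}
    resources["triangles"] = 1234567  # parser-only knowledge (made-up value)
    return data

resources = {"cpu_avg": 87.5, "mem_peak_mb": 1091}
parse("Rendered 7%", resources)
print(resources)  # {'cpu_avg': 87.5, 'mem_peak_mb': 1091, 'triangles': 1234567}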