CGRU / cgru

CGRU - AFANASY
http://cgru.info/
GNU Lesser General Public License v3.0

Capturing events, triggering callbacks #447

Closed qbadev closed 5 years ago

qbadev commented 5 years ago

I use Af for a completely automated remote render farm and I need custom error/complete reporting on each task/job. I searched the Internet extensively for a detailed solution (I am not a pro in Python or C++), but couldn't find any. I also read the Afanasy Server Events section of the CGRU docs, but it is not a detailed manual. I did some quick research through the py and cpp code on GitHub. I still don't know how events work (I have only general knowledge), how to capture them, or how to assign a callback.

In detail I would like to know:

timurhai commented 5 years ago

Hi. If a job or user has custom data with an "events" object, and that object contains an entry for the event type, the event will be triggered. There is an example of event custom data here: http://cgru.info/afanasy/server#events. An event is an ordinary task with the event's custom JSON data as its command. Any service can do anything with a command. The events service is designed not to run the task command, but to read JSON data from it and run any other command if needed. An example events service is here: https://github.com/CGRU/cgru/blob/master/afanasy/python/services/events.py So events (like any other Afanasy service) need some Python knowledge for customization.

P.S. Yes, detailed documentation is lacking for events, and not only for events. Later I plan to create two documentation sites. One is just for introduction/news/downloads/simple install. The other will contain detailed documentation.

qbadev commented 5 years ago

How do I add custom data to a Job? I did not find it in the docs. Where do I attach this JSON? Where can I find a list of available events? The docs mention only two: JOB_ERROR and JOB_DONE.

timurhai commented 5 years ago

You can add custom data on creation or via the JSON protocol. For a user, you can also add custom data from the GUI (Web and Qt). User, Job and Task custom data objects will be merged. For now there are only 2 events: JOB_ERROR and JOB_DONE.

If you want to monitor every task error (any error on the farm), events are not suitable for this. There is no ready solution for it. It is better to write something like the WebGUI: it shows job and task states, so it knows about errors, but it uses "monitor" events, which are designed for GUIs.

qbadev commented 5 years ago

I create Jobs via Python API. Is it possible to manage events this way as well?

JOB_ERROR is the most important to me, but I asked for other events just out of curiosity.

I don't need an event when a Job is done, as I add a confirmation block to each Job, so my application gets to know about finished jobs. This approach does not work in case of errors, so I need to implement some event capturing.

Edit: I looked through CPP code and found another event JOB_DELETED, triggered the same way as JOB_DONE or JOB_ERROR

timurhai commented 5 years ago

There is no special Python API function for it. But you can do the same as for any other parameter: https://github.com/CGRU/cgru/blob/master/afanasy/python/af.py#L702

self.data["description"] = value

So you can set job data["custom_data"] parameter manually.

Why can't the "done" approach work with "error"?

JOB_DELETED was written much later (other needs/developers) and was not documented.

qbadev commented 5 years ago

Ok, so if I have a Job object, I can use data as it is not protected. Fine, that makes it quite clear.

So, my code for JOB_ERROR reporting should look like below:

Job.data['custom_data'] = {
    "email":"some@email.com",
    "events":
    {
        "JOB_ERROR":{"methods":["email"]},
        "JOB_DONE":{"methods":["email"]}
    }
}

I wrote that I don't need DONE, as I add a finishing block to each Job. This finishing block, named Confirmation, checks whether the render was fine and informs my application that the Job is done.

timurhai commented 5 years ago

Yes. But "custom_data" is a string parameter. You can't pass a dict() to it. You should encode your custom data into JSON and assign the resulting string.
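A minimal sketch of that encoding step, using only the standard `json` module. The dict repeats the custom data proposed earlier in the thread; the `job` object in the final comment is assumed to be an `af.Job` instance created elsewhere and is not constructed here.

```python
import json

# Custom data as a regular Python dict (same content as proposed in the thread).
custom_data = {
    "email": "some@email.com",
    "events": {
        "JOB_ERROR": {"methods": ["email"]},
        "JOB_DONE": {"methods": ["email"]},
    },
}

# "custom_data" is a string parameter, so serialize the dict to JSON text first.
custom_data_json = json.dumps(custom_data)
print(custom_data_json)

# With an af.Job instance named `job` (assumed, not created here):
# job.data["custom_data"] = custom_data_json
```

Building the dict first and serializing it avoids quoting mistakes that are easy to make when typing the JSON string by hand.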

qbadev commented 5 years ago

Thanks, that sounds great. After some tests I will post example code.

timurhai commented 5 years ago

You can begin tests by print()-ing the task command in your custom events.py. (And I should put this in the detailed documentation.) After this, things will become clearer.
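As a rough illustration of that first test, the helper below only parses the JSON text carried as the task command and prints its top-level keys. It is a standalone sketch, not the actual events.py service class: the function name `describe_event_command` and the sample payload are hypothetical, and the real layout of the command is whatever afserver sends.

```python
import json


def describe_event_command(command):
    """Parse the JSON text of a task command and return printable lines."""
    data = json.loads(command)
    if not isinstance(data, dict):
        # Not a JSON object: show it as a single line.
        return [json.dumps(data, sort_keys=True)]
    # One "key: value" line per top-level key, in a stable order.
    return ["%s: %s" % (key, json.dumps(data[key], sort_keys=True))
            for key in sorted(data)]


# Hypothetical payload for illustration only, not the real server format.
sample_command = '{"events": ["JOB_ERROR"], "custom_data": {"email": "some@email.com"}}'

for line in describe_event_command(sample_command):
    print(line)
```

Printing the parsed command before acting on it shows exactly which fields are available to a custom handler.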

qbadev commented 5 years ago

Do You suggest using /opt/cgru/afanasy/python/services/events.py as a base for my custom_events.py? I guess I should replace the code that sends email with my custom code, and also change the names of the methods.

I prepared a simple Python script that sends a job to AfServer and sets events for JOB_ERROR and JOB_DONE. I didn't send any email; I suppose I don't have my email service configured at all. Paths and imports are native to a Debian environment.

#!/usr/bin/env python
import sys, os

# Point Python at the CGRU installation (Debian default paths).
os.environ['CGRU_LOCATION'] = '/opt/cgru'
sys.path.append('/opt/cgru/afanasy/python')
import af

job = af.Job('Test - events')

# A single-task block running a trivial command with the "system" service.
b = af.Block('test', 'system')
b.setCommand('ls -la')
b.setNumeric(1, 1, 1)
b.setErrorsTaskSameHost(1)
job.blocks.append(b)

# custom_data must be a JSON string, not a dict.
job.data['custom_data'] = '{"email":"some@email.com","events":{"JOB_ERROR":{"methods":["email"]},"JOB_DONE":{"methods":["email"]}}}'
job.send()

timurhai commented 5 years ago

I think there is no reason to inherit that class. It's just an example. You should rewrite it.

It is better to copy it to some custom_events.py and set the parameter at https://github.com/CGRU/cgru/blob/master/afanasy/config_default.json#L188 to "af_sysjob_events_service":"custom_events.py"

You can also create a custom_events.png icon for the GUIs here: https://github.com/CGRU/cgru/tree/master/icons/software

timurhai commented 5 years ago

Also, it is better not to modify cgru/afanasy/config_default.json but to create cgru/config.json and place only the changed parameters there.

qbadev commented 5 years ago

I already use a customized config.json in /home/user/.cgru/. I never modified config_default. I created events_custom.py and set the config for the server as You suggested. I will report back after tests. But either way, thanks in advance.

qbadev commented 5 years ago

It does not work; it still tries to send emails.

I think I misunderstood the point. Where should this config be applied? To the server, to the user client machine requesting the job, or maybe to each AfRender machine (as it is not known which node will run the event task)?

timurhai commented 5 years ago

That config directive is for afserver. You should ensure that afserver reads that config, and restart afserver. You can check your system job block services from the GUIs. You should also ensure that your farm.json can run custom_events on some renders.

qbadev commented 5 years ago

It seems I had no config.json in the cgru folder. I thought it was enough to have config.json in my user home folder. After some tests it looks like events are not run on the server machine, but on the machine that sent the job request. But it may also be random, as I have 3 render nodes (one of the render nodes is also the server).

timurhai commented 5 years ago

The result of an event is a task in a system job. A system job distributes its tasks like any other job. So, for example, you can set a specific hosts mask on the events block if you want events to be executed on a specific machine. And/or you can configure the events service in farm.json.

qbadev commented 5 years ago

How do I use farm.json? I couldn't find it in the docs... Only farm_example.json serves as a kind of manual, but it is not sufficient.

How to add hosts mask to event task? I already tried self.taskInfo['hosts'].append('hostname') and self.taskInfo['hosts_mask'] = 'hostname' in events.py, but without success. I found out that 'hosts' is for other purpose, and 'hosts_mask' is not implemented in service.py at all.

timurhai commented 5 years ago

Hi. You can forget about farm.json, as in 2.4.x farm setup will be done via pools and the old farm setup will no longer be used. Users, jobs and job blocks each have a hosts mask; you can manipulate this parameter from the GUIs.

qbadev commented 5 years ago

I don't use a GUI at all, so I need another way, via Python or a shell command, or maybe I will have to forget about it. Where can I set up the afadmin user? Is there any config file for this user?

timurhai commented 5 years ago

Any GUI uses the JSON protocol, and the Python API does too. There is only one way to affect the Afanasy server: the JSON protocol. You can use the GUI log to see what it sends to change something, then send the same JSON in any other way.

There are also JSON examples here: https://github.com/CGRU/cgru/tree/master/examples/json One example from there: https://github.com/CGRU/cgru/blob/master/examples/json/action_jobs_params.json A '-' character before a name is used as a comment.

To change the hosts_mask of the system job you can use its ID, which equals 1 for the system job:

{
"action":
{
    "user_name"  : "jimmy",
    "host_name"  : "pc01",
    "type"       : "jobs",
    "ids"        : [1],
    "params"     :
    {
        "hosts_mask"  : "render.*"
    }
}
}

timurhai commented 5 years ago

Any user manipulation is the same.

qbadev commented 5 years ago

If I get You right, I can simply do getJobInfo via Python with Job ID = 1, then iterate through blocks finding block named events, add hosts_mask to this block, and then what? How to save it to server? Is it OK to send JSON command to server using afnetwork.py?

Will give a try to GUI soon. Thanks for all Your support.

timurhai commented 5 years ago

Yes, you can send data to server using afnetwork.py

qbadev commented 5 years ago

Finally it works, but for the documentation: the fields user_name, host_name, type, and ids or mask must not be empty in the input JSON. Also, the input JSON must be a one-liner (I tried multi-line JSON and it failed). It is also good practice to check af_server.log.
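The constraints above can be sketched with the standard `json` module. The helper name `build_jobs_action` is hypothetical, and it only covers the `ids` variant, not `mask`. It builds the same action shown earlier in the thread and returns it as a single line; sending it to the server (for example through afnetwork.py, as confirmed above) is left out because the exact call is not shown in this thread.

```python
import json


def build_jobs_action(user_name, host_name, ids, params):
    """Return a one-line JSON string describing a "jobs" action."""
    # The thread reports that these fields must not be empty.
    if not user_name or not host_name:
        raise ValueError("user_name and host_name must not be empty")
    if not ids:
        raise ValueError("ids must not be empty")
    action = {
        "user_name": user_name,
        "host_name": host_name,
        "type": "jobs",
        "ids": list(ids),
        "params": dict(params),
    }
    # json.dumps without an indent argument emits no newlines.
    return json.dumps({"action": action})


# ID 1 is the system job, per the thread; names are the example values.
message = build_jobs_action("jimmy", "pc01", [1], {"hosts_mask": "render.*"})
print(message)
```

Generating the string with `json.dumps` rather than writing it by hand keeps it on one line and guarantees valid JSON.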

@timurhai Thank You for all Your patience and effort to help me. Result is outstanding, my render farm can still be remote and fully automated, controlled via CLI and Python API, and all credits go to Afanasy.

timurhai commented 5 years ago

Great!