Add external callbacks on script startup/shutdown

gcamori commented 5 years ago

Are there any plans to add script monitoring to the server?

At the moment I use Datadog to monitor individual scripts, but it would be awesome to have some layer of monitoring integrated in the script-server.

bugy commented 5 years ago

Hi @gcmroi, to be honest I wasn't thinking about it. All the scripts I have have nothing to monitor, they are usually small and fast.

Could you describe what kind of scripts you have and what kind of monitoring you would like to have for them?

gcamori commented 5 years ago

@bugy: my scripts perform a series of complex tasks, from collecting news and social media posts to analyzing them using NLP.

Your script-server is a great launch pad to ensure they can be started and stopped from one interface, but I'd love to add a monitoring interface to each script to visualize their performance without having to resort to DataDog or similar external tool.

Integrating Prometheus or similar open source system, for example, would be a good first step, but I have no experience with it and am not finding any literature that explains how to integrate it with a Tornado app.

Cheers!

bugy commented 5 years ago

Hi @gcmroi, thanks for the detailed explanation! I'm not sure if I'd like to implement monitoring in Script server for now. But I'll have a look at how it could be integrated with some monitoring tool. Do you think Prometheus would be a good option to try?

gcamori commented 5 years ago

@bugy: no worries. Happy to add some context!

I would be happy to implement the integration myself, but I don't have experience with Tornado (my startup uses CherryPy for WSGI apps) and there's very little material out there, other than for Tornado/Flask-Django integrations.

Prometheus seems a very solid option, but I/we have not used it yet.

GC

bugy commented 5 years ago

Perfect, I'll have a look tomorrow or over the weekend.

gcamori commented 5 years ago

Great. Thank you!

bugy commented 5 years ago

Hi @gcmroi, from what I've found, Prometheus works by scraping metrics data from an exposed web endpoint. And there are 2 main ways to provide these metrics:

PushGateway: this gateway stores metrics data in memory and provides it to the Prometheus scraper on request. Applications have to push to this gateway explicitly when they want to record some metrics.
Exporter: this is a standalone server, which is responsible for gathering/calculating/updating/storing metrics. It also exposes these metrics via a web endpoint and the scraper just collects them.

So there should be an application which either pushes metrics to PushGateway or runs a webserver and exposes these metrics itself. In my opinion, this application could be standalone and not related to Script server: the latter could just notify this application about started/stopped processes. And then the application will collect/monitor process metrics and expose them. Do you think it makes sense?

Another thing/problem is the so-called "jobs" in Prometheus. All the metrics should be related to these predefined jobs. However, I'm confused here: in Script server you can run the same script multiple times (probably even simultaneously). How would these script executions be shown in a single job?

yosefy commented 5 years ago

it is partially true :) prometheus can take all metrics from app which "expose" them on host:port/metrics for example

there is a library for that: https://prometheus.io/docs/instrumenting/clientlibs/

so you may use a library or just reply in "prometheus format" in http body

so when prometheus requests http://script-server.local:5000/metrics, it takes all the data from the reply and puts it in its database

exporters are for software that don't "expose" metrics
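
To make the "prometheus format" reply concrete, here is a minimal sketch of rendering one gauge in the Prometheus text exposition format using only the Python standard library. The metric name `script_running`, the helper names, and the `script_name`/`execution_id` labels are assumptions for illustration only; nothing here is part of script-server. Distinguishing executions by label is one possible way to show several runs of the same script under a single job, not a recommendation from this thread.

```python
def escape_label(value):
    """Escape a label value as required by the text exposition format."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def render_gauge(name, help_text, samples):
    """Render one gauge metric.

    `samples` is a list of (labels_dict, value) pairs; each pair becomes
    one line of the form name{label="value",...} value.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{escape_label(val)}"' for key, val in sorted(labels.items())
        )
        # Omit the braces entirely when a sample has no labels.
        series = f"{name}{{{label_str}}}" if label_str else name
        lines.append(f"{series} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample: one running execution of a script.
    print(render_gauge(
        "script_running",
        "Running script executions",
        [({"script_name": "collect_news", "execution_id": "12"}, 1)],
    ), end="")
```

The returned string would be served as the HTTP body of the `/metrics` endpoint; the official client libraries linked above produce the same format without hand-rolling it.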

i don't need metrics myself, just sharing

thanks


bugy commented 5 years ago

Hi @yosefy, thanks for the clarification! Actually I'm thinking about an exporter which will collect data on the scripts itself, so that script server won't be involved in this process. So the scripts are the "software that don't expose metrics" in this case. The reason why I'm thinking about a completely separate metrics collector is that it has the same access to script processes as script server. And there aren't many reasons to couple them.

yosefy commented 5 years ago

well for that exactly pushgateway exists IMHO so you don't need to


gcamori commented 5 years ago

@bugy, @yosefy: thanks a lot. Your comments clarify things.

@bugy - you said:

Actually I'm thinking about an exporter, which will collect data on scripts itself, and script server won't be involved into this process. So the scripts are the "software that don't expose metrics" in this case.

That makes sense to me. I'll do some prototyping and report back for reference. All the best!

GC

bugy commented 5 years ago

Hi @gcmroi, I believe you would need some kind of trigger from the script server once any script is started. This trigger (either HTTP or some process invocation) should be quite simple to implement. If you need it, just say so :)

gcamori commented 5 years ago

@bugy: yes, that would be great! What would the most appropriate place be for that trigger in the script-server app, in your opinion? HTTP would seem the most straightforward approach.

Thanks!

bugy commented 5 years ago

Hi, I think it could be a part of script server configuration. For example:

...
"callbacks": {
    "notifyOnStart": true,
    "notifyOnFinish": true,
    "notificationParameters": ["pid", "execution_id", "script_name", "user"],
    "destinations": [
        {
            "type": "http",
            "url": "https://my_server.com/executions-listeners"
        },
        {
            "type": "script",
            "script": "echo"
        }
    ]
}
...

gcamori commented 5 years ago

Thanks @bugy!

bugy commented 5 years ago

Done. @gcmroi, could you check if it's working for you, please?

notifyOnStart and notifyOnFinish are optional, default: true
notificationParameters is optional, default: ['execution_id', 'pid', 'script_name', 'user', 'exit_code']

Each notification will additionally have an event_type field, which can be either 'execution_started' or 'execution_finished'.

Specific details about destination types:

email: the notification is sent via email, with all the fields in the form of field: value
http: the notification is sent via HTTP POST call, with json body of keys/values
script: the command specified in the configuration is called with the values as positional parameters (each position corresponds to the field's position in notificationParameters). Additionally, there are environment variables for each key/value.
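
For the http destination, a receiving side could look roughly like the sketch below. It uses only the Python standard library and relies on what is described above: a POST with a JSON body of keys/values, including event_type. The `running` registry, the `handle_event` helper, the handler class name, and port 8000 are hypothetical choices made for this example, not part of script-server.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory registry: execution_id -> notification received on start.
running = {}


def handle_event(event):
    """Track executions using the event_type field of a notification."""
    execution_id = event.get("execution_id")
    if event.get("event_type") == "execution_started":
        running[execution_id] = event
    elif event.get("event_type") == "execution_finished":
        running.pop(execution_id, None)


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes announced by the client.
        length = int(self.headers.get("Content-Length", 0))
        try:
            event = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            event = None
        if not isinstance(event, dict):
            self.send_response(400)  # body was not a JSON object
            self.end_headers()
            return
        handle_event(event)
        self.send_response(204)  # accepted, nothing to return
        self.end_headers()

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


if __name__ == "__main__":
    # Port 8000 is an arbitrary choice for this sketch.
    HTTPServer(("localhost", 8000), CallbackHandler).serve_forever()
```

A separate metrics collector could then use the pid of each entry in `running` to gather process statistics and expose them to Prometheus.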

Below is my test configuration:

  "callbacks": {
    "notifyOnStart": true,
    "notifyOnFinish": true,
    "notificationParameters": ["pid", "execution_id", "script_name", "user"],
    "destinations": [
      {
        "type": "email",
        "from": "buggygm@gmail.com",
        "to": "buggygm@gmail.com",
        "server": "smtp.gmail.com",
        "password": "$$EMAIL_PWD"
      },
      {
        "type": "http",
        "url": "localhost:5000/test_alerts"
      },
      {
        "type": "script",
        "command": "python3 ./samples/scripts/callback_test.py"
      }
    ]
  }
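
For the script destination, a callback script might read its positional parameters as in the sketch below. This is not the actual samples/scripts/callback_test.py; it is a hypothetical stand-in. It assumes the positional arguments arrive in the same order as notificationParameters in the configuration above. Whether event_type is also passed positionally, and what the environment variables are named, is not stated in this thread, so the sketch does not rely on either.

```python
import sys

# Assumed to mirror notificationParameters from the test configuration above.
FIELDS = ["pid", "execution_id", "script_name", "user"]


def parse_callback_args(args, fields=FIELDS):
    """Pair positional arguments with field names, in configuration order.

    Extra arguments beyond the known fields are ignored; missing ones are
    simply absent from the result.
    """
    return dict(zip(fields, args))


if __name__ == "__main__":
    event = parse_callback_args(sys.argv[1:])
    for key, value in event.items():
        print(f"{key}: {value}")
```

If notificationParameters is changed in the configuration, `FIELDS` has to be kept in sync by hand, which is one reason the http destination with its named JSON keys may be easier to consume.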

gcamori commented 5 years ago

@bugy: I'll try it in the next day or so and report back. Thanks!

bugy / script-server

Add external callbacks on script startup/shutdown #200