Closed gcamori closed 5 years ago
Hi @gcmroi, to be honest I wasn't thinking about it. All the scripts I have have nothing to monitor, they are usually small and fast.
Could you describe what kind of scripts you have and what kind of monitoring you would like to have for them?
@bugy: my scripts perform a series of complex tasks, from collecting news and social media posts to analyzing them using NLP.
Your script-server is a great launch pad to ensure they can be started and stopped from one interface, but I'd love to add a monitoring interface to each script to visualize their performance without having to resort to DataDog or a similar external tool.
Integrating Prometheus or a similar open source system, for example, would be a good first step, but I have no experience with it and am not finding any literature that explains how to integrate it with a Tornado app.
Cheers!
Hi @gcmroi, thanks for the detailed explanation! I'm not sure I'd like to implement monitoring in Script server for now. But I'll have a look at how it could be integrated with a monitoring tool. Do you think Prometheus would be a good option to try?
@bugy: no worries. Happy to add some context!
I would be happy to implement the integration myself, but I don't have experience with Tornado (my startup uses CherryPy for WSGI apps) and there's very little material out there, other than for Tornado/Flask-Django integrations.
Prometheus seems a very solid option, but I/we have not used it yet.
GC
Perfect, I'll have a look tomorrow or over the weekend.
Great. Thank you!
Hi @gcmroi, from what I've found, Prometheus scrapes metrics data from an exposed web endpoint. And there are 2 main ways to provide these metrics:
- PushGateway: this gateway stores metrics data in memory and provides it to the Prometheus scraper on request. Applications have to push to this gateway explicitly when they want to record some metrics
- Exporter: this is a standalone server, which is responsible for gathering/calculating/updating/storing metrics. It also exposes these metrics via a web endpoint and the scraper just collects them.
So there should be an application, which either pushes metrics to PushGateway or runs a webserver and exposes these metrics itself. In my opinion, this application could be standalone and not related to Script server: the latter could just notify this application about started/stopped processes. And then the application will collect/monitor process metrics and expose them. Do you think it makes sense?
Another thing/problem is the so-called "jobs" in Prometheus. All the metrics should be related to these predefined jobs. However, I'm confused here: in Script server you can run the same script multiple times (probably even simultaneously). How would these script executions be shown in a single job?
it is partially true :) prometheus can take all metrics from app which "expose" them on host:port/metrics for example
there is a library for that: https://prometheus.io/docs/instrumenting/clientlibs/
so you may use a library or just reply in "prometheus format" in http body
so when prometheus requests http://script-server.local:5000/metrics it will take all the data it gets in the reply and put it in its database
exporters are for software that don't "expose" metrics
i don't need metrics myself, just sharing
thanks
On Mon, Mar 25, 2019 at 11:00 AM Iaroslav Shepilov notifications@github.com wrote:
Hi @gcmroi https://github.com/gcmroi, from what I've found, Prometheus scrapes metrics data from an exposed web endpoint. And there are 2 main ways to provide these metrics:
- PushGateway: this gateway stores metrics data in memory and provides it to the Prometheus scraper on request. Applications have to push to this gateway explicitly when they want to record some metrics
- Exporter: this is a standalone server, which is responsible for gathering/calculating/updating/storing metrics. It also exposes these metrics via a web endpoint and the scraper just collects them.
Hi @yosefy, thanks for the clarification! Actually, I'm thinking about an exporter that collects data on the scripts itself, without script server being involved in the process. So the scripts are the "software that don't expose metrics" in this case. The reason I'm thinking about a completely separate metrics collector is that it has the same access to script processes as script server, so there is not much reason to couple them.
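As a rough illustration of what such a separate collector could do, the sketch below checks which script processes are still alive and counts them per script name. It assumes a POSIX system and assumes the collector already knows each execution's id, script name and pid; the function names and the data layout are hypothetical.

```python
import os
from collections import Counter
from typing import Dict, Tuple


def is_alive(pid: int) -> bool:
    """POSIX-only probe: signal 0 performs the existence check without signalling."""
    if pid <= 0:
        # Non-positive pids address process groups, not a single process.
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but is owned by another user.
        return True
    return True


def running_per_script(executions: Dict[str, Tuple[str, int]]) -> Dict[str, int]:
    """Count live processes per script; executions maps execution_id -> (script_name, pid)."""
    counts: Counter = Counter()
    for script_name, pid in executions.values():
        if is_alive(pid):
            counts[script_name] += 1
    return dict(counts)


if __name__ == "__main__":
    me = os.getpid()
    print(running_per_script({"1": ("nlp", me), "2": ("news", me)}))
```

Counting per script name is one possible answer to the question about simultaneous executions: the Prometheus job normally describes the scrape target (here, the collector), while individual scripts or executions can be told apart by labels on each metric.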
well for that exactly pushgateway exists IMHO so you don't need to
@bugy, @yosefy: thanks a lot. Your comments clarify things.
@bugy - you said:
Actually I'm thinking about an exporter, which will collect data on scripts itself, and script server won't be involved into this process. So the scripts are the "software that don't expose metrics" in this case.
That makes sense to me. I'll do some prototyping and report back for reference. All the best!
GC
Hi @gcmroi, I believe you would need some kind of trigger from the script server once any script is started. This trigger (either HTTP or some process invocation) should be quite simple to implement. If you need it, just say :)
@bugy: yes, that would be great! What would the most appropriate place be for that trigger in the script-server app, in your opinion? HTTP would seem the most straightforward approach.
Thanks!
Hi, I think it could be a part of script server configuration. For example:
...
"callbacks": {
  "notifyOnStart": true,
  "notifyOnFinish": true,
  "notificationParameters": ["pid", "execution_id", "script_name", "user"],
  "destinations": [
    {
      "type": "http",
      "url": "https://my_server.com/executions-listeners"
    },
    {
      "type": "script",
      "script": "echo"
    }
  ]
}
...
Thanks @bugy!
Done. @gcmroi, could you check if it's working for you, please?
notifyOnStart and notifyOnFinish are optional, default: true
notificationParameters is optional, default: ['execution_id', 'pid', 'script_name', 'user', 'exit_code'].
Each notification will additionally have an event_type field, which can be either 'execution_started' or 'execution_finished'
Specific details about destination types:
field: value
Below is my test configuration:
"callbacks": {
  "notifyOnStart": true,
  "notifyOnFinish": true,
  "notificationParameters": ["pid", "execution_id", "script_name", "user"],
  "destinations": [
    {
      "type": "email",
      "from": "buggygm@gmail.com",
      "to": "buggygm@gmail.com",
      "server": "smtp.gmail.com",
      "password": "$$EMAIL_PWD"
    },
    {
      "type": "http",
      "url": "localhost:5000/test_alerts"
    },
    {
      "type": "script",
      "command": "python3 ./samples/scripts/callback_test.py"
    }
  ]
}
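On the receiving side, a listener could keep track of running executions from these notifications. The sketch below is only an assumption about how that might look: it assumes the http destination delivers a JSON body containing event_type plus the configured notificationParameters, which this thread does not confirm, and the function name `apply_notification` is hypothetical.

```python
import json
from typing import Dict


def apply_notification(body: str, running: Dict[str, dict]) -> None:
    """Update the in-memory table of running executions from one notification.

    Assumes `body` is a JSON object with at least event_type and execution_id.
    """
    event = json.loads(body)
    execution_id = str(event["execution_id"])
    event_type = event["event_type"]

    if event_type == "execution_started":
        # Remember the fields that a metrics collector would need later.
        running[execution_id] = {
            "script_name": event.get("script_name"),
            "pid": event.get("pid"),
            "user": event.get("user"),
        }
    elif event_type == "execution_finished":
        # Ignore finish events for executions that were never seen.
        running.pop(execution_id, None)
    else:
        raise ValueError(f"unknown event_type: {event_type!r}")


if __name__ == "__main__":
    table: Dict[str, dict] = {}
    apply_notification(json.dumps({
        "event_type": "execution_started", "execution_id": 1,
        "pid": 4242, "script_name": "nlp", "user": "gc",
    }), table)
    print(table)
```

Keeping the state update in a plain function makes it easy to call from whatever web framework serves the listener URL, and to test without a running server.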
@bugy: I'll try it in the next day or so and report back. Thanks!
Are there any plans to add script monitoring to the server?
At the moment I use Datadog to monitor individual scripts, but it would be awesome to have some layer of monitoring integrated in the script-server.