influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License

[inputs.php-fpm] Adding full metrics #14421

Closed anthosz closed 11 months ago

anthosz commented 11 months ago

Use Case

Add support for per-process monitoring (the full status page) to the php-fpm input plugin.

This would provide metrics for tracking memory, CPU, and other resource consumption per request, URI, or script.

Expected behavior

    phpfpm:
        tags:
            pool
            url
            user
            request uri
            request method
            script
        fields:
            state
            start time (optional)
            start since (optional)
            requests
            request duration
            content length
            last request cpu
            last request memory

Actual behavior

    phpfpm:
        tags:
            pool
            url
        fields:
            accepted_conn
            listen_queue
            max_listen_queue
            listen_queue_len
            idle_processes
            active_processes
            total_processes
            max_active_processes
            max_children_reached
            slow_requests

Additional info

The full status in JSON format is available at the path 127.0.0.1/fpm-status?full&json

Related to previous closed ticket: https://github.com/influxdata/telegraf/issues/5737

powersj commented 11 months ago

Hi,

Is this something you plan on contributing?

If not, could you share what the output of that URL looks like, to help whoever implements this? We would also need to consider whether a different metric name makes sense, to distinguish the current metrics from any new things we collect.

anthosz commented 11 months ago

Hi @powersj,

Thank you, so fast!

I can help with testing if needed, but I can't really help with the Go code :/

Here is a template of the output:

{
  "pool": "www",
  "process manager": "static",
  "start time": 1702044927,
  "start since": 4901,
  "accepted conn": 3879,
  "listen queue": 0,
  "max listen queue": 0,
  "listen queue len": 0,
  "idle processes": 9,
  "active processes": 1,
  "total processes": 10,
  "max active processes": 3,
  "max children reached": 0,
  "slow requests": 0,
  "processes": [
    {
      "pid": 583,
      "state": "Running",
      "start time": 1702044927,
      "start since": 4901,
      "requests": 386,
      "request duration": 159,
      "request method": "GET",
      "request uri": "/fpm-status?json&full",
      "content length": 0,
      "user": "-",
      "script": "-",
      "last request cpu": 0,
      "last request memory": 0
    },
    {
      "pid": 584,
      "state": "Idle",
      "start time": 1702044927,
      "start since": 4901,
      "requests": 390,
      "request duration": 174,
      "request method": "GET",
      "request uri": "/fpm-status",
      "content length": 0,
      "user": "-",
      "script": "-",
      "last request cpu": 0,
      "last request memory": 2097152
    },
    {
      "pid": 585,
      "state": "Idle",
      "start time": 1702044927,
      "start since": 4901,
      "requests": 389,
      "request duration": 9530,
      "request method": "GET",
      "request uri": "/index.php",
      "content length": 0,
      "user": "-",
      "script": "script.php",
      "last request cpu": 104.93,
      "last request memory": 2097152
    },
    {
      "pid": 586,
      "state": "Idle",
      "start time": 1702044927,
      "start since": 4901,
      "requests": 399,
      "request duration": 127,
      "request method": "GET",
      "request uri": "/ping",
      "content length": 0,
      "user": "-",
      "script": "-",
      "last request cpu": 0,
      "last request memory": 2097152
    },
    {
      "pid": 587,
      "state": "Idle",
      "start time": 1702044927,
      "start since": 4901,
      "requests": 382,
      "request duration": 9713,
      "request method": "GET",
      "request uri": "/index.php",
      "content length": 0,
      "user": "-",
      "script": "script.php",
      "last request cpu": 0,
      "last request memory": 2097152
    },
    {
      "pid": 588,
      "state": "Idle",
      "start time": 1702044927,
      "start since": 4901,
      "requests": 383,
      "request duration": 133,
      "request method": "GET",
      "request uri": "/ping",
      "content length": 0,
      "user": "-",
      "script": "-",
      "last request cpu": 0,
      "last request memory": 2097152
    },
    {
      "pid": 589,
      "state": "Idle",
      "start time": 1702044927,
      "start since": 4901,
      "requests": 381,
      "request duration": 154,
      "request method": "GET",
      "request uri": "/fpm-status?json",
      "content length": 0,
      "user": "-",
      "script": "-",
      "last request cpu": 0,
      "last request memory": 2097152
    },
    {
      "pid": 590,
      "state": "Idle",
      "start time": 1702044927,
      "start since": 4901,
      "requests": 397,
      "request duration": 108,
      "request method": "GET",
      "request uri": "/ping",
      "content length": 0,
      "user": "-",
      "script": "-",
      "last request cpu": 0,
      "last request memory": 2097152
    },
    {
      "pid": 591,
      "state": "Idle",
      "start time": 1702044927,
      "start since": 4901,
      "requests": 381,
      "request duration": 9068,
      "request method": "GET",
      "request uri": "/index.php",
      "content length": 0,
      "user": "-",
      "script": "script.php",
      "last request cpu": 110.28,
      "last request memory": 2097152
    },
    {
      "pid": 592,
      "state": "Idle",
      "start time": 1702044927,
      "start since": 4901,
      "requests": 391,
      "request duration": 15559,
      "request method": "GET",
      "request uri": "/index.php",
      "content length": 0,
      "user": "-",
      "script": "script.php",
      "last request cpu": 64.27,
      "last request memory": 2097152
    }
  ]
}

For information, you just need to do a GET on the FPM status page with these parameters (the path can change depending on the configuration): http://127.0.0.1/fpm-status?json&full
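For anyone picking this up, here is a minimal Go sketch of decoding the JSON above with the standard `encoding/json` package. The type and function names (`Status`, `Process`, `parseStatus`) are hypothetical and are not taken from the Telegraf plugin; only the JSON keys come from the sample output, and the embedded sample is trimmed to a single process.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Process mirrors one entry of the "processes" array in the sample output.
// The JSON keys contain spaces, which encoding/json accepts in struct tags.
type Process struct {
	PID               int     `json:"pid"`
	State             string  `json:"state"`
	StartTime         int64   `json:"start time"`
	StartSince        int64   `json:"start since"`
	Requests          int64   `json:"requests"`
	RequestDuration   int64   `json:"request duration"`
	RequestMethod     string  `json:"request method"`
	RequestURI        string  `json:"request uri"`
	ContentLength     int64   `json:"content length"`
	User              string  `json:"user"`
	Script            string  `json:"script"`
	LastRequestCPU    float64 `json:"last request cpu"`
	LastRequestMemory int64   `json:"last request memory"`
}

// Status holds a few pool-level values plus the per-process list.
type Status struct {
	Pool            string    `json:"pool"`
	ProcessManager  string    `json:"process manager"`
	AcceptedConn    int64     `json:"accepted conn"`
	ActiveProcesses int64     `json:"active processes"`
	Processes       []Process `json:"processes"`
}

// parseStatus is a hypothetical helper that decodes a ?json&full response body.
func parseStatus(body []byte) (Status, error) {
	var s Status
	err := json.Unmarshal(body, &s)
	return s, err
}

// sample is a trimmed copy of the output shown above (one process only).
const sample = `{"pool":"www","process manager":"static","accepted conn":3879,
"active processes":1,
"processes":[{"pid":585,"state":"Idle","requests":389,"request duration":9530,
"request method":"GET","request uri":"/index.php","script":"script.php",
"last request cpu":104.93,"last request memory":2097152}]}`

func main() {
	s, err := parseStatus([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, p := range s.Processes {
		fmt.Printf("pool=%s pid=%d uri=%s cpu=%.2f mem=%d\n",
			s.Pool, p.PID, p.RequestURI, p.LastRequestCPU, p.LastRequestMemory)
	}
}
```

Note that "last request cpu" has to be a float, since the sample contains values such as 104.93.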

powersj commented 11 months ago

@anthosz,

Does the current plugin work with JSON output? It looks like we currently parse any output line by line.

Does 127.0.0.1/fpm-status?full produce the same data you provided, but not in JSON?

Can you give the artifacts in https://github.com/influxdata/telegraf/pull/14423 a try? I have added everything missing as a field for now.

Thanks!

anthosz commented 11 months ago

@powersj indeed, it works fine line by line, but I thought JSON would be easier to parse.

Here is an example of the line-by-line output (the processes are separated by a line of asterisks):

pool:                 www
process manager:      static
start time:           08/Dec/2023:21:35:23 +0100
start since:          219269
accepted conn:        168126
listen queue:         0
max listen queue:     0
listen queue len:     0
idle processes:       9
active processes:     1
total processes:      10
max active processes: 3
max children reached: 0
slow requests:        0

************************
pid:                  438
state:                Idle
start time:           11/Dec/2023:09:27:39 +0100
start since:          3733
requests:             281
request duration:     162
request method:       GET
request URI:          /XXXXX
content length:       0
user:                 -
script:               -
last request cpu:     0.00
last request memory:  2097152

************************
pid:                  440
state:                Idle
start time:           11/Dec/2023:09:28:06 +0100
start since:          3706
requests:             288
request duration:     8843
request method:       GET
request URI:          /XXXXX
content length:       0
user:                 -
script:               /XXXX
last request cpu:     0.00
last request memory:  2097152

************************
pid:                  433
state:                Idle
start time:           11/Dec/2023:09:11:30 +0100
start since:          4702
requests:             358
request duration:     9137
request method:       GET
request URI:          /XXXX
content length:       0
user:                 -
script:               //XXXX
last request cpu:     0.00
last request memory:  2097152

************************
pid:                  441
state:                Idle
start time:           11/Dec/2023:09:29:18 +0100
start since:          3634
requests:             279
request duration:     134
request method:       GET
request URI:          /XXX
content length:       0
user:                 -
script:               -
last request cpu:     0.00
last request memory:  2097152

************************
[...]
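As a rough illustration of how this flat layout could be split into a pool summary and one block per process, here is a short Go sketch. The function name `parseFlat` and the embedded sample are hypothetical (the sample is an abbreviated copy of the output above), and this is not the parser used by the Telegraf plugin.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseFlat splits the plain-text status page into blocks of key/value pairs.
// The first block is the pool summary; each later block is one process.
func parseFlat(body string) []map[string]string {
	var blocks []map[string]string
	current := map[string]string{}
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		// A line made up only of asterisks closes the current block.
		if strings.Trim(line, "*") == "" {
			blocks = append(blocks, current)
			current = map[string]string{}
			continue
		}
		// Split on the first colon only: timestamps contain further colons.
		key, value, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		current[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	if len(current) > 0 {
		blocks = append(blocks, current)
	}
	return blocks
}

// flat is an abbreviated copy of the line-by-line output.
const flat = `pool:                 www
process manager:      static
start time:           08/Dec/2023:21:35:23 +0100

************************
pid:                  438
state:                Idle
request URI:          /XXXXX
last request memory:  2097152

************************
pid:                  440
state:                Idle
request URI:          /XXXXX
last request memory:  2097152
`

func main() {
	blocks := parseFlat(flat)
	fmt.Println("pool:", blocks[0]["pool"])
	for _, p := range blocks[1:] {
		fmt.Println("pid:", p["pid"], "state:", p["state"])
	}
}
```

Keeping each process in its own map means that repeated keys such as "start time" or "state" do not collide between processes.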

powersj commented 11 months ago

@anthosz,

Were you able to try the artifacts from the PR?

I agree that the JSON would be easier, but it looks like the current plugin expects the flat layout for now. We can look to change this down the road, but would need to ensure users understand that the URL needs to provide JSON or have a fallback mechanism.
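One possible shape for such a fallback, sketched in Go under the assumption that the plugin could inspect the response body itself: treat the body as JSON only if it is a valid JSON object, and otherwise fall back to the flat layout. The helper name `detectFormat` is hypothetical, and this is only an illustration of the idea, not what the plugin does.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// detectFormat is a hypothetical helper: it reports "json" when the body is
// a valid JSON object and "flat" for anything else.
func detectFormat(body []byte) string {
	trimmed := bytes.TrimSpace(body)
	if len(trimmed) > 0 && trimmed[0] == '{' && json.Valid(trimmed) {
		return "json"
	}
	return "flat"
}

func main() {
	fmt.Println(detectFormat([]byte(`{"pool": "www", "processes": []}`)))
	fmt.Println(detectFormat([]byte("pool:                 www\nprocess manager:      static\n")))
}
```

An explicit configuration option is the alternative to sniffing the body like this; it is more predictable, at the cost of users having to keep the URL and the option in sync.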

anthosz commented 11 months ago

@powersj

Just tried with this config:

[[inputs.phpfpm]]
  urls = ["http://127.0.0.1/fpm-status?full"]

Results:

phpfpm,host=XXXX,pool=XX,url=http://127.0.0.1/status?full accepted_conn=17095863i,active_processes=27i,content_length=0i,idle_processes=17i,last_request_memory=2097152i,listen_queue=0i,listen_queue_len=0i,max_active_processes=195i,max_children_reached=5i,max_listen_queue=0i,request_duration=65187i,request_method="-",requests=67i,script="-",slow_requests=2i,start_since=787i,state="Idle",total_processes=44i,user="-" 1702309953000000000

This doesn't look right :/

powersj commented 11 months ago

Ah, thanks for the full output. I now see what is going on. I am going to have to rethink this a bit: we will need a new metric for each of these requested URIs/PIDs, which is a bit more than adding some missing fields.
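To illustrate the point: in the result above, per-process values such as state and requests appear only once, inside the single pool-level metric. A sketch of the alternative, one metric per process, might look like the following Go code. The `metric` type is a simplified stand-in rather than Telegraf's own type, and the measurement name `phpfpm_process` and the use of the PID as a tag are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strconv"
)

// process holds a few of the per-process values from the status page.
type process struct {
	pid      int
	state    string
	method   string
	uri      string
	script   string
	requests int64
	memory   int64
}

// metric is a simplified stand-in for a Telegraf metric, not Telegraf's own type.
type metric struct {
	name   string
	tags   map[string]string
	fields map[string]interface{}
}

// perProcessMetrics emits one metric per process, so that values from
// different processes never share (and overwrite) the same field.
func perProcessMetrics(pool, url string, procs []process) []metric {
	out := make([]metric, 0, len(procs))
	for _, p := range procs {
		out = append(out, metric{
			name: "phpfpm_process", // hypothetical measurement name
			tags: map[string]string{
				"pool":           pool,
				"url":            url,
				"pid":            strconv.Itoa(p.pid), // assumption: PID used as a tag
				"request_method": p.method,
				"request_uri":    p.uri,
				"script":         p.script,
			},
			fields: map[string]interface{}{
				"state":               p.state,
				"requests":            p.requests,
				"last_request_memory": p.memory,
			},
		})
	}
	return out
}

func main() {
	procs := []process{
		{pid: 585, state: "Idle", method: "GET", uri: "/index.php", script: "script.php", requests: 389, memory: 2097152},
		{pid: 586, state: "Idle", method: "GET", uri: "/ping", script: "-", requests: 399, memory: 2097152},
	}
	for _, m := range perProcessMetrics("www", "http://127.0.0.1/fpm-status?json&full", procs) {
		fmt.Println(m.name, m.tags, m.fields)
	}
}
```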

powersj commented 11 months ago

OK, new PR: #14430. Could you give that a shot in 20-30 minutes, once the new artifacts are attached? You will need to 1) use the JSON URL and 2) set a new config option to select the new metrics.

[[inputs.phpfpm]]
  urls = ["http://127.0.0.1/fpm-status?json&full"]
  format = "status"

Technically you could probably do this with the JSON or XPath parsers, but this isn't too hard to add.

anthosz commented 11 months ago

@powersj no output anymore with:

[[inputs.phpfpm]]
  urls = ["http://127.0.0.1/fpm-status?json&full"]
  format = "status"

If I remove format = "status" and the json parameter from the URL, I get the original output.

Telegraf 1.30.0-0c0ea62c (git: pull/14430@0c0ea62c)

powersj commented 11 months ago

Ugh my bad, it should be format = json obviously since you are getting JSON data ;)

anthosz commented 11 months ago

> Ugh my bad, it should be format = json obviously since you are getting JSON data ;)

At the time I didn't even make the connection -_-

But yep, I confirm that all is good, thank you! That was so fast!

powersj commented 11 months ago

OK, thanks! Let me clean up the readme and verify the unit tests, and I'll get this ready for review.

Thank you for all the data and for trying that out so quickly as well!