elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats

Metricbeat - Monitor Stopped Processes #11404

Open Gunnerva opened 5 years ago

Gunnerva commented 5 years ago

Describe the enhancement: Add the ability to report that a process is not running, in addition to its current process state. Possibly add a flag so that, in addition to returning running or sleeping, it also reports "not running" or "stopped" if the process is not part of the process list.

Describe a specific use case for the enhancement or feature: The ability to report that a process is not running, regardless of why it stopped, when it stopped, or whether it ever started. This can be used for alerting on processes that can't use Heartbeat.

Example system.yml:

In the above scenario metricbeat would check every 10s and report the process state if it is running or sleeping, but it would also report that the process is not running if Metricbeat cannot find it in the running process list.
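As a rough illustration of the requested behaviour (not Metricbeat code), the check described above could be sketched in Python as follows. The function name `report_states`, the record fields, and the `snapshot` input are all hypothetical; the sketch also assumes the configured process patterns are treated as regular expressions.

```python
import re


def report_states(patterns, snapshot):
    """Return one record per matching process, or a 'not running' record.

    patterns: list of regular expressions (assumed semantics).
    snapshot: list of (name, state) tuples, a hypothetical stand-in for the
              process list collected on each period.
    """
    records = []
    for pattern in patterns:
        regex = re.compile(pattern)
        matches = [(name, state) for name, state in snapshot if regex.search(name)]
        if matches:
            # Process found: report its current state, as is done today.
            for name, state in matches:
                records.append({"pattern": pattern, "name": name, "state": state})
        else:
            # Process missing from the list: report it explicitly.
            records.append({"pattern": pattern, "name": None, "state": "not running"})
    return records


# Example snapshot in which mysqld is present and nginx is absent.
snapshot = [("mysqld", "sleeping"), ("sshd", "running")]
result = report_states(["mysqld", "nginx"], snapshot)
```

Here `result` holds a "sleeping" record for `mysqld` and a "not running" record for `nginx`, which is the extra signal an alert could be built on.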

beyondmars3 commented 5 years ago

I tried this, but it does not work.

- module: system
  period: 5s
  metricsets:
    - cpu
    # - load
    - memory
    - network
    - process
    - process_summary
    - socket_summary
    #- entropy
    #- core
    - diskio
    # - socket
    - report-not-running
  # process.include_top_n:
  #   by_cpu: 5      # include top 5 processes by CPU
  #   by_memory: 5   # include top 5 processes by memory
  cpu.metrics: [percentages, normalized_percentages, ticks]
  processes: ['mysql*']

2019-08-15T19:21:27.897+0800    ERROR   instance/beat.go:877    Exiting: 1 error: metricset 'system/report-not-running' not found

simioa commented 4 years ago

@ruflin I'm interested in implementing this, can I give it a try? I thought about creating a new system.process_state metricset where one can define multiple processes to check. Configuration could look like:

module: system
metricsets:
  - process_state
process_state.processes:
  - cmdline: ".*metricbeat.*" # regex that matches cmdline
    alias: "metricbeat" # alias assigned to check for easier identification

Output could then look like:

    "system": {
        "process_state": {
            "process": {
                "name": "metricbeat",
                "command_line": "/usr/share/metricbeat/bin/metricbeat -c metricbeat.yml"
            },
            "cmdline": ".*metricbeat.*",
            "alias": "metricbeat",
            "running": true
        }
    }

The process object, which contains information about the matched process, would only be added when "running" is true.
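To make the proposed event shape concrete, here is a minimal Python sketch of how such a check might build its output. It is only an illustration of the proposal above, not an existing metricset: the function `process_state_events` and the shape of the `processes` input are assumptions, and the `cmdline` pattern is assumed to be matched anywhere in the command line.

```python
import re


def process_state_events(checks, processes):
    """Build one event per configured check, shaped like the proposed output.

    checks:    list of {"cmdline": <regex>, "alias": <str>} dicts.
    processes: list of {"name": ..., "command_line": ...} dicts
               (hypothetical stand-in for the collected process list).
    """
    events = []
    for check in checks:
        regex = re.compile(check["cmdline"])
        # Use the first process whose command line matches the regex, if any.
        match = next((p for p in processes if regex.search(p["command_line"])), None)
        state = {
            "cmdline": check["cmdline"],
            "alias": check["alias"],
            "running": match is not None,
        }
        if match is not None:
            # The process object is only attached when the check matched.
            state["process"] = {
                "name": match["name"],
                "command_line": match["command_line"],
            }
        events.append({"system": {"process_state": state}})
    return events


processes = [
    {"name": "metricbeat",
     "command_line": "/usr/share/metricbeat/bin/metricbeat -c metricbeat.yml"},
]
checks = [
    {"cmdline": ".*metricbeat.*", "alias": "metricbeat"},
    {"cmdline": ".*filebeat.*", "alias": "filebeat"},
]
events = process_state_events(checks, processes)
```

In this example the first event carries `"running": true` together with the `process` object, while the second has `"running": false` and no `process` key.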

TheSecMaven commented 4 years ago

This would be very valuable and is much needed. We often try to determine with Metricbeat whether a process is hung, which is one of the big use cases.

ELKezdtem commented 4 years ago

+1 definitely need this feature

fearful-symmetry commented 3 years ago

This is definitely doable, it's just a matter of where we want it in the beats ecosystem, since it's a bit conceptually unusual compared to all the rest of metricbeat.

simioa commented 3 years ago

I had the same thought. I think it is most likely better placed inside Heartbeat, in the context of service monitoring.

Right now, if we want to monitor such processes with the Beats ecosystem, we have to rely on workarounds that are hard to maintain or not failproof. This would be a very valuable feature for users who run Beats and want to monitor services that do not expose network ports.

botelastic[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

willemdh commented 2 years ago

Ping

botelastic[bot] commented 1 year ago

Hi! We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:. Thank you for your contribution!

willemdh commented 1 year ago

Ping

ELKezdtem commented 1 year ago

+1

willemdh @.***> wrote (on Thu, Jan 19, 2023, at 21:59):

Ping

botelastic[bot] commented 8 months ago

Hi! We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:. Thank you for your contribution!

LBoraz commented 6 months ago

congratulations on ignoring this for 4 years