elastic / security-docs

Elastic Security Documentation

[Enhancement]: Document the endpoint "top" command #5097

Open nfritts opened 4 months ago

nfritts commented 4 months ago

Description

We've added command-line context/help for the top command in 8.12, but we think it should also be covered in the official docs.

Related links / assets

Please include each of the following, if applicable:

* Doc URL:
* Subject matter expert:
* Figma link(s):
* Github epic link(s):
* Github issue link(s): https://github.com/elastic/endpoint-dev/issues/13771

Which documentation set needs improvement?

ESS and serverless

Software version

8.12+

Collaborators

PM: Designer: Developer: Others (if applicable):

Timeline / deliverables

No specific timeline really.

intxgo commented 1 month ago

Below is the output of the 8.15.0 help command, followed by my draft of the documentation page. I kept the format of the Elastic Agent command reference.

Feel free to ask for any clarification. If we want to backport the documentation, it will need slight updates for each version, because the commands and their output have changed a bit over time.

=============================

C:\Users\le>"\Program Files\Elastic\Endpoint\elastic-endpoint.exe" help
Elastic Endpoint 8.15.0-SNAPSHOT
www.elastic.co

Usage: elastic-endpoint.exe command [sub-command] [common options]

commands:
    version
        description: print version details

    install
        --resources          install resources zip
        --upgrade            upgrade existing installation
        description: install Elastic Endpoint

    uninstall
        --uninstall-token    tamper protection token
        description: uninstall Elastic Endpoint

    run
        description: run elastic-endpoint.exe as foreground process
                     if no other instance is already running

    memorydump
        --compress           compress saved memory dump
        --timeout            default 60 seconds
        description: collect Endpoint's service memory dump

    send
        metadata             send off-schedule metrics
        description: send requested document to the stack

    inspect
        description: inspect endpoint configuration

    status
        --output             output format
                             [human, full, json]
        description: endpoint status

    test
        output               test if Endpoint can connect to remote resources
        description: perform requested test

    diagnostics
        description: collect diagnostics

    top
        --interval           data collection interval, default 5 sec
        --limit              collect given number of updates
        --normalized         normalize values to 100% on multi-CPU systems
        description: top shows a breakdown of the executables that triggered
                     Endpoint CPU usage within the last interval, multiple
                     instances are grouped together
        column abbreviations:
            MLWR       Malware Protection
            NET        Network Events
            PROC       Process Events
            FILE       File Events
            REG        Registry Events
            DNS        DNS Events
            LIB        Library Load Events
            AUTH       Authentication Events
            CRED       Credential Access Events
            RANSOM     Ransomware Protection
            API        ETW API Events
            PROC INJ   Process Injection
            MEM SCAN   Memory Scanning
            BHVR       Malicious Behavior Protection
            DIAG BHVR  Diagnostic Malicious Behavior Protection

common options:
    --log                enable log output
                         [stdout,stderr,debugview,file]
    --log-level          logging level
                         [error,info,debug]

examples:
    elastic-endpoint.exe top --limit 6
    elastic-endpoint.exe diagnostics --log stdout

=============================

Elastic Endpoint command reference

Elastic Endpoint, part of the Elastic Defend integration, provides commands for management and troubleshooting. The executable is not added to the system PATH variable, so you have to use the full, OS-dependent path to run it.

Restrictions

* Commands must be run from an elevated command prompt (root on POSIX, Administrator on Windows).

Each command accepts the following logging options:

--log [stdout,stderr,debugview,file]

--log-level [error,info,debug]

elastic-endpoint diagnostics

Gather diagnostics information from the Elastic Endpoint. This command produces an archive that contains:

elastic-endpoint help

Show help for the available commands.

elastic-endpoint inspect

Show the current Elastic Endpoint configuration.

elastic-endpoint install

Install Elastic Endpoint as a system service.

Note: Elastic doesn't publish independent Elastic Endpoint packages as Elastic Endpoint is managed by Elastic Agent.

Options

--resources install resources zip

--upgrade upgrade existing installation

elastic-endpoint memorydump

Save a memory dump of the Elastic Endpoint service.

Options

--compress compress saved memory dump

--timeout memory collection timeout, default 60 seconds

elastic-endpoint run

Run elastic-endpoint as a foreground process if no other instance is already running.

elastic-endpoint status

Returns the current status of the running Elastic Endpoint service. The last known status of the Elastic Agent is also returned.

Options

--output Controls the level of detail and the format of the returned information. Accepted values are human, full, and json. human returns limited information when Elastic Endpoint is in the HEALTHY state. If any policy action did not apply successfully, the details are displayed. full and json always return the full status information.
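As a rough illustration of how the json format could be consumed from a script, here is a minimal Python sketch. The helper names build_status_command and read_status are hypothetical, the Windows install path is taken from the help output earlier in this thread, and the sketch assumes that the JSON document is written to stdout and that the command exits with code zero on success. Neither assumption is confirmed in this issue, and the structure of the JSON is not described here, so the sketch only loads it.

```python
import json
import subprocess

# Install path as shown in the help output above (Windows); adjust per OS.
ENDPOINT_EXE = r"C:\Program Files\Elastic\Endpoint\elastic-endpoint.exe"
# Values listed by the help text for the --output option.
OUTPUT_FORMATS = ("human", "full", "json")


def build_status_command(output="json", exe=ENDPOINT_EXE):
    """Build the argument list for `elastic-endpoint status --output <format>`."""
    if output not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {output!r}")
    return [exe, "status", "--output", output]


def read_status(exe=ENDPOINT_EXE):
    """Run the status command and parse its output as JSON.

    Assumption: the JSON document is printed to stdout. Must be run from
    an elevated prompt, like every other Endpoint command.
    """
    proc = subprocess.run(
        build_status_command("json", exe),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(proc.stdout)
```

Calling read_status() on a machine with Elastic Endpoint installed would return whatever structure the json format produces; inspect it before relying on specific keys.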

elastic-endpoint send

This command currently has only one subcommand, elastic-endpoint send metadata, which sends an off-schedule metrics document to the stack.

elastic-endpoint test

This command currently has only one subcommand, elastic-endpoint test output, which tests whether Endpoint can connect to remote resources.

Example:

Testing output connections using config file: [C:\Program Files\Elastic\Endpoint\elastic-endpoint.yaml]

Using proxy:

Elasticsearch server: https://example.elastic.co:443
        Status: Success

Global artifact server: https://artifacts.security.elastic.co
        Status: Success

Fleet server: https://fleet.example.elastic.co:443
        Status: Success
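For anyone who wants to check this output from a script, the following Python sketch parses text shaped like the example above into a dictionary and lists the servers whose status is not Success. The function names are hypothetical, and the parser assumes the output always follows the "<name> server: <url>" then "Status: <value>" layout shown in the example, which may differ between versions.

```python
import re

# Copy of the example output above.
SAMPLE = """\
Testing output connections using config file: [C:\\Program Files\\Elastic\\Endpoint\\elastic-endpoint.yaml]

Using proxy:

Elasticsearch server: https://example.elastic.co:443
        Status: Success

Global artifact server: https://artifacts.security.elastic.co
        Status: Success

Fleet server: https://fleet.example.elastic.co:443
        Status: Success
"""


def parse_test_output(text):
    """Return {server name: {"url": ..., "status": ...}} for each server block."""
    results = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        server = re.match(r"^(.+ server): (\S+)$", stripped)
        if server:
            current = server.group(1)
            results[current] = {"url": server.group(2), "status": None}
        elif stripped.startswith("Status:") and current is not None:
            # The status line belongs to the most recent server line.
            results[current]["status"] = stripped.split(":", 1)[1].strip()
            current = None
    return results


def failed_servers(results):
    """Names of servers whose status is anything other than Success."""
    return [name for name, info in results.items() if info["status"] != "Success"]


results = parse_test_output(SAMPLE)
print(failed_servers(results))
```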

elastic-endpoint top

Top shows a breakdown of the executables that triggered Endpoint CPU usage within the last interval. This utility visualizes which Endpoint features are expensive for a particular executable.

Note: The meaning and output are similar to the POSIX top command, but they are not equivalent. Multiple processes are aggregated by executable, and the utilization values are not measured by the OS scheduler but by a wall clock in user mode. The output helps you find outliers that cause excessive CPU utilization, so that you can fine-tune the Elastic Defend policy and exception lists in your deployment.

Example:

| PROCESS                                            | OVERALL | API | BHVR | DIAG BHVR | DNS | FILE   | LIB | MEM SCAN | MLWR  | NET | PROC | RANSOM | REG |
=============================================================================================================================================================
| MSBuild.exe                                        |  3146.0 | 0.0 |  0.8 |       0.7 | 0.0 | 2330.9 | 0.0 |    226.2 | 586.9 | 0.0 |  0.0 |    0.4 | 0.0 |
| Microsoft.Management.Services.IntuneWindowsAgen... |    30.0 | 0.0 |  0.0 |       0.0 | 0.0 |    0.0 | 0.2 |     29.8 |   0.0 | 0.0 |  0.0 |    0.0 | 0.0 |
| svchost.exe                                        |    27.3 | 0.0 |  0.1 |       0.1 | 0.0 |    0.4 | 0.2 |      0.0 |  26.6 | 0.0 |  0.0 |    0.0 | 0.0 |
| LenovoVantage-(LenovoServiceBridgeAddin).exe       |     0.1 | 0.0 |  0.0 |       0.0 | 0.0 |    0.0 | 0.1 |      0.0 |   0.0 | 0.0 |  0.0 |    0.0 | 0.0 |
| Lenovo.Modern.ImController.PluginHost.Device.exe   |     0.0 | 0.0 |  0.0 |       0.0 | 0.0 |    0.0 | 0.0 |      0.0 |   0.0 | 0.0 |  0.0 |    0.0 | 0.0 |
| msedgewebview2.exe                                 |     0.0 | 0.0 |  0.0 |       0.0 | 0.0 |    0.0 | 0.0 |      0.0 |   0.0 | 0.0 |  0.0 |    0.0 | 0.0 |
| msedge.exe                                         |     0.0 | 0.0 |  0.0 |       0.0 | 0.0 |    0.0 | 0.0 |      0.0 |   0.0 | 0.0 |  0.0 |    0.0 | 0.0 |
| powershell.exe                                     |     0.0 | 0.0 |  0.0 |       0.0 | 0.0 |    0.0 | 0.0 |      0.0 |   0.0 | 0.0 |  0.0 |    0.0 | 0.0 |
| WmiPrvSE.exe                                       |     0.0 | 0.0 |  0.0 |       0.0 | 0.0 |    0.0 | 0.0 |      0.0 |   0.0 | 0.0 |  0.0 |    0.0 | 0.0 |
| Lenovo.Modern.ImController.PluginHost.Device.exe   |     0.0 | 0.0 |  0.0 |       0.0 | 0.0 |    0.0 | 0.0 |      0.0 |   0.0 | 0.0 |  0.0 |    0.0 | 0.0 |
| Slack.exe                                          |     0.0 | 0.0 |  0.0 |       0.0 | 0.0 |    0.0 | 0.0 |      0.0 |   0.0 | 0.0 |  0.0 |    0.0 | 0.0 |
| uhssvc.exe                                         |     0.0 | 0.0 |  0.0 |       0.0 | 0.0 |    0.0 | 0.0 |      0.0 |   0.0 | 0.0 |  0.0 |    0.0 | 0.0 |
| explorer.exe                                       |     0.0 | 0.0 |  0.0 |       0.0 | 0.0 |    0.0 | 0.0 |      0.0 |   0.0 | 0.0 |  0.0 |    0.0 | 0.0 |
| taskhostw.exe                                      |     0.0 | 0.0 |  0.0 |       0.0 | 0.0 |    0.0 | 0.0 |      0.0 |   0.0 | 0.0 |  0.0 |    0.0 | 0.0 |
| Widgets.exe                                        |     0.0 | 0.0 |  0.0 |       0.0 | 0.0 |    0.0 | 0.0 |      0.0 |   0.0 | 0.0 |  0.0 |    0.0 | 0.0 |
| elastic-endpoint.exe                               |     0.0 | 0.0 |  0.0 |       0.0 | 0.0 |    0.0 | 0.0 |      0.0 |   0.0 | 0.0 |  0.0 |    0.0 | 0.0 |
| sppsvc.exe                                         |     0.0 | 0.0 |  0.0 |       0.0 | 0.0 |    0.0 | 0.0 |      0.0 |   0.0 | 0.0 |  0.0 |    0.0 | 0.0 |

Endpoint service (16 CPU): 113.0% out of 1600%

Collecting data.  Press Ctrl-C to cancel
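To show how a captured table like the one above could be post-processed, here is a small Python sketch that parses the pipe-delimited rows and reports, per executable, the feature column with the highest value. The function names are hypothetical, the sample is the header plus the first three rows of the example, and the parser assumes the layout shown for 8.15.0 (it would need adjusting if the columns or delimiters differ in other versions).

```python
# Header and first three rows copied from the example output above.
SAMPLE = """\
| PROCESS                                            | OVERALL | API | BHVR | DIAG BHVR | DNS | FILE   | LIB | MEM SCAN | MLWR  | NET | PROC | RANSOM | REG |
=============================================================================================================================================================
| MSBuild.exe                                        |  3146.0 | 0.0 |  0.8 |       0.7 | 0.0 | 2330.9 | 0.0 |    226.2 | 586.9 | 0.0 |  0.0 |    0.4 | 0.0 |
| Microsoft.Management.Services.IntuneWindowsAgen... |    30.0 | 0.0 |  0.0 |       0.0 | 0.0 |    0.0 | 0.2 |     29.8 |   0.0 | 0.0 |  0.0 |    0.0 | 0.0 |
| svchost.exe                                        |    27.3 | 0.0 |  0.1 |       0.1 | 0.0 |    0.4 | 0.2 |      0.0 |  26.6 | 0.0 |  0.0 |    0.0 | 0.0 |
"""


def parse_top_table(text):
    """Parse the pipe-delimited table into a list of dicts, one per executable."""
    rows = []
    header = None
    for line in text.splitlines():
        if not line.startswith("|"):
            continue  # skip the ===== separator and any status lines
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if header is None:
            header = cells  # first pipe-delimited line holds the column names
            continue
        values = [float(v) for v in cells[1:]]
        rows.append({"PROCESS": cells[0], **dict(zip(header[1:], values))})
    return rows


def costliest_feature(row):
    """Name of the feature column with the highest value for this executable."""
    features = {k: v for k, v in row.items() if k not in ("PROCESS", "OVERALL")}
    return max(features, key=features.get)


for row in parse_top_table(SAMPLE):
    print(row["PROCESS"], row["OVERALL"], costliest_feature(row))
```

In this sample, File Events dominate for MSBuild.exe, which is the kind of outlier the note above suggests addressing through policy tuning or exception lists.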

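column abbreviations:

    MLWR       Malware Protection
    NET        Network Events
    PROC       Process Events
    FILE       File Events
    REG        Registry Events
    DNS        DNS Events
    LIB        Library Load Events
    AUTH       Authentication Events
    CRED       Credential Access Events
    RANSOM     Ransomware Protection
    API        ETW API Events
    PROC INJ   Process Injection
    MEM SCAN   Memory Scanning
    BHVR       Malicious Behavior Protection
    DIAG BHVR  Diagnostic Malicious Behavior Protection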

Options

--interval data collection interval, default 5 sec

--limit collect the given number of updates; by default, collect until interrupted with Ctrl+C

--normalized normalize values to 100% on multi-CPU systems
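As a worked example of what normalizing to 100% could mean, the sketch below divides a multi-CPU percentage by the CPU count, using the summary line from the example (113.0% out of 1600% on 16 CPUs). This is an assumption about the arithmetic behind --normalized, not a confirmed description of the implementation, and normalize_cpu is a hypothetical helper.

```python
def normalize_cpu(percent, cpu_count):
    """Scale a percentage in the 0..100*cpu_count range down to 0..100.

    Assumption: normalization is a plain division by the number of CPUs.
    """
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return percent / cpu_count


# Summary line from the example: 113.0% out of 1600% on a 16-CPU machine.
print(round(normalize_cpu(113.0, 16), 1))  # 7.1
```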

elastic-endpoint uninstall

Uninstall Elastic Endpoint.

Note: Elastic Endpoint is managed by Elastic Agent. To permanently remove it from the target machine, remove the Elastic Defend integration from the Fleet policy. elastic-agent uninstall also uninstalls Elastic Endpoint, so in practice this command is used only to troubleshoot broken installations.

Options

--uninstall-token uninstall token, required if Tamper Protection is activated.

elastic-endpoint version

Show the version of Elastic Endpoint.

intxgo commented 1 month ago

Hi @natasha-moore-elastic, I've just added the inspect and status commands, which we'll ship in 8.15.0.

natasha-moore-elastic commented 1 month ago

This is great, thank you @intxgo for putting together the draft! I'm hoping to start working on this in the next couple of weeks, so I'll reach out once I have a PR ready for review.

jmikell821 commented 1 month ago

Hello @nfritts and @intxgo 👋 We're trying to figure out which commands to backport to which version. I see that these commands are being added for 8.15.0. Can you tell us which commands were added in 8.12.0 and which ones were added in the releases after that? We don't want to add commands that don't apply to a specific release. Thanks for your help!

intxgo commented 1 month ago

OK, I'll compile a list of commands for each Endpoint version. Some commands (actually, only the top command comes to mind) underwent functional changes from version to version, so the description will also have to be adjusted.

intxgo commented 1 month ago

I've checked 7.17 and all 8.x versions. Most commands were there all along, some were introduced later, and some options were added later to existing commands.

Here's the historic diff to the current state described above:

commands introduced

    diagnostics
        description: collect diagnostics
    send
        metadata             send off-schedule metrics
        description: send requested document to the stack

options introduced

top output changes

in 8.9

    column abbreviations:
        MLWR       Malware Protection
        NET        Network Events
        PROC       Process Events
        FILE       File Events
        REG        Registry Events
        DNS        DNS Events
        LIB        Library Load Events
        AUTH       Authentication Events
        CRED       Credential Access Events
        RANSOM     Ransomware Protection
        TI API     ETW Threat Intelligence Events
        KEYBD      ETW win32k API Events
        PROC INJ   Process Injection
        MEM SCAN   Memory Scanning
        BHVR       Malicious Behavior Protection
        DIAG BHVR  Diagnostic Malicious Behavior Protection

in 8.10

    column abbreviations:
        MLWR       Malware Protection
        NET        Network Events
        PROC       Process Events
        FILE       File Events
        REG        Registry Events
        DNS        DNS Events
        LIB        Library Load Events
        AUTH       Authentication Events
        CRED       Credential Access Events
        RANSOM     Ransomware Protection
        TI API     ETW Threat Intelligence Events
        UI API     ETW win32k API Events
        PROC INJ   Process Injection
        MEM SCAN   Memory Scanning
        BHVR       Malicious Behavior Protection
        DIAG BHVR  Diagnostic Malicious Behavior Protection

in 8.13

    column abbreviations:
        MLWR       Malware Protection
        NET        Network Events
        PROC       Process Events
        FILE       File Events
        REG        Registry Events
        DNS        DNS Events
        LIB        Library Load Events
        AUTH       Authentication Events
        CRED       Credential Access Events
        RANSOM     Ransomware Protection
        API        ETW API Events
        PROC INJ   Process Injection
        MEM SCAN   Memory Scanning
        BHVR       Malicious Behavior Protection
        DIAG BHVR  Diagnostic Malicious Behavior Protection
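To make the differences between these three lists easier to see, here is a short Python sketch that stores the abbreviations per version and computes which columns were added or removed between two of them. The data is copied from the lists above; the names COMMON, COLUMNS, and diff_columns are only for illustration.

```python
# Abbreviations shared by all three versions listed above.
COMMON = {
    "MLWR": "Malware Protection",
    "NET": "Network Events",
    "PROC": "Process Events",
    "FILE": "File Events",
    "REG": "Registry Events",
    "DNS": "DNS Events",
    "LIB": "Library Load Events",
    "AUTH": "Authentication Events",
    "CRED": "Credential Access Events",
    "RANSOM": "Ransomware Protection",
    "PROC INJ": "Process Injection",
    "MEM SCAN": "Memory Scanning",
    "BHVR": "Malicious Behavior Protection",
    "DIAG BHVR": "Diagnostic Malicious Behavior Protection",
}

# Version-specific ETW columns, keyed by the version where the list changed.
COLUMNS = {
    "8.9": {**COMMON, "TI API": "ETW Threat Intelligence Events",
            "KEYBD": "ETW win32k API Events"},
    "8.10": {**COMMON, "TI API": "ETW Threat Intelligence Events",
             "UI API": "ETW win32k API Events"},
    "8.13": {**COMMON, "API": "ETW API Events"},
}


def diff_columns(old, new):
    """Return (added, removed) column abbreviations between two versions."""
    added = sorted(set(COLUMNS[new]) - set(COLUMNS[old]))
    removed = sorted(set(COLUMNS[old]) - set(COLUMNS[new]))
    return added, removed


print(diff_columns("8.9", "8.10"))   # (['UI API'], ['KEYBD'])
print(diff_columns("8.10", "8.13"))  # (['API'], ['TI API', 'UI API'])
```

So 8.10 renamed KEYBD to UI API, and 8.13 replaced TI API and UI API with a single API column.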

The example screenshots of the top output would also have to be re-captured on versions earlier than 8.12, as there are slight UI differences. I suggest just skipping them in previous versions (treat it as a "beta" feature), or not including the screenshot in any documentation and instead linking to our engineering article in a NOTE box or something similar (https://github.com/elastic/endpoint/blob/main/EndpointTopCommand.md). That article not only includes the screenshot but also explains how to read the presented UI.

natasha-moore-elastic commented 1 month ago

Many thanks for this detailed breakdown, @intxgo! It sounds like we should backport this documentation all the way to 7.17 docs, is that correct?

intxgo commented 1 month ago

yes, that's correct.

intxgo commented 1 month ago

I've noticed that other integrations use the term "experimental" when they release a beta feature, so I suggest documenting the top command as experimental in 8.9, 8.10, and 8.11.