wazuh / wazuh-agent

The Wazuh agent for endpoints.
https://wazuh.com
GNU Affero General Public License v3.0

Events generated by agent must comply with the common schema #253

Open sdvendramini opened 4 weeks ago

sdvendramini commented 4 weeks ago

Parent Issue: https://github.com/wazuh/wazuh-agent/issues/241

Description

The events generated by the agent must adhere to the common schema for consistency and compatibility across systems.

Details

Format body stateless and stateful

Stateless

```json
{
  "agent": {
    "id": "2887e1cf-9bf2-431a-b066-a46860080f56",
    "name": "agent1",
    "type": "endpoint",
    "version": "5.0.0",
    "groups": ["group1", "group2"],
    "host": {
      "hostname": "myhost",
      "os": {
        "name": "Amazon Linux 2",
        "platform": "Linux"
      },
      "ip": ["192.168.1.2"],
      "architecture": "x86_64"
    }
  }
}
{
  "module": "logcollector",
  "type": "file"
}
{
  "log": {
    "file": {
      "path": "string"
    }
  },
  "tags": ["string"],
  "event": {
    "original": "string",
    "ingested": "string",
    "module": "string",
    "provider": "string"
  }
}
{
  "module": "inventory",
  "type": "package"
}
{
  "log": {
    "file": {
      "path": "string"
    }
  },
  "tags": ["string"],
  "event": {
    "original": "string",
    "ingested": "string",
    "module": "string",
    "provider": "string"
  }
}
```

Stateful

```json
{
  "agent": {
    "id": "2887e1cf-9bf2-431a-b066-a46860080f56",
    "name": "agent1",
    "type": "endpoint",
    "version": "5.0.0",
    "groups": ["group1", "group2"],
    "host": {
      "hostname": "myhost",
      "os": {
        "name": "Amazon Linux 2",
        "platform": "Linux"
      },
      "ip": ["192.168.1.2"],
      "architecture": "x86_64"
    }
  }
}
{
  "module": "inventory",
  "type": "package",
  "operation": "modified",
  "id": "lskdjf023984902358"
}
{
  "scan_time": "2024-10-28T18:26:10.634Z",
  "package": {
    "architecture": "string",
    "description": "string",
    "installed": "2024-10-28T18:26:10.634Z",
    "name": "string",
    "path": "string",
    "size": 0,
    "type": "string",
    "version": "string"
  }
}
{
  "module": "inventory",
  "type": "network",
  "operation": "add",
  "id": "lskdjf023984902358"
}
{
  "scan_time": "2024-10-28T18:26:10.634Z",
  "package": {
    "architecture": "string",
    "description": "string",
    "installed": "2024-10-28T18:26:10.634Z",
    "name": "string",
    "path": "string",
    "size": 0,
    "type": "string",
    "version": "string"
  }
}
{
  "module": "inventory",
  "type": "network",
  "operation": "delete",
  "id": "asdfsdfkdsj98237498325"
}
```
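The bodies above are sequences of JSON objects: one agent metadata object, then a module metadata object followed by its data object for each event. The following is a minimal Python sketch of assembling such a body. It assumes the objects are separated by newlines, which the issue does not state, and `build_body` is a hypothetical helper name, not part of the agent.

```python
import json

def build_body(agent_metadata, events):
    """Serialize agent metadata followed by (module metadata, data) pairs.

    Assumption: objects are newline-separated; the issue does not specify this.
    """
    lines = [json.dumps(agent_metadata)]
    for module_metadata, data in events:
        lines.append(json.dumps(module_metadata))
        # The "delete" example above carries no data object, so data may be None.
        if data is not None:
            lines.append(json.dumps(data))
    return "\n".join(lines)

agent = {"agent": {"id": "2887e1cf-9bf2-431a-b066-a46860080f56", "name": "agent1"}}
events = [
    (
        {"module": "inventory", "type": "package",
         "operation": "modified", "id": "lskdjf023984902358"},
        {"scan_time": "2024-10-28T18:26:10.634Z", "package": {"name": "string"}},
    ),
    (
        {"module": "inventory", "type": "network",
         "operation": "delete", "id": "asdfsdfkdsj98237498325"},
        None,
    ),
]
body = build_body(agent, events)
print(body)
```

Keeping the module metadata and the data as separate objects means the agent metadata is sent once per request rather than once per event.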

Tasks

LucioDonda commented 3 weeks ago

Hi @sdvendramini. While I look for them myself: have you identified which event fields were not ECS compliant, or in which situations the agent generated them? Were they part of any particular module? TIA

GGP1 commented 3 weeks ago

@LucioDonda the ECS templates have been modified recently, so it is highly likely that the agent is generating events with an outdated format.

I've been working on "Update stateful events data models" #26568, which covers the same case for the Communications API POST /events/stateful endpoint.

Here are some of the structures we are accepting in JSON format.

FIM

```json
{
  "agent": {
    "id": "string",
    "groups": []
  },
  "file": {
    "attributes": ["string"],
    "name": "string",
    "path": "string",
    "gid": 0,
    "group": "string",
    "inode": "string",
    "mtime": "2024-10-28T18:26:10.634Z",
    "mode": "string",
    "size": 0,
    "target_path": "string",
    "type": "string",
    "uid": 0,
    "owner": "string",
    "hash": {
      "md5": "string",
      "sha1": "string",
      "sha256": "string"
    }
  },
  "registry": {
    "key": "string",
    "value": "string"
  }
}
```

Inventory package

```json
{
  "agent": {
    "id": "string",
    "groups": []
  },
  "scan_time": "2024-10-28T18:26:10.634Z",
  "package": {
    "architecture": "string",
    "description": "string",
    "installed": "2024-10-28T18:26:10.634Z",
    "name": "string",
    "path": "string",
    "size": 0,
    "type": "string",
    "version": "string"
  }
}
```

Inventory processes

```json
{
  "agent": {
    "id": "string",
    "groups": []
  },
  "scan_time": "2024-10-28T18:26:10.634Z",
  "process": {
    "pid": 0,
    "name": "string",
    "parent": { "pid": 0 },
    "command_line": "string",
    "args": ["string"],
    "user": { "id": "string" },
    "real_user": { "id": "string" },
    "saved_user": { "id": "string" },
    "group": { "id": "string" },
    "real_group": { "id": "string" },
    "saved_group": { "id": "string" },
    "start": "2024-10-28T18:26:10.635Z",
    "thread": { "id": "string" }
  }
}
```

Inventory system

```json
{
  "agent": {
    "id": "string",
    "groups": []
  },
  "scan_time": "2024-10-28T18:26:10.635Z",
  "host": {
    "architecture": "string",
    "hostname": "string",
    "os": {
      "kernel": "string",
      "full": "string",
      "name": "string",
      "platform": "string",
      "version": "string",
      "type": "string"
    }
  }
}
```

Vulnerability

```json
{
  "agent": {
    "id": "string",
    "groups": [],
    "name": "string",
    "type": "string",
    "version": "string"
  },
  "host": {
    "os": {
      "kernel": "string",
      "full": "string",
      "name": "string",
      "platform": "string",
      "version": "string",
      "type": "string"
    }
  },
  "package": {
    "architecture": "string",
    "build_version": "string",
    "checksum": "string",
    "description": "string",
    "install_scope": "string",
    "installed": "2024-10-28T18:26:10.635Z",
    "license": "string",
    "name": "string",
    "path": "string",
    "reference": "string",
    "size": 0,
    "type": "string",
    "version": "string"
  },
  "scanner": {
    "source": "string",
    "vendor": "string"
  },
  "score": {
    "base": 0,
    "environmental": 0,
    "temporal": 0,
    "version": "string"
  },
  "category": "string",
  "classification": "string",
  "description": "string",
  "detected_at": "2024-10-28T18:26:10.635Z",
  "enumeration": "string",
  "id": "string",
  "published_at": "2024-10-28T18:26:10.635Z",
  "reference": "string",
  "report_id": "string",
  "severity": "string",
  "under_evaluation": true
}
```

Command result

```json
{
  "document_id": "string",
  "result": {
    "code": "string",
    "message": "string",
    "data": "string"
  }
}
```

At the same time, those objects have to be inside the data field of a wrapper object that also includes a module field. For example, an inventory package event would look like this:

{
  "data": {
    "agent": {
      "id": "string",
      "groups": []
    },
    "scan_time": "2024-10-28T18:26:10.634Z",
    "package": {
      "architecture": "string",
      "description": "string",
      "installed": "2024-10-28T18:26:10.634Z",
      "name": "string",
      "path": "string",
      "size": 0,
      "type": "string",
      "version": "string"
    }
  },
  "module": "inventory_package"
}
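
As a rough illustration of the wrapper described above, the following Python sketch places an event structure inside a `data` field next to a `module` field. `wrap_event` is a hypothetical helper name used only for this example.

```python
import json

def wrap_event(module, data):
    """Place an event structure inside 'data' next to a 'module' field."""
    return {"data": data, "module": module}

package_event = {
    "agent": {"id": "string", "groups": []},
    "scan_time": "2024-10-28T18:26:10.634Z",
    "package": {"name": "string", "version": "string"},
}
wrapped = wrap_event("inventory_package", package_event)
print(json.dumps(wrapped, indent=2))
```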

If you have any questions or comments, we can arrange a meeting to discuss this further.

vikman90 commented 3 weeks ago

Module format

Examples

{
  "module": {
    "name": "inventory",
    "type": "package"
  },
  "data": { ... }
}

{
  "module": {
    "name": "vulnerability"
  },
  "data": { ... }
}

{
  "module": {
    "name": "data"
  },
  "data": { ... } 
}

vikman90 commented 3 weeks ago

Stateless: Logcollector


{
  "module": { "name": "logcollector" },
  "data": {
    "file": { "path": "/var/log/syslog" },
    "event": { "original": "2024-10-31T16:21:25.198579+01:00 Rocket systemd-resolved[176]: Clock change detected. Flushing caches." }
  }
}
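
A minimal Python sketch of producing a stateless logcollector event in the module/data layout shown above. `logcollector_event` is a hypothetical name, and stripping the trailing newline from the raw line is an assumption, not something the issue specifies.

```python
import json

def logcollector_event(path, line):
    """Build a stateless logcollector event in the module/data layout."""
    return {
        "module": {"name": "logcollector"},
        "data": {
            "file": {"path": path},
            # Assumption: the raw line is kept as-is, minus its trailing newline.
            "event": {"original": line.rstrip("\n")},
        },
    }

event = logcollector_event(
    "/var/log/syslog",
    "Clock change detected. Flushing caches.\n",
)
print(json.dumps(event))
```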
cborla commented 3 weeks ago

Inventory analysis

This analysis is based on the following sources.

The indexer (master) currently supports the following data structures for inventory.

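from dataclasses import dataclass

# BaseModel and INVENTORY_INDEX come from the surrounding source and are not shown here.

@dataclass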
class OS:
    """OS data model."""
    kernel: str
    full: str
    name: str
    platform: str
    version: str
    type: str
    family: str

@dataclass
class Host:
    """Host data model."""
    architecture: str
    hostname: str
    os: OS

@dataclass
class ProcessHash:
    md5: str

@dataclass
class Process:
    """Process data model."""
    hash: ProcessHash

@dataclass
class InventoryEvent(BaseModel):
    """Inventory events data model."""
    host: Host
    process: Process

    def get_index_name(self) -> str:
        """Get the index name for the event type.

        Returns
        -------
        str
            Index name.
        """
        return INVENTORY_INDEX

From the above classes of the Inventory Stateful event, we can obtain the following diagram.

InventoryEvent
│
├── Host
│   ├── architecture : str
│   ├── hostname : str
│   └── os : OS
│       ├── kernel : str
│       ├── full : str
│       ├── name : str
│       ├── platform : str
│       ├── version : str
│       ├── type : str
│       └── family : str
│
└── Process
    └── hash : ProcessHash
        └── md5 : str
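
A diagram like the one above can be derived from the models rather than drawn by hand. The sketch below walks nested dataclasses and lists every leaf field as a dotted path. It redeclares the models as plain dataclasses (the `BaseModel` parent and `get_index_name` are omitted), and `field_paths` is a hypothetical helper.

```python
from dataclasses import dataclass, fields, is_dataclass

@dataclass
class OS:
    kernel: str
    full: str
    name: str
    platform: str
    version: str
    type: str
    family: str

@dataclass
class Host:
    architecture: str
    hostname: str
    os: OS

@dataclass
class ProcessHash:
    md5: str

@dataclass
class Process:
    hash: ProcessHash

@dataclass
class InventoryEvent:
    host: Host
    process: Process

def field_paths(cls, prefix=""):
    """Yield a dotted path for every leaf field of a nested dataclass."""
    for f in fields(cls):
        path = prefix + f.name
        if is_dataclass(f.type):
            # Recurse into nested models such as Host.os.
            yield from field_paths(f.type, prefix=path + ".")
        else:
            yield path

paths = list(field_paths(InventoryEvent))
print("\n".join(paths))
```

The resulting list can be compared against the fields the agent currently sends to see which ones have no counterpart in the indexer model.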

There is a large gap between the data the agent collects and what the indexer model accepts: the inventory module currently produces 9 types of structures, each with its own fields, while the indexer model above covers only host data and a process hash.

As a first step, the agent's inventory events can be adapted to the structure model proposed by the indexer.

cborla commented 3 weeks ago

Inventory analysis

New sources.

cborla commented 3 weeks ago

Update 1/11

{
  "module": {
    "name": "inventory",
    "type": "package"
  },
  "data": { ... }
}

{
  "module": {
    "name": "vulnerability"
  },
  "data": { ... }
}

cborla commented 3 weeks ago

Stateful: Inventory

The agent is currently sending inventory messages in the following format. The fields still need to be adapted to ECS, and the format must also include the operation to be carried out.

{
    "data":
    {
        "argvs": null,
        "checksum": "ab94278230d240b66082ba6cbf52106cebff41ac",
        "cmd": null,
        "egroup": "root",
        "euser": "root",
        "fgroup": "root",
        "name": "kworker/u9:0-tt",
        "nice": -20,
        "nlwp": 1,
        "pgrp": 0,
        "pid": "86",
        "ppid": 2,
        "priority": 0,
        "processor": 2,
        "resident": 0,
        "rgroup": "root",
        "ruser": "root",
        "scan_time": "2024/11/02 01:55:47",
        "session": 0,
        "sgroup": "root",
        "share": 0,
        "size": 0,
        "start_time": 1730351047,
        "state": "I",
        "stime": 0,
        "suser": "root",
        "tgid": 86,
        "tty": 0,
        "utime": 0,
        "vm_size": 0
    },
    "operation": "DELETED",
    "type": "dbsync_processes"
}
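
To make the pending adaptation concrete, here is a Python sketch that converts a message in the current format into a module metadata object plus a data object shaped like the "Inventory processes" structure quoted earlier in the thread. The field mapping, the lowercased operation value, the derived `type` and the UTC interpretation of `scan_time` are all assumptions made for illustration; none of them is confirmed by the issue. It needs Python 3.9+ for `str.removeprefix`.

```python
from datetime import datetime, timezone

ISO = "%Y-%m-%dT%H:%M:%SZ"

# Hypothetical mapping from current field names to the target structure.
# The sample carries names such as "root", which are copied into "id" unchanged.
USER_GROUP_FIELDS = {
    "euser": "user", "ruser": "real_user", "suser": "saved_user",
    "egroup": "group", "rgroup": "real_group", "sgroup": "saved_group",
}

def to_stateful_process(message):
    """Return (module metadata, data) for a dbsync_processes message."""
    src = message["data"]
    process = {
        "pid": int(src["pid"]),  # the sample sends pid as a string
        "name": src["name"],
        "parent": {"pid": src["ppid"]},
        "command_line": src["cmd"],
        "args": src["argvs"],
        # start_time is epoch seconds in the sample message
        "start": datetime.fromtimestamp(src["start_time"], tz=timezone.utc).strftime(ISO),
    }
    for old, new in USER_GROUP_FIELDS.items():
        process[new] = {"id": src[old]}
    # Assumption: scan_time is expressed in UTC.
    scan_time = datetime.strptime(src["scan_time"], "%Y/%m/%d %H:%M:%S").strftime(ISO)
    metadata = {
        "module": "inventory",
        "type": message["type"].removeprefix("dbsync_"),
        "operation": message["operation"].lower(),
    }
    return metadata, {"scan_time": scan_time, "process": process}

message = {
    "data": {
        "argvs": None, "cmd": None,
        "egroup": "root", "euser": "root",
        "name": "kworker/u9:0-tt",
        "pid": "86", "ppid": 2,
        "rgroup": "root", "ruser": "root",
        "scan_time": "2024/11/02 01:55:47",
        "sgroup": "root", "suser": "root",
        "start_time": 1730351047,
    },
    "operation": "DELETED",
    "type": "dbsync_processes",
}
metadata, data = to_stateful_process(message)
print(metadata)
print(data)
```

Fields with no obvious counterpart in the target structure (for example `nice`, `priority` or `vm_size`) are dropped in this sketch; whether they should be kept is an open question in the thread.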
cborla commented 3 weeks ago

Agent metadata

{
  "agent": {
    "uuid": "UUID", 
    "groups": [ ], 
    "os": "Amazon Linux 2", 
    "platform": "Linux", 
    "type": "Endpoint", 
    "version": "5.0.0", 
    "ip": "192.168.1.2" 
  }
}
{
  "module": "logcollector",
  "type": "file"
}
{
  "log": {
    "file": {
      "path": "string"
    }
  },
  "base": {
    "tags": "string"
  },
  "event": {
    "original": "string",
    "ingested": "string",
    "module": "string",
    "provider": "string"
  }
}
{
  "module": "inventory",
  "type": "package"
}
{
  "log": {
    "file": {
      "path": "string"
    }
  },
  "base": {
    "tags": "string"
  },
  "event": {
    "original": "string",
    "ingested": "string",
    "module": "string",
    "provider": "string"
  }
}

Stateful

{
  "agent": {
    "uuid": "UUID",
    "groups": [],
    "os": "Amazon Linux 2",
    "platform": "Linux",
    "type": "Endpoint",
    "version": "5.0.0",
    "ip": "192.168.1.2"
  }
}
{
  "module": "inventory",
  "type": "package",
  "operation": "modified",
  "id": "lskdjf023984902358"
}
{
  "scan_time": "2024-10-28T18:26:10.634Z",
  "package": {
    "architecture": "string",
    "description": "string",
    "installed": "2024-10-28T18:26:10.634Z",
    "name": "string",
    "path": "string",
    "size": 0,
    "type": "string",
    "version": "string"
  }
}
{
  "module": "inventory",
  "type": "network",
  "operation": "add",
  "id": "lskdjf023984902358"
}
{
  "scan_time": "2024-10-28T18:26:10.634Z",
  "package": {
    "architecture": "string",
    "description": "string",
    "installed": "2024-10-28T18:26:10.634Z",
    "name": "string",
    "path": "string",
    "size": 0,
    "type": "string",
    "version": "string"
  }
}
{
  "module": "inventory",
  "type": "network",
  "operation": "delete",
  "id": "asdfsdfkdsj98237498325"
}

cborla commented 2 weeks ago

A new column is added to the queue to store the module metadata.

New queue structure (columns): module_name, module_type, metadata, data

This allows the module metadata to be included in the new object, and the metadata/data pair to be sent together for each event. The metadata column holds objects such as:

{
  "module": "logcollector",
  "type": "file"
}

and

{
  "module": "inventory",
  "type": "network",
  "operation": "delete",
  "id": "asdfsdfkdsj98237498325"
}
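
A minimal Python sketch of a queue with the four columns listed above, using an in-memory SQLite table. The storage engine, the table name and the helpers `enqueue` and `dequeue_pairs` are assumptions made for illustration and do not describe the agent's actual queue implementation.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE queue (
        module_name TEXT NOT NULL,
        module_type TEXT,
        metadata    TEXT NOT NULL,  -- JSON-encoded module metadata
        data        TEXT            -- JSON-encoded event data, may be NULL
    )
    """
)

def enqueue(conn, metadata, data):
    """Store the module metadata next to the event data."""
    conn.execute(
        "INSERT INTO queue (module_name, module_type, metadata, data) "
        "VALUES (?, ?, ?, ?)",
        (
            metadata["module"],
            metadata.get("type"),
            json.dumps(metadata),
            json.dumps(data) if data is not None else None,
        ),
    )

def dequeue_pairs(conn):
    """Return (metadata, data) pairs in insertion order."""
    rows = conn.execute("SELECT metadata, data FROM queue ORDER BY rowid").fetchall()
    return [
        (json.loads(meta), json.loads(data) if data is not None else None)
        for meta, data in rows
    ]

enqueue(conn, {"module": "logcollector", "type": "file"},
        {"event": {"original": "Testing message!"}})
enqueue(conn, {"module": "inventory", "type": "network",
               "operation": "delete", "id": "asdfsdfkdsj98237498325"}, None)
pairs = dequeue_pairs(conn)
print(pairs)
```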
cborla commented 2 weeks ago

Update 6/11

cborla commented 2 weeks ago

Update 7/11

cborla commented 2 weeks ago

Update 8/11

Stateless

```json
{
  "module": "logcollector",
  "type": "file"
}
{
  "log": {
    "file": {
      "path": "string"
    }
  },
  "tags": ["string"],
  "event": {
    "original": "string",
    "ingested": "string",
    "module": "string",
    "provider": "string"
  }
}
```

Example

```json
{
  "module": "logcollector",
  "type": "file"
}
{
  "event": {
    "ingested": "",
    "module": "logcollector",
    "original": "Testing message!",
    "provider": "syslog"
  },
  "log": {
    "file": {
      "path": "/tmp/test.log"
    }
  },
  "tags": ["mvp"]
}
```

Update 11/11

Example event collected with mock server.

```request
[2024-11-12 02:25:06] POST /api/v1/events/stateless
Headers:
Host: localhost
User-Agent: WazuhXDR/5.0.0 (Endpoint; x86_64; Linux)
Accept: application/json
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJ3YXp1aCIsImF1ZCI6IldhenVoIENvbW11bmljYXRpb25zIEFQSSIsImlhdCI6MTczMTM3ODI5NCwiZXhwIjoxNzMxMzc4MzU0LCJ1dWlkIjoiZWRhYjllZjYtZjAyZC00YTRiLWJhYTQtZjJhZDEyNzg5ODkwIn0.aiAqjq2Nm9giF7jKGz8L8rsA1JXb5L25rNuKUZvwLAg
Content-Type: application/json
Content-Length: 404
Body:
{"agent":{"groups":[],"host":{"architecture":"x86_64","hostname":"chb-VBox","ip":"10.0.2.5","os":{"name":"Ubuntu","platform":"Linux"}},"id":"ee9009ba-f2db-4ac4-a74f-77f52c2d421a","type":"Endpoint","version":"5.0.0"}}
{"module":"logcollector","type":"file"}
{"event":{"ingested":"","module":"logcollector","original":"hola wazuh","provider":"syslog"},"log":{"file":{"path":"/tmp/test.log"}},"tags":["mvp"]}
```
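
On the receiving side, a body like the one captured above could be split back into the agent metadata and the (module metadata, data) pairs. The Python sketch below uses `json.JSONDecoder.raw_decode`, so it works whether the objects are separated by newlines or spaces. `parse_stateless_body` is a hypothetical helper, and it assumes every module metadata object is followed by exactly one data object, as in this stateless example.

```python
import json

def parse_stateless_body(body):
    """Split a body of concatenated JSON objects into agent metadata and pairs."""
    decoder = json.JSONDecoder()
    objects, pos = [], 0
    while pos < len(body):
        if body[pos].isspace():
            # raw_decode does not skip leading whitespace, so do it here.
            pos += 1
            continue
        obj, pos = decoder.raw_decode(body, pos)
        objects.append(obj)
    agent, rest = objects[0], objects[1:]
    # Assumption: metadata and data objects strictly alternate.
    pairs = list(zip(rest[0::2], rest[1::2]))
    return agent, pairs

body = (
    '{"agent":{"groups":[],"id":"ee9009ba-f2db-4ac4-a74f-77f52c2d421a",'
    '"type":"Endpoint","version":"5.0.0"}}\n'
    '{"module":"logcollector","type":"file"}\n'
    '{"event":{"ingested":"","module":"logcollector","original":"hola wazuh",'
    '"provider":"syslog"},"log":{"file":{"path":"/tmp/test.log"}},"tags":["mvp"]}'
)
agent, pairs = parse_stateless_body(body)
print(agent["agent"]["id"])
print(pairs)
```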

Update 12/11