elastic / cloudbeat

Analyzing Cloud Security Posture

[GCP] Handle empty cloud.account.name and cloud.account.id fields on CSPM GCP findings #2053

Closed: opauloh closed this issue 5 months ago

opauloh commented 7 months ago

Motivation

While investigating this Kibana issue, we found that GCP Benchmark Rule 2.3 produces findings with empty cloud.account.name and cloud.account.id fields. Having no cloud.account.* data is expected behaviour here, since the finding relates to a misconfiguration at the organization level and is not attached to any account. However, it would make more sense for the fields to be missing rather than set to empty strings, as that makes a difference in how the data is handled.

Definition of done

When there's no cloud.account.id or cloud.account.name data, cloudbeat should send missing fields instead of empty.

Related tasks/epics

Screenshots


Document example:

```json
{
  "agent": {
    "name": "cspm-long-run-bc4",
    "id": "c2f88f91-6bdd-487f-9792-6448be8be538",
    "type": "cloudbeat",
    "ephemeral_id": "d0ff7adb-20ce-4ba3-b579-5278b8525ca8",
    "version": "8.13.0"
  },
  "resource": {
    "sub_type": "gcp-logging-log-bucket",
    "name": "organizations/693506308612/locations/global/buckets/_Required",
    "raw": {
      "AccessContextPolicy": null,
      "update_time": {
        "seconds": 1692104034,
        "nanos": 374657000
      },
      "resource": {
        "parent": "//cloudresourcemanager.googleapis.com/organizations/693506308612",
        "data": {
          "lifecycleState": "ACTIVE",
          "retentionDays": 400,
          "name": "organizations/693506308612/locations/global/buckets/_Required",
          "description": "Audit bucket",
          "locked": true
        },
        "location": "global",
        "discovery_name": "LogBucket",
        "version": "v2",
        "discovery_document_uri": "https://logging.googleapis.com/$discovery/rest"
      },
      "asset_type": "logging.googleapis.com/LogBucket",
      "name": "//logging.googleapis.com/organizations/693506308612/locations/global/buckets/_Required",
      "ancestors": [
        "organizations/693506308612"
      ]
    },
    "id": "//logging.googleapis.com/organizations/693506308612/locations/global/buckets/_Required",
    "type": "cloud-storage",
    "region": "global"
  },
  "cloud_security_posture": {
    "package_policy": {
      "id": "4fc67339-4840-404e-8150-419ef511ab53",
      "revision": 2
    }
  },
  "elastic_agent": {
    "id": "c2f88f91-6bdd-487f-9792-6448be8be538",
    "version": "8.13.0",
    "snapshot": false
  },
  "rule": {
    "references": "1. https://cloud.google.com/storage/docs/bucket-lock\n2. https://cloud.google.com/storage/docs/using-bucket-lock\n3. https://cloud.google.com/storage/docs/bucket-lock",
    "impact": "Locking a bucket is an irreversible action. Once you lock a bucket, you cannot remove the retention policy from the bucket or decrease the retention period for the policy. You will then have to wait for the retention period for all items within the bucket before you can delete them, and then the bucket.",
    "description": "Enabling retention policies on log buckets will protect logs stored in cloud storage buckets from being overwritten or accidentally deleted.\nIt is recommended to set up retention policies and configure Bucket Lock on all storage buckets that are used as log sinks.",
    "section": "Logging and Monitoring",
    "default_value": "",
    "version": "1.0",
    "rationale": "Logs can be exported by creating one or more sinks that include a log filter and a destination.\nAs Cloud Logging receives new log entries, they are compared against each sink.\nIf a log entry matches a sink's filter, then a copy of the log entry is written to the destination.\n\nSinks can be configured to export logs in storage buckets.\nIt is recommended to configure a data retention policy for these cloud storage buckets and to lock the data retention policy; thus permanently preventing the policy from being reduced or removed.\nThis way, if the system is ever compromised by an attacker or a malicious insider who wants to cover their tracks, the activity logs are definitely preserved for forensics and security investigations.",
    "benchmark": {
      "name": "CIS Google Cloud Platform Foundation",
      "rule_number": "2.3",
      "id": "cis_gcp",
      "version": "v2.0.0",
      "posture_type": "cspm"
    },
    "tags": [
      "CIS",
      "GCP",
      "CIS 2.3",
      "Logging and Monitoring"
    ],
    "remediation": "**From Google Cloud Console**\n\n1. If sinks are **not** configured, first follow the instructions in the recommendation: `Ensure that sinks are configured for all Log entries`.\n\n2. For each storage bucket configured as a sink, go to the Cloud Storage browser at `https://console.cloud.google.com/storage/browser/<BUCKET_NAME>`.\n\n3. Select the Bucket Lock tab near the top of the page.\n\n4. In the Retention policy entry, click the Add Duration link. The `Set a retention policy` dialog box appears.\n\n5. Enter the desired length of time for the retention period and click `Save policy`.\n\n6. Set the `Lock status` for this retention policy to `Locked`.\n\n**From Google Cloud CLI**\n\n7. To list all sinks destined to storage buckets:\n```\ngcloud logging sinks list --folder=FOLDER_ID | --organization=ORGANIZATION_ID | --project=PROJECT_ID\n```\n8. For each storage bucket listed above, set a retention policy and lock it:\n```\ngsutil retention set [TIME_DURATION] gs://[BUCKET_NAME]\ngsutil retention lock gs://[BUCKET_NAME]\n```\n\nFor more information, visit [https://cloud.google.com/storage/docs/using-bucket-lock#set-policy](https://cloud.google.com/storage/docs/using-bucket-lock#set-policy).",
    "audit": "**From Google Cloud Console**\n\n1. Open the Cloud Storage browser in the Google Cloud Console by visiting [https://console.cloud.google.com/storage/browser](https://console.cloud.google.com/storage/browser).\n\n2. In the Column display options menu, make sure `Retention policy` is checked.\n\n3. In the list of buckets, the retention period of each bucket is found in the `Retention policy` column. If the retention policy is locked, an image of a lock appears directly to the left of the retention period.\n\n**From Google Cloud CLI**\n\n4. To list all sinks destined to storage buckets:\n```\ngcloud logging sinks list --folder=FOLDER_ID | --organization=ORGANIZATION_ID | --project=PROJECT_ID\n```\n5. For every storage bucket listed above, verify that retention policies and Bucket Lock are enabled:\n```\ngsutil retention get gs://BUCKET_NAME\n```\n\nFor more information, see [https://cloud.google.com/storage/docs/using-bucket-lock#view-policy](https://cloud.google.com/storage/docs/using-bucket-lock#view-policy).",
    "name": "Ensure That Retention Policies on Cloud Storage Buckets Used for Exporting Logs Are Configured Using Bucket Lock",
    "id": "1e4f8b50-90e4-5e99-8a40-a21b142eb6b4",
    "profile_applicability": "* Level 2"
  },
  "message": "Rule \"Ensure That Retention Policies on Cloud Storage Buckets Used for Exporting Logs Are Configured Using Bucket Lock\": passed",
  "error": {
    "message": [
      "field [cluster_id] not present as part of path [cluster_id]"
    ]
  },
  "result": {
    "evaluation": "passed",
    "evidence": {
      "parent": "//cloudresourcemanager.googleapis.com/organizations/693506308612",
      "data": {
        "lifecycleState": "ACTIVE",
        "retentionDays": 400,
        "name": "organizations/693506308612/locations/global/buckets/_Required",
        "description": "Audit bucket",
        "locked": true
      },
      "location": "global",
      "discovery_name": "LogBucket",
      "version": "v2",
      "discovery_document_uri": "https://logging.googleapis.com/$discovery/rest"
    },
    "expected": null
  },
  "cloud": {
    "Organization": {
      "name": "cspelastic.com",
      "id": "693506308612"
    },
    "provider": "gcp",
    "account": {
      "name": "",
      "id": ""
    }
  },
  "@timestamp": "2024-03-22T18:55:52.104Z",
  "cloudbeat": {
    "commit_sha": "6d80dede46afaa7d1a6dc0241a4e52a28120d598",
    "commit_time": "2024-03-05T10:54:27Z",
    "version": "8.13.0",
    "policy": {
      "commit_sha": "6d80dede46afaa7d1a6dc0241a4e52a28120d598",
      "commit_time": "2024-03-05T10:54:27Z",
      "version": "8.13.0"
    }
  },
  "ecs": {
    "version": "8.6.0"
  },
  "data_stream": {
    "namespace": "default",
    "type": "logs",
    "dataset": "cloud_security_posture.findings"
  },
  "host": {
    "name": "cspm-long-run-bc4"
  },
  "event": {
    "agent_id_status": "auth_metadata_missing",
    "sequence": 1711133751,
    "ingested": "2024-03-22T18:59:48Z",
    "kind": "pipeline_error",
    "created": "2024-03-22T18:55:52.104107311Z",
    "id": "bcdcb7cd-eb2b-4cce-a4e5-efcf48c530b5",
    "type": [
      "info"
    ],
    "category": [
      "configuration"
    ],
    "dataset": "cloud_security_posture.findings",
    "outcome": "success"
  }
}
```
orouz commented 7 months ago

@opauloh should we send null when the values are missing or not send the fields at all?

opauloh commented 7 months ago

@opauloh should we send null when the values are missing or not send the fields at all?

@orouz It's better not to send the fields at all, which also establishes consistent behaviour with other fields (for example, Cloudbeat does not send cloud.Organization.id when it has no organization information).

oren-zohar commented 7 months ago

When there's no cloud.account.id or cloud.account.name data, cloudbeat should send missing fields instead of empty.

This breaks the dashboards though, no?

it would make more sense for the fields to be missing rather than set to empty strings, as that makes a difference in how the data is handled.

In what way? What are we trying to fix here @opauloh?

opauloh commented 7 months ago

When there's no cloud.account.id or cloud.account.name data, cloudbeat should send missing fields instead of empty.

This breaks the dashboards though, no?

There is a work in progress to handle missing fields in Kibana for 8.14 +

opauloh commented 7 months ago

it would make more sense for the fields to be missing rather than set to empty strings, as that makes a difference in how the data is handled.

In what way? What are we trying to fix here @opauloh?

The presence of empty strings in these fields introduces ambiguity about what the data represents. When a field is set to an empty string, it implies that the field exists but holds no value (for example, a cloud.account.name equal to an empty string leaves room to interpret that the organization really does contain a cloud account whose name was set to an empty string). The absence of the field altogether, on the other hand, indicates that the information is not applicable or not available.

Therefore, we propose that when there is no cloud.account.id or cloud.account.name data available, Cloudbeat omit those fields entirely. Implementing this adjustment in Cloudbeat's data transmission logic would improve the integrity and usability of the data generated by our native integration.
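To make the query-time difference concrete, here is a minimal sketch (not from the original discussion; it assumes the ECS keyword mapping for cloud.account.id): an Elasticsearch exists query matches a keyword field indexed as an empty string, but not a field that is absent from the document, so a query meant to select these organization-level findings behaves differently in the two cases:

```json
{
  "query": {
    "bool": {
      "must_not": [
        { "exists": { "field": "cloud.account.id" } }
      ]
    }
  }
}
```

With empty strings indexed, this query silently skips the organization-level findings; with the fields omitted, it selects them as intended.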

orouz commented 7 months ago

@opauloh

in aws / azure we do this: https://github.com/elastic/cloudbeat/blob/392f969a2e812617f4c5fdbcf748c9709a336d9b/internal/dataprovider/providers/cloud/data_provider.go#L63-L64

if GCP did the same, would it solve the issue? (if so, then this would be fixed in https://github.com/elastic/cloudbeat/pull/2085)

I'm a bit unsure, because you say we shouldn't send empty fields at all, but that code is from a PR that fixed crashing dashboards caused by missing fields, by sending empty-string fields ("") when we have no value to send.

In any case, whatever we decide here (sending fields without data as an empty string, or not sending those fields at all) should be done for all cloud vendors.

opauloh commented 7 months ago

@opauloh

in aws / azure we do this:

https://github.com/elastic/cloudbeat/blob/392f969a2e812617f4c5fdbcf748c9709a336d9b/internal/dataprovider/providers/cloud/data_provider.go#L63-L64

if GCP did the same, would it solve the issue? (if so, then this would be fixed in #2085)

I'm a bit unsure, because you say we shouldn't send empty fields at all, but that code is from a PR that fixed crashing dashboards caused by missing fields, by sending empty-string fields ("") when we have no value to send.

Thanks for sharing the PR with the fix. That was true and necessary for Kibana 8.13 and lower, since we were not considering the use case of findings with organization data that do not relate to a specific cloud account. We have now fixed that for 8.14+ in this PR, so this behaviour can be reverted to use insertIfNotEmpty.

In any case, whatever we decide here (sending fields without data as an empty string, or not sending those fields at all) should be done for all cloud vendors.

I agree; all cloud vendors can use insertIfNotEmpty to ensure consistent behaviour.
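Concretely, using the cloud block from the document example above, the proposed change would turn this:

```json
"cloud": {
  "Organization": { "name": "cspelastic.com", "id": "693506308612" },
  "provider": "gcp",
  "account": { "name": "", "id": "" }
}
```

into this (assuming the now-empty account object is dropped entirely rather than left as {}):

```json
"cloud": {
  "Organization": { "name": "cspelastic.com", "id": "693506308612" },
  "provider": "gcp"
}
```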

opauloh commented 5 months ago

Closing, as this was addressed here with ingest pipelines.
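For reference, a minimal sketch of the kind of ingest pipeline processors that can strip these fields; the field names match this issue, but the actual pipeline shipped in the fix may differ:

```json
{
  "processors": [
    {
      "remove": {
        "field": "cloud.account.id",
        "if": "ctx.cloud?.account?.id == ''",
        "ignore_missing": true
      }
    },
    {
      "remove": {
        "field": "cloud.account.name",
        "if": "ctx.cloud?.account?.name == ''",
        "ignore_missing": true
      }
    }
  ]
}
```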

amirbenun commented 3 months ago

Verified:

- Starting with the GCP filter, findings are present.
- Searching for cloud.account.name : "" returns no findings.

https://github.com/user-attachments/assets/9af4cfcb-5c01-4f4b-9fd7-38a9e0c7ef74