elastic / integrations

Elastic Integrations
https://www.elastic.co/integrations

[Microsoft 365] Pipeline failure resulting in incorrect root-level fields #9920

Open jvalente-salemstate opened 4 months ago

jvalente-salemstate commented 4 months ago

This appears to have been introduced in v2.0.0 or v2.1.0, since it only began showing up in mid-January. See #8571 and #8803 for the relevant changes.

This occurs almost entirely for events with o365.audit.record.type: 64, which are Automated Investigation & Response (AIR) events. When the pipeline fails, it stops before renaming the fields nested under o365audit to o365.audit, so these events do not match any queries on those fields.

Many events do end up parsing correctly, but I have roughly 9,200 since January 22nd with an error.message that initially contained:

Duplicate field 'QueryTime' at [Source: (org.elasticsearch.common.io.stream.ByteBufferStreamInput); line: x, column: y]

At some point after, this became:

//Duplicate field 'QueryTime' at [Source (String)"
{
  "Version": "3.0",
  "VendorName": "Microsoft",
  "ProviderName": "OATP",
  "AlertType": "8e6ba277-ef39-404e-aaf1-294f6d9a2b88",
  "StartTimeUtc": "2024-05-18T18:48:51.2937397Z",
  "EndTimeUtc": "2024-05-18T18:48:51.2937397Z",
  "TimeGenerated": "2024-05-18T18:51:31.44Z",
  "ProcessingEndTime": "2024-05-18T19:18:50.6990119Z",
  "Status": "Resolved",
  "DetectionTechnology": "UrlReputation",
  "Severity": "Informational",
  "ConfidenceLevel": "Unknown",
  "ConfidenceScore": 1.0,
  "IsIncident": false,
  "ProviderAlertId": "01234567-ab10-1337-b000-abcd..."

// [truncated 10638 chars]; line: 1, column: 4841]

Formatting edited for readability

The AIR events contain an array of entities related to an alert. An alert can have multiple instances of one entity type, and it seems that most, if not all, of the failures occur when the alert has more than one mail cluster. A cluster may be keyed on sender+IP+subject, sender+attachments, etc. (its ID is analogous to the output of the fingerprint processor in Elastic, computed from those values). Since a field like QueryTime exists in each cluster, the above error is thrown.
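To narrow down where the duplicate sits, the raw Data string can be scanned before a strict parser sees it. Strict duplicate detection normally flags a key only when it repeats inside the same JSON object, so it may be worth checking whether QueryTime appears twice within a single entity rather than once per entity. The following is a minimal Python sketch, not part of the integration; `find_duplicate_keys` and both trimmed payloads are hypothetical and only illustrate the check.

```python
import json
from collections import Counter


def find_duplicate_keys(raw):
    """Return (key, count) for every key repeated within a single JSON object."""
    duplicates = []

    def hook(pairs):
        # pairs holds every key/value of one object, including repeated keys.
        counts = Counter(key for key, _ in pairs)
        duplicates.extend((k, n) for k, n in counts.items() if n > 1)
        return dict(pairs)  # last value wins, which is json.loads' default

    json.loads(raw, object_pairs_hook=hook)
    return duplicates


# Hypothetical, heavily trimmed payloads (assumptions, not real event data).
once_per_entity = '{"Entities": [{"QueryTime": "a"}, {"QueryTime": "b"}]}'
twice_in_entity = '{"Entities": [{"QueryTime": "a", "QueryTime": "5/18/2024 6:28:51 PM"}]}'

print(find_duplicate_keys(once_per_entity))   # []
print(find_duplicate_keys(twice_in_entity))   # [('QueryTime', 2)]
```

Running this against an unredacted o365audit.Data value would show which object carries the repeated key, which the snipped sample cannot.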

Sample data:

{
  "Version": "3.0",
  "VendorName": "Microsoft",
  "ProviderName": "OATP",
  "AlertType": "8e6ba277-ef39-404e-aaf1-294f6d9a2b88",
  "StartTimeUtc": "2024-05-18T18:19:54.5065341Z",
  "EndTimeUtc": "2024-05-18T18:19:54.5065341Z",
  "TimeGenerated": "2024-05-18T18:23:38.33Z",
  "ProcessingEndTime": "2024-05-18T18:30:52.6561751Z",
  "Status": "InProgress",
  "DetectionTechnology": "UrlReputation",
  "Severity": "Informational",
  "ConfidenceLevel": "Unknown",
  "ConfidenceScore": 1,
  "IsIncident": false,
  "ProviderAlertId": "u",
  "SystemAlertId": null,
  "CorrelationKey": "x",
  "Investigations": [
    {
      "$id": "1",
      "Id": "urn:ZappedUrlInvestigation:x",
      "InvestigationStatus": "Running"
    }
  ],
  "InvestigationIds": [
    "urn:ZappedUrlInvestigation:x"
  ],
  "Intent": "Probing",
  "ResourceIdentifiers": [
    {
      "$id": "2",
      "AadTenantId": "x",
      "Type": "AAD"
    }
  ],
  "AzureResourceId": null,
  "WorkspaceId": null,
  "WorkspaceSubscriptionId": null,
  "WorkspaceResourceGroup": null,
  "AgentId": null,
  "AlertDisplayName": "Email messages containing malicious URL removed after delivery​",
  "Description": "Emails with malicious URL that were delivered and later removed -V1.0.0.3",
  "ExtendedLinks": [
    {
      "Href": "https://security.microsoft.com/alerts/x",
      "Category": null,
      "Label": "alert",
      "Type": "webLink"
    }
  ],
  "Metadata": {
    "CustomApps": null,
    "GenericInfo": null
  },
  "Entities": [
    {
      "$id": "3",
      "Recipient": "x",
      "Urls": [
        "a",
        "a",
        "c",
        "d"
      ],
      "Threats": [
        "ZapPhish",
        "Spam",
        "HighConfPhish"
      ],
      "Sender": "y",
      "P1Sender": "x",
      "P1SenderDomain": "nx",
      "SenderIP": "x.x.x.6",
      "P2Sender": "",
      "P2SenderDisplayName": "blood pressure solution",
      "P2SenderDomain": "m",
      "ReceivedDate": "2024-05-17T16:16:53",
      "NetworkMessageId": "l",
      "InternetMessageId": "ol",
      "Subject": "External: 1 food that kills high blood pressure",
      "AntispamDirection": "Inbound",
      "DeliveryAction": "DeliveredAsSpam",
      "ThreatDetectionMethods": [
        "URLList"
      ],
      "Language": "en",
      "DeliveryLocation": "JunkFolder",
      "OriginalDeliveryLocation": "JunkFolder",
      "AdditionalActionsAndResults": [
        "OriginalDelivery: [N/A]"
      ],
      "AuthDetails": [
        {
          "Name": "SPF",
          "Value": "Pass"
        },
        {
          "Name": "DKIM",
          "Value": "Pass"
        },
        {
          "Name": "DMARC",
          "Value": "Pass"
        },
        {
          "Name": "Comp Auth",
          "Value": "pass"
        }
      ],
      "SystemOverrides": [],
      "Type": "mailMessage",
      "Urn": "urn:MailEntity:l",
      "Source": "OATP",
      "FirstSeen": "2024-05-18T18:25:40"
    },
    {
      "$id": "5",
      "MailboxPrimaryAddress": "ou",
      "Upn": "ou",
      "AadId": "7",
      "RiskLevel": "None",
      "Type": "mailbox",
      "Urn": "urn:UserEntity:l",
      "Source": "OATP",
      "FirstSeen": "2024-05-18T18:25:40"
    },
    {
      "$id": "7",
      "NetworkMessageIds": [
        "9a",
        "aa"
      ],
      // snip
      "Query": "( ((NormalizedUrl:\"l\") AND (ContentType: 1)) AND NOT(XmiInfoTenantPolicyFinalVerdictSource:PhishEdu) AND NOT(XmiInfoTenantPolicyFinalVerdictSource:SecOps))",
      "QueryTime": "5/18/2024 6:28:51 PM",
      "MailCount": 13,
      "IsVolumeAnamoly": false,
      "ClusterSourceIdentifier": "l",
      "ClusterSourceType": "UrlThreatIndicator",
      "ClusterQueryStartTime": "2024-04-28T00:00:00Z",
      "ClusterQueryEndTime": "2024-05-18T18:28:51.0798219Z",
      "ClusterGroup": "UrlThreatIdentifier",
      "Type": "mailCluster",
      "ClusterBy": "NormalizedUrl;ContentType",
      "ClusterByValue": "l;1",
      "QueryStartTime": "4/28/2024 12:00:00 AM",
      "Urn": "urn:MailClusterEntity:y",
      "Source": "OATP",
      "FirstSeen": "2024-05-18T18:28:53"
    },
    {
      "$id": "8",
      "NetworkMessageIds": [
        "l",
        "z"
      ],
      // snip
      "Query": "( (( (BodyFingerprintBin1:\"31545xxxxx\") ) AND ( (P2SenderDomain:\"l\") ) AND ( (ContentType: 1) )) AND NOT(XmiInfoTenantPolicyFinalVerdictSource:PhishEdu) AND NOT(XmiInfoTenantPolicyFinalVerdictSource:SecOps))",
      "QueryTime": "5/18/2024 6:28:55 PM",
      "MailCount": 7,
      "IsVolumeAnamoly": false,
      "ClusterSourceIdentifier": "",
      "ClusterSourceType": "Similarity",
      "ClusterQueryStartTime": "2024-04-28T00:00:00Z",
      "ClusterQueryEndTime": "2024-05-18T18:28:55.1110657Z",
      "ClusterGroup": "BodyFingerprintBin1,P2SenderDomain",
      "Type": "mailCluster",
      "ClusterBy": "BodyFingerprintBin1;P2SenderDomain;ContentType",
      "ClusterByValue": "l;l;1",
      "QueryStartTime": "4/28/2024 12:00:00 AM",
      "Urn": "urn:MailClusterEntity:l",
      "Source": "OATP",
      "FirstSeen": "2024-05-18T18:29:00"
    },
    {
      "$id": "12",
      "NetworkMessageIds": [
        "l3",
        "al"
      ],
      // snip
      "Query": "( ((NormalizedUrl:\"hl\") AND (ContentType: 1)) AND NOT(XmiInfoTenantPolicyFinalVerdictSource:PhishEdu) AND NOT(XmiInfoTenantPolicyFinalVerdictSource:SecOps))",
      "QueryTime": "5/18/2024 6:29:02 PM",
      "MailCount": 13,
      "IsVolumeAnamoly": false,
      "ClusterSourceIdentifier": "l",
      "ClusterSourceType": "UrlThreatIndicator",
      "ClusterQueryStartTime": "2024-04-28T00:00:00Z",
      "ClusterQueryEndTime": "2024-05-18T18:29:02.3454248Z",
      "ClusterGroup": "UrlThreatIdentifier",
      "Type": "mailCluster",
      "ClusterBy": "NormalizedUrl;ContentType",
      "ClusterByValue": "l;1",
      "QueryStartTime": "4/28/2024 12:00:00 AM",
      "Urn": "urn:MailClusterEntity:l",
      "Source": "OATP",
      "FirstSeen": "2024-05-18T18:29:05"
    }
  ],
  "LogCreationTime": "2024-05-18T18:30:52.6561751Z",
  "MachineName": "x",
  "SourceTemplateType": "Threat_Single",
  "Category": "ThreatManagement",
  "SourceAlertType": "System"
}

elasticmachine commented 4 months ago

Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)

lucabelluccini commented 2 months ago

It fails at:

{
  "processor_type": "json",
  "status": "error",
  "if": {
    "condition": "ctx.o365audit?.containsKey('Data') == true",
    "result": true
  },
  "error": {
    "root_cause": [
      {
        "type": "x_content_parse_exception",

A possible workaround to make the json decoding work would be to update the ingest pipeline logs-o365.audit-... with:

{
  "gsub": {
    "field": "o365audit.Data",
    "pattern": "\"QueryTime\":\"[0-9/]+ [0-9:]+ [AP]M\",",
    "replacement": "",
    "if": "ctx.o365audit?.containsKey('Data') == true && ctx.o365audit?.RecordType == '64'"
  }
},

Just before:

...
{
  "json": {
    "field": "o365audit.Data",
    "if": "ctx.o365audit?.containsKey('Data') == true"
  }
},
...

The gsub processor removes every occurrence of QueryTime that uses the non-ISO 8601 format, so the json processor no longer sees it. The condition limits the patch to events with RecordType == 64.
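The effect of the pattern can be tried outside Elasticsearch first. The sketch below applies the same regular expression in Python to a hypothetical compact payload. It assumes the failing entity carries one ISO 8601 QueryTime and one in the M/D/YYYY h:mm:ss AM/PM form; that layout is an assumption, since the sample above is snipped.

```python
import json
import re

# Same pattern as the gsub processor above, written as a Python raw string.
QUERY_TIME_NON_ISO = re.compile(r'"QueryTime":"[0-9/]+ [0-9:]+ [AP]M",')

# Hypothetical compact payload (assumption): QueryTime twice in one entity.
raw = (
    '{"Entities":[{"QueryTime":"2024-05-18T18:28:51Z",'
    '"QueryTime":"5/18/2024 6:28:51 PM","MailCount":13}]}'
)

patched = QUERY_TIME_NON_ISO.sub("", raw)
entity = json.loads(patched)["Entities"][0]
print(entity)  # {'QueryTime': '2024-05-18T18:28:51Z', 'MailCount': 13}

# The pattern expects no space after the colon and a trailing comma,
# so a pretty-printed key/value pair is left untouched.
pretty = '"QueryTime": "5/18/2024 6:28:51 PM",'
print(QUERY_TIME_NON_ISO.search(pretty))  # None
```

Because the pattern requires a trailing comma, a non-ISO QueryTime that happens to be the last key of its object would also survive the substitution.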