Closed defendable-forfot closed 9 months ago
This issue doesn't have a Team:<team> label.
I so much agree with you on this! I can open a pull request and try to work on this issue if nobody else has time
However, it would help a lot if you could add an example event for each of the problems. Of course I can dig through my own O365, but it's not proving easy ;) For example, the first issue you mention does not affect me, because my user IDs come in the user@domain.tld format.
Pinging @elastic/security-external-integrations (Team:Security-External Integrations)
I second this fully. The 365 audit data to ECS field extractions should really be improved as currently it is very hard to work with and customisations have to be made in order for the Elastic Alerts and data to be usable by Security Analysts working in the SIEM.
Another issue to focus on as we work through O365 improvements:
Additional feedback:
Using the standard o365 integration audit logs, the field o365.audit.Data contains JSON data that is pertinent to the event. The issue is that this field is mapped as a keyword and is not processed further. This field needs to be flattened, and the keys of the JSON object should also be ingested as individual fields. This will allow for the better alert analysis required by humans.
Suggested mappings:
IP
o365.audit.Data.sip - ip
Date
o365.audit.Data.ts - date
o365.audit.Data.te - date
o365.audit.Data.at - date
o365.audit.Data.ttdt - date
o365.audit.Data.md - date
Keyword
o365.audit.Data.tid - keyword
o365.audit.Data.lon - keyword
o365.audit.Data.op - keyword
o365.audit.Data.an - keyword
o365.audit.Data.ad - keyword
o365.audit.Data.sev - keyword
o365.audit.Data.rid - keyword
o365.audit.Data.reid - keyword
o365.audit.Data.cid - keyword
o365.audit.Data.tht - keyword
o365.audit.Data.etype - keyword
o365.audit.Data.eid - keyword
o365.audit.Data.f3u - keyword
o365.audit.Data.als - keyword
o365.audit.Data.wl - keyword
o365.audit.Data.ut - keyword
o365.audit.Data.suid - keyword
o365.audit.Data.ail - keyword
o365.audit.Data.von - keyword
o365.audit.Data.sitmi - keyword
o365.audit.Data.dpn - keyword
o365.audit.Data.trc - keyword
o365.audit.Data.aii - keyword
o365.audit.Data.tsd - keyword
o365.audit.Data.ms - keyword
o365.audit.Data.dm - keyword
o365.audit.Data.ttr - keyword
o365.audit.Data.tpt - keyword
o365.audit.Data.tpid - keyword
o365.audit.Data.thn - keyword
o365.audit.Data.imsgid - keyword
o365.audit.Data.fvs - keyword
o365.audit.Data.zu - keyword
o365.audit.Data.pud - keyword
o365.audit.Data.sict - keyword
o365.audit.Data.plk - keyword
o365.audit.Data.mat - keyword
o365.audit.Data.alk - keyword
o365.audit.Data.zmfn - keyword
o365.audit.Data.zmfh - keyword
o365.audit.Data.zfn - keyword
o365.audit.Data.zfh - keyword
o365.audit.Data.sid - keyword
o365.audit.Data.etps - keyword
o365.audit.Data.upfv - keyword
o365.audit.Data.upfc - keyword
o365.audit.Data.ot - keyword
o365.audit.Data.od - keyword
Keyword - this had no analytical value in my instances, but could be helpful for other customers
o365.audit.Data.tdc - keyword
o365.audit.Data.af - keyword
o365.audit.Data.ssic - keyword
o365.audit.Data.cpid - keyword
o365.audit.Data.srt - keyword
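A minimal sketch of the processing requested above, assuming the Data value arrives as a JSON-encoded string. The `flatten_data` helper, the ISO 8601 date assumption, and the sample values are hypothetical illustrations; this is not the integration's actual ingest pipeline, only a way to show the intended result.

```python
import ipaddress
import json
from datetime import datetime

# Subsets of the suggested mappings; every other key is kept as a keyword (string).
IP_KEYS = {"sip"}
DATE_KEYS = {"ts", "te", "at", "ttdt", "md"}


def flatten_data(raw: str) -> dict:
    """Parse the JSON string held in o365.audit.Data into typed, dotted fields."""
    parsed = json.loads(raw)
    out = {}
    for key, value in parsed.items():
        field = f"o365.audit.Data.{key}"
        if key in IP_KEYS:
            # Validate the address; raises ValueError on malformed input.
            out[field] = str(ipaddress.ip_address(value))
        elif key in DATE_KEYS:
            # Assumption: ISO 8601 timestamps. The real format is undocumented.
            out[field] = datetime.fromisoformat(value)
        else:
            # Keyword: keep strings as-is, serialize anything else.
            out[field] = value if isinstance(value, str) else json.dumps(value)
    return out


# Hypothetical sample, not taken from real API output.
sample = '{"sip": "10.0.0.1", "ts": "2020-02-17T20:00:00", "op": "Alert"}'
fields = flatten_data(sample)
```

With fields split out like this, an analyst could filter on o365.audit.Data.sip as an IP or range-query o365.audit.Data.ts as a date instead of searching inside one opaque keyword.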
@defendable-forfot, @WildDogOne & @khalavak,
If you can provide example data for the Data.*, Parameters.User or Parameters.DomainName fields, that would be very helpful.
For example, the data I've seen shows Data.f3u and Data.suid fields having values like user@domain.tld, rather than SecurityComplianceEvent, SecurityComplianceInsights or SecurityComplianceEvent as mentioned in the issue description.
I haven't been able to find documentation of the various Data.* fields, except for the Office 365 Management Activity API schema documentation describing Data as being one of:
If documentation of these individual alert or investigation fields does exist, any tips would be much appreciated.
Below I have attempted to restate and respond to each of @defendable-forfot's suggestions.
Many of the suggestions relate to undocumented fields or values that may vary between environments and for which sample data is not currently available.
The relevant upstream documentation is Office 365 Management Activity API schema. For the o365.audit.Data field we only have a small amount of example data, which is listed under "Known example values for the Data parameter" in https://github.com/elastic/integrations/pull/8571.
In this round of improvements I intend to:
The original suggestions refer to Filebeat's Office 365 module, but I will attempt to apply them to the preferred, Agent-based Microsoft 365 Elastic Integration. Wherever the suggestions don't seem to apply, the change to the Agent-based implementation may explain the mismatch.
Responses to each suggestion are inline in > bold.
o365.audit.Data fields
The most interesting data related to the events seem to be all placed within the o365.audit.Data field. This makes search and extraction of data from the log source difficult. Ideally the parsing should be done directly in the Filebeat module.
> There is a PR to parse and index this data in the Microsoft 365 integration, here: https://github.com/elastic/integrations/pull/8571
related.* ECS fields
ECS could be improved by adding related.domain or related.url fields, to be used by data sources, including the o365 module, that send events with multiple URLs.
> The closest existing field is related.hosts, which is for "All hostnames or other host identifiers seen on your event. Example identifiers include FQDNs, domain names, workstation names, or aliases."
> I've added an ECS issue, Add related.url field, to discuss this proposal further.
o365.audit.Name
When o365.audit.Name exists, its value populates rule.name.
In such cases the message field could also take that value, instead of New alert.
> The ECS message field value description says "For structured logs without an original message field, other fields can be concatenated to form a human-readable summary of the event.".
> Currently, message is set to the value of the incoming field Comments for SecurityComplianceAlerts events (an example value is "New alert"), or the incoming field ExchangeMetadata.Subject for ComplianceDLPExchange events (the value being an email subject line).
> The Comments and Name values could be concatenated into message for a richer description, but this cosmetic improvement would come at the cost of having the Comments value unavailable in its unmodified form. I think it's best not to change this for now.
Unless otherwise indicated, these suggestions relate to the population of the ECS fields user.domain, user.email, user.id, user.name, and related.user.
user.name and user.email values
In some cases, including some involving Exchange, user.name and user.email values have a domain prefix (domainname\) which should be removed and used to populate user.domain.
> Note: this suggestion was given in connection with the o365.audit.UserKey field and the O365 Exchange Suspicious Mailbox Right Delegation detection rule.
> The current logic for populating user.email, user.name, and user.domain will map an incoming value of username@inetdomain.com to "user.email": "username@inetdomain.com", "user.name": "username", and "user.domain": "inetdomain.com".
> Although user.domain is an appropriate field for storing both Windows networking domains and Internet domains, before attempting to extract Windows networking domains from user.name and user.email values I would like to 1) have example data (none of our current examples have the Windows networking domain prefix), and 2) be able to clearly distinguish between a Windows networking domain prefix separated by a backslash and other uses of a backslash (valid email addresses may contain backslashes in the user name).
o365.audit.UserId and o365.audit.UserKey non-user value
Where UserId or UserKey matches /^SecurityCompliance.*/, that value should not be set in user.id.
The actual user data may be available in o365.audit.Data.
> Note: The UserId point was noted as being the case for "record types that are not related to Microsoft Exchange, Azure and SecurityComplianceCenterCommand". There is a large number of such record types.
> Currently, there is no reference to UserKey in the pipeline configuration. Its incoming value is retained as o365.audit.UserKey. The UserId field is renamed to user.id.
> The Management Activity API schema: Common schema documentation describes UserId as "The UPN (User Principal Name) of the user who performed the action (specified in the Operation property) that resulted in the record being logged; for example, my_name@my_domain_name. Note that records for activity performed by system accounts (such as SHAREPOINT\system or NT AUTHORITY\SYSTEM) are also included."
> Although values such as SecurityComplianceAlerts seem to refer to a service or function rather than a user or even a system account, I think the choice of this value for UserId in upstream API logic should not be overridden in the pipeline logic.
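For reference, the check proposed in the suggestion could be sketched as below. This only illustrates the proposal, which the response above declines to adopt; `is_non_user_value` is a hypothetical name, and integration users who want this behavior would need to apply it in their own customizations.

```python
import re

# The pattern from the suggestion: values starting with "SecurityCompliance".
NON_USER = re.compile(r"^SecurityCompliance")


def is_non_user_value(user_id: str) -> bool:
    """Return True when a UserId or UserKey value looks like a service name."""
    return NON_USER.match(user_id) is not None
```

A downstream filter could use this to skip such values when building user-centric views, while leaving user.id itself as delivered by the API.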
o365.audit.Parameters.User has user data
A value in o365.audit.Parameters.User can be put in related.user.
In cases where o365.audit.Workload="Exchange" that value will relate to the user on which the action is being performed.
> Available example data includes values for this field such as:
EURPR01A002.prod.outlook.com/Microsoft Exchange Hosted Organizations/testsiem.onmicrosoft.com/Discovery Management
> I will open a PR for this change. PR: https://github.com/elastic/integrations/pull/8803
o365.audit.Data.* user data
The following fields are suggested to contain user data, in particular when o365.audit.Workload="SecurityComplianceCenter" and o365.audit.RecordType!="24":
o365.audit.Data.f3u - Unless its value is SecurityComplianceEvent.
o365.audit.Data.suid - Unless its value is SecurityComplianceInsights or SecurityComplianceEvent. Also, client.user should be populated only if o365.audit.ClientIP is also populated.
o365.audit.Data.isda - As an array of user objects that should populate related.user.
o365.audit.Data.tsd - Unless its value is <> or a "partially parsed utf8 string". Represents the sender of an email and should populate related.user.
o365.audit.Data.trc - Represents the recipient of an email. Also, client.user should be populated only if o365.audit.ClientIP is also populated.
> Note: The RecordType=24 corresponds to member name "Discovery", described as "Events for eDiscovery activities performed by running content searches and managing eDiscovery cases in the Security & Compliance Center."
> The Data.isda field is not in the list of known fields used for https://github.com/elastic/integrations/pull/8571, but that PR will make its value available in o365.audit.Data.flattened.isda. Before indexing that field directly under o365.audit.Data, it would be good to receive confirmation of its use, and example data.
> Although the presence of an incoming ClientIP suggests there is an initiator of a network connection related to this event, the client.user field set seems redundant when not used to distinguish between an initiator (client) and a responder (server).
> Available example data shows f3u, suid, tsd and trc as having values that match the format of an email address. The user.email and user.id fields could potentially be populated with these values, but given their undocumented and uncertain meaning, I think a better choice is to add values that appear to be email addresses into related.user to aid discovery and allow integration users to do any further interpretation of these values themselves.
> I will open a PR to add f3u, suid, tsd and trc values to related.user when they are in email address format. PR: https://github.com/elastic/integrations/pull/8803
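The approach described in the response above (copy only email-like values into related.user) might look roughly like this. It is a sketch under assumptions: the `related_users` helper and the deliberately loose `EMAIL_LIKE` pattern are hypothetical and are not taken from the linked PR.

```python
import re

# Loose check: exactly one "@", no whitespace, and a dot in the domain part.
EMAIL_LIKE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
CANDIDATE_KEYS = ("f3u", "suid", "tsd", "trc")


def related_users(data: dict) -> list:
    """Collect email-like values from selected Data.* keys, without duplicates."""
    found = []
    for key in CANDIDATE_KEYS:
        value = data.get(key)
        if isinstance(value, str) and EMAIL_LIKE.match(value) and value not in found:
            found.append(value)
    return found


# Hypothetical event: service names and "<>" are skipped, the address is kept once.
users = related_users({
    "f3u": "user@domain.tld",
    "suid": "SecurityComplianceInsights",
    "tsd": "<>",
    "trc": "user@domain.tld",
})
```

Filtering on format rather than on a list of known service names means values such as SecurityComplianceEvent are excluded without the pipeline having to interpret what each undocumented field means.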
Unless otherwise indicated, these suggestions relate to the population of the ECS fields url.domain, url.extension, url.original, url.path, url.scheme, and url.subdomain.
o365.audit.Parameters.DomainName has domain data
When present, use it to populate the relevant ECS fields.
> The Parameters field contains the "name and value for all parameters that were used with the cmdlet". For the Exchange Admin schema this is a cmdlet that is identified in the Operations property. For the Security and Compliance Center schema it is noted this will not include PII.
> There is no example data available for this field. It's unclear whether a DomainName value would refer to the domain of a URL and be suitable for url.domain, or to a Windows Networking domain, which would not. I would want to confirm the meaning of this field and have example data before populating ECS fields with its value.
o365.audit.Data.* URL data
The following fields are suggested to contain URL data:
o365.audit.Data.zu - Whenever it's populated.
o365.audit.Data.reid - Concatenated with Data.rid data, in cases where o365.audit.Data.zu is not populated and the event relates to a URL, not to a file, and in particular when o365.audit.Workload="SecurityComplianceCenter" and o365.audit.RecordType!="24".
o365.audit.Data.alk - When populated it contains a URL for the actual event, which should be used to populate the event.url field.
> In example data, we have reid values of "cannot be shared" (from a public blog post, likely not the value delivered by the API) and "23a5e271-e297-4f35-ff57-08d7b17f5bf2" (from test data). If reid can contain a concatenation of different types of data, it may be difficult to dependably extract a URL from it. For zu and alk we have no example data.
> A URL value from an undocumented field may be easier to use than other values because a URL has a specific, strictly defined format. However, before attempting to extract URLs from zu, alk or other fields I would want to have some example data that confirms their presence.
Hey @chrisberkhout - do you think we can close this issue on the back of the v2.1.0 update to O365, or are there still some outstanding items to address?
I think this is done for now. We can revisit it in the future if we get more feedback and data. The changes made were:
We are ingesting O365 data into our Elasticsearch for search, detection in Elastic Security and visualization through Kibana. However, we have noticed a few areas for improvement within the module. What is most interesting with this module is how data is ingested. The most interesting data related to the events seem to be all placed within the o365.audit.Data field. This makes search and extraction of data from the log source difficult. Ideally the parsing should be done directly in the Filebeat module. We believe there is data within the field that can be used to populate other, more relevant, ECS fields.
Note: we are running filebeat version 8.1.3, but have noticed that none of the newer releases solves our issues.
Additionally, we believe the ECS specification should be improved with the introduction of a new field within the Related fields section. Certain third-party data sources, the O365 module included, send events where multiple URLs are present. An optimal solution would be to add this data to a related.domain or related.url field, neither of which currently exists.
This is a copy of https://discuss.elastic.co/t/office-365-filebeat-module-improve-ecs-utilization/315126, as I was recommended to post this as a GitHub issue instead.