elastic / ecs

Elastic Common Schema
https://www.elastic.co/what-is/ecs
Apache License 2.0

DMARC Fields #593

Closed homerjonathan closed 4 years ago

homerjonathan commented 5 years ago

I am working on processing DMARC files. DMARC reports are used to stop people spoofing emails. Most of the fields map well onto pre-existing ECS fields. However, some fields are specific to DMARC and are not included. I was considering:

dmarc.dkim.selector - https://www.dmarcanalyzer.com/what-is-a-dkim-selector/
dmarc.dkim.result - pass or fail
dmarc.spf.result - pass or fail
dmarc.spf.result.details - Type of fail pass/fail/softfail

I can work on filling out the details. I just wanted feedback if this would be interesting/useful to add, or if I am going in the wrong direction.

webmat commented 5 years ago

Yes I'd like to hear more of what you have in mind.

We eventually want to have more support for email in ECS, and DMARC is part of it.

I'd like to see an example of a complete DMARC event like you envision it (including existing ECS fields & the DMARC fields you envision).

webmat commented 5 years ago

@andrewvc Is checking things around email reputation and DMARC compliance on the horizon for Heartbeat?

andrewvc commented 5 years ago

@webmat it's an interesting idea, but not at the moment. We're trying to stay focused on our core use case.

homerjonathan commented 5 years ago

Perhaps there is an idea there. "Been focused on our core use."

Would it be worth having a "Misc" standard, so we don't interfere with the flow of the core use? Extra fields that are wanted could be put in a "possible but not confirmed" pile. This would free up advocates of the ECS standard to move forward.

Users could then use the standard with a strong advisory warning that this is a proposed standard that could change. You could then split the discussions, leaving the core team to continue with the important stuff.

webmat commented 5 years ago

@homerjonathan The fact that Heartbeat has no immediate plan to focus on this doesn't mean it's not going to be added to ECS, don't worry. I was just trying to see with Andrew if we had a potential synergy there. I'll go with just having an inception for now ;-)

Also, we have been thinking about adding ways to delineate what in ECS is related to which use case, so that people can pick the parts that are relevant to them. We don't have anything for that yet, but it's brewing :-)

So your input is very welcome on how you see support for DMARC. I have experience with email reputation and DMARC monitoring.

homerjonathan commented 5 years ago

Starting to create the fields. What do you feel works better?

dmarc.selector or mail.dmarc.selector

Mail is more expandable in that you could add mail.destination and so on. But we could end up with an overly deep tree structure. dmarc. creates a flatter tree but makes the root busier.

Especially with values such as:

mail.dmarc.spf.result seems too long.

Or perhaps is this nicer

mail.dmarc.spf_result

If we add fields for DMARC, the next step could be, for example, monitoring emails for spam or malware. So a home of mail. may be better. Say we wanted to log that a mail had malware attached: we could use the threat. fields for that, then use mail.from etc. to record details useful to that area.

Thus my recommendation would be to create mail. as the main branch, with mail.dmarc. as the specific location for mail reputation etc.

anhlqn commented 5 years ago

For DMARC I use the free service from Postmark and just keep the fields that they return from their API. Is it necessary to standardize DMARC fields?

homerjonathan commented 5 years ago

We still need a standard for DMARC fields. Some of the fields are bespoke to DMARC, but others, such as the hostname, already have naming conventions. Say you were configuring a SIEM system and saw attacks on an SMTP port in a different log stream. Merging that with the logs from DMARC files that have the same IP address would be useful to highlight. If we kept the DMARC field names, the IP address field might be named slightly differently. The streams could obviously still be linked under different names. But from what I can see, ECS is about uniformity: a shared field naming scheme makes it easier to link separate log streams.

Does that make sense? Or do I have the wrong idea?
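A minimal sketch of the linking idea above, in Python. The event contents are invented for illustration, and the DMARC row key `source_ip` and the helper names are assumptions rather than anything defined in this thread. Once both streams use the same `source.ip` name, correlating them is a plain group-by:

```python
from collections import defaultdict

# Hypothetical events from two log streams (all values are made up).
firewall_events = [
    {"source.ip": "203.0.113.7", "destination.port": 25, "event.action": "denied"},
    {"source.ip": "198.51.100.4", "destination.port": 25, "event.action": "allowed"},
]
# Assumption: the raw DMARC rows carry the sender address under their own key.
dmarc_rows = [
    {"source_ip": "203.0.113.7", "count": 12, "spf": "fail"},
]

def normalize_dmarc(row):
    """Rename the DMARC-specific address key to the shared source.ip name."""
    out = dict(row)
    out["source.ip"] = out.pop("source_ip")
    return out

def correlate(streams):
    """Map each source.ip value to the set of stream names it appears in."""
    by_ip = defaultdict(set)
    for name, events in streams.items():
        for event in events:
            by_ip[event["source.ip"]].add(name)
    return by_ip

streams = {
    "firewall": firewall_events,
    "dmarc": [normalize_dmarc(row) for row in dmarc_rows],
}
# Addresses that show up in both streams.
seen_in_both = sorted(ip for ip, names in correlate(streams).items() if len(names) == 2)
```

Without the rename step, the correlation code would need to know every stream's private name for the same piece of data.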

anhlqn commented 5 years ago

I like email.dmarc.spf_result, which nests the DMARC field set under email. In the email field set discussion, someone suggested email instead of mail.

homerjonathan commented 5 years ago

Agreed. Will change mail to email.

andrewstucki commented 4 years ago

So, having worked in the bowels of DKIM/SPF and DMARC policies for a number of years, just want to point out that having DKIM/SPF fields under a "DMARC" field set doesn't fully make sense.

DKIM and SPF both predate DMARC, and you don't have to have a DMARC policy to institute DKIM or SPF. Rather, DMARC policies tell a mail recipient 1. how to treat DKIM or SPF failures from a given domain, and 2. what From address alignment requirements to apply to said records.

That said, it makes sense to me that DMARC, SPF, and DKIM fields should all be nested at the same level under something like an email field set since, in practice, you can implement any of them without the others.
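As a rough illustration of that layout, here is a sketch in Python. Every field name and value below is hypothetical (none of them are part of ECS), and `flatten` is just a helper written for this example to produce the dotted names used in this thread:

```python
# Hypothetical layout: DMARC, SPF and DKIM as siblings under email.
event = {
    "email": {
        "dmarc": {"policy": "reject", "result": "fail"},
        "spf": {"result": "softfail"},
        "dkim": {"result": "pass", "selector": "s1"},
    }
}

def flatten(obj, prefix=""):
    """Flatten nested dicts into dotted field names."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat

fields = flatten(event)
```

With this shape, an event from a system that only checks SPF can fill in email.spf.* and leave the other two branches out.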

homerjonathan commented 4 years ago

Thanks for your feedback. You're right, a root level of email. for this would be better. I will adjust the design.

anhlqn commented 4 years ago

@andrewstucki I think the reason @homerjonathan nested SPF and DKIM under DMARC is that SPF and DKIM don't provide any kind of reports like DMARC and the pass/fail of SPF and DKIM is recorded in the DMARC reports. However, you have a good point there that both should be nested under email.

homerjonathan commented 4 years ago

I have hit an interesting issue dealing with DKIM records. DKIM is where the headers of an email are signed to prevent change and forgery. The DMARC reports sent can contain zero, one or many DKIM entries. So far I have seen 0, 1 and 2, but no more; I am assuming that more could be received. If you have 1 or 2, it is possible for all of them to pass the DKIM signature test while the overall DKIM result fails. This indicates the message was correctly signed by email servers as it was forwarded around. However, it fails DKIM because none of the signatures belong to the organisation sending the email that the DMARC report is reporting on.
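The situation described above can be sketched in Python as follows. The record shape and function names are assumptions made for this example, and the alignment check is deliberately simplified: it accepts the same domain or a subdomain, whereas real DMARC relaxed alignment compares organizational domains.

```python
def is_aligned(signing_domain, from_domain):
    """Simplified alignment: the same domain, or a subdomain of it."""
    signing, sender = signing_domain.lower(), from_domain.lower()
    return signing == sender or signing.endswith("." + sender)

def overall_dkim(entries, from_domain):
    """Overall DKIM passes only if some passing signature is aligned."""
    ok = any(
        entry["result"] == "pass" and is_aligned(entry["domain"], from_domain)
        for entry in entries
    )
    return "pass" if ok else "fail"

# Two valid signatures added by forwarders, none from the sending organisation.
entries = [
    {"domain": "forwarder.example.net", "result": "pass"},
    {"domain": "lists.example.org", "result": "pass"},
]
result = overall_dkim(entries, "example.com")  # "fail"
```

Both per-signature results are "pass", yet the overall result is "fail", which is why the individual entries and the overall result need to be recorded separately.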

My problem is how this could be recorded in the ECS schema. How do you store multiple entries in this schema standard? Is there an email.dkim.1.domain etc.? Or should the results be compressed, so the domains would be email.dkim.domains = "domain1.com,domain2.com" etc.?

Your thoughts?

webmat commented 4 years ago

It depends on what you're trying to record for a given DKIM entry. If all you need is to store a single value for each (a domain such as "domain1.com"), then you can store them as an array.

myfield = ["domain1.com", "domain2.com"]

Elasticsearch and Kibana deal well with arrays. A search on myfield:domain2.com would return any document that has "domain2.com" in any position in the array.

So if you have one or a very limited number of attributes per entry (e.g. up to 2-3), you can model this with an array, or with 2-3 arrays of the same length.
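A small Python sketch of the "arrays of the same length" idea. The attribute names are hypothetical, and `matches` only imitates the any-position behaviour of a search on an array field; it is not an Elasticsearch call.

```python
# Hypothetical DKIM entries with two attributes each.
entries = [
    {"domain": "domain1.com", "result": "pass"},
    {"domain": "domain2.com", "result": "fail"},
]

def to_parallel_arrays(entries, keys):
    """Build one array per attribute, all in the same order and of the same length."""
    return {key: [entry[key] for entry in entries] for key in keys}

def matches(doc, field, value):
    """Imitate a search on an array field: true if value is at any position."""
    return value in doc[field]

doc = to_parallel_arrays(entries, ["domain", "result"])
found = matches(doc, "domain", "domain2.com")
```

Position i in each array refers to the same original entry, so the arrays have to be written together and kept in the same order.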

If however you have much more than one attribute per entry (e.g. {"domain":"domain1.com", "status":"success", "foo":"baz", ...}), then it depends. Elasticsearch has some support for arrays of objects (start here), but Kibana will not support it very well.
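To illustrate why arrays of objects need care: with the default object mapping, Elasticsearch indexes each attribute of the objects as its own multi-valued field, so the link between values of the same object is lost unless the nested type is used. The Python below only imitates that behaviour; the field names are hypothetical.

```python
def flatten_objects(field, objects):
    """Imitate default object indexing: one multi-valued field per attribute."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat

def both_match(indexed, domain, status):
    """A query on two attributes, evaluated against the flattened fields."""
    return domain in indexed["dkim.domain"] and status in indexed["dkim.status"]

dkim = [
    {"domain": "domain1.com", "status": "success"},
    {"domain": "domain2.com", "status": "failure"},
]
indexed = flatten_objects("dkim", dkim)

# No single entry pairs domain1.com with failure, yet the query matches.
false_hit = both_match(indexed, "domain1.com", "failure")
```

Here `false_hit` is True even though domain1.com succeeded, which is the kind of cross-object match the nested type exists to prevent.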

So far there's exactly one place in ECS where we store arrays of objects, and it's in arrays of DNS answers (see here for a quick visual, or just set up Packetbeat to monitor DNS). We've done this because we actually extract the most interesting part of the answers (array of IPs) to another field, and the array of DNS answer objects is seen more as a "debugging" section that has all of the details of the response. Simple DNS answers have up to 5 attributes per answer (2-3 actually interesting). But with DNSSEC, the number of attributes can go up to over 10.

A quick note on custom fields. Since email.* is not currently part of ECS, you run a risk of conflicting with future versions of ECS by putting your custom fields under a generic term. Check out this documentation WIP to read some thoughts on the subject. That said, if your system is already live and recording, no need to stress about it either: you may simply have to adjust your field names later, if ECS adds fields in a way that conflicts with what you've created.

homerjonathan commented 4 years ago

Hi @webmat. Thanks for the comments, they are very useful.

My intention is to eventually contribute a solution towards the ECS design. Or at least get some of the more basic design completed ready for implementation. So email.* would be where it should live.

As I am currently working through DMARC processing here at my workplace, I can find out through experience which fields are required, so hopefully it will be a successful schema. I am using the current ECS as a template to create a solution with the same design, format and feel as the current system.

webmat commented 4 years ago

We created meta-issue #939 to discuss email support in ECS. Closing in favor of the meta issue.