yeti-platform / yeti

Your Everyday Threat Intelligence
https://yeti-platform.io/
Apache License 2.0

Usage: How are you representing an "event" or a "sighting" ? #152

Closed williamsdr closed 6 years ago

williamsdr commented 6 years ago

I find it useful to track the incidents/events when an observable was noted. Do you/how are you using YETI to reflect this type of detail?

e.g. www. baddomain[.]com was found during an incident on 01 October.

Ultimately, this would link back to the actors, TTP, and malware entities. If www. baddomain[.]com is later observed on 18 October, knowing these may yield trends or support prioritization of response actions.

We do manage our case details separately - so this is not about full case management. However, having a history of observations would be useful in our case.

tomchop commented 6 years ago

That's a good question. A way to track this would be to push a "sightings" context to an observable (you can then reference a date, an incident ID or reference, etc.). Your case management system can leverage Yeti's API to set this; this is what FIR does out of the box.

You can then search for all observables related to a given incident ID or reference by searching for context__reference=1234 (replacing reference by the key you used to set the reference).
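To illustrate the idea, here is a minimal sketch in plain Python. It is an in-memory stand-in, not Yeti's or pyeti's actual API: `Observable`, `add_sighting` and `search_by_context` are hypothetical names, and the `source` / `reference` / `date` keys are only one possible layout for a "sightings" context. The call `search_by_context(..., "reference", "1234")` plays the role of the `context__reference=1234` search.

```python
from dataclasses import dataclass, field


@dataclass
class Observable:
    """Hypothetical stand-in for a stored observable (not Yeti's real model)."""
    value: str
    context: list = field(default_factory=list)


def add_sighting(obs, source, reference, date):
    """Append one sighting record to the observable's context list."""
    obs.context.append({"source": source, "reference": reference, "date": date})


def search_by_context(observables, key, wanted):
    """Return observables with at least one context entry where entry[key] == wanted."""
    return [
        obs for obs in observables
        if any(entry.get(key) == wanted for entry in obs.context)
    ]


# Illustrative data: incident references 1234 and 1301 are made up.
domain = Observable("www.baddomain[.]com")
address = Observable("172.16.0.1")
add_sighting(domain, "case-management", "1234", "01 October")
add_sighting(domain, "case-management", "1301", "18 October")
add_sighting(address, "case-management", "1301", "18 October")

hits = search_by_context([domain, address], "reference", "1234")
print([obs.value for obs in hits])
```

Because each sighting is appended rather than overwritten, the observable keeps a history of every incident it was seen in.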

williamsdr commented 6 years ago

Thanks - I'll take a look at FIR and how it integrates.

tomchop commented 6 years ago

Cool. Also have a look at pyeti, a Python lib for interacting with Yeti (still a work in progress, but it should be fairly simple to integrate into any existing codebase).

williamsdr commented 6 years ago

Yes, I'm already using pyeti for some basic testing and queries. I queried a few thousand IPs against 1M entries in YETI through the pyeti bindings. It was a great help for simplification.

ag100 commented 6 years ago

Hey Tom - Are there any disadvantages to doing this association via campaigns and binding via tags? The thought being that an incident management system would create a campaign in Yeti matching the incident ID, binding any observables found to the campaign via a tag. As the analyst works and updates the case, the high level view in Yeti would show associated entities and allow for easy pivoting, etc... Thanks!

tomchop commented 6 years ago

@ag100 yes, you can do that as well. We designed the Campaign entity with spam runs in mind, but it is another way to do it, especially if your team has a ~1:1 mapping from incidents to campaigns (this might not be the case, e.g. with campaigns that are public but haven't impacted an org, or an incident that requires threat intelligence management but has no obvious campaigns associated with it).

We could also create an Incident entity to allow tracking campaigns that have generated an incident as well as those that haven't but are public knowledge. Would this be useful, or overkill?

ag100 commented 6 years ago

Thanks Tom - Personally, I think an Incident or Event entity to track those would be very helpful, and allow us to track those items better. That said, I understand it's probably not a priority, and that others may feel differently. :) Thanks again!

axpatito commented 6 years ago

Hi guys, jumping in here. I have the same requirement as @ag100. I've already developed a branch that creates an incident entity; I just haven't figured out yet what the best way to create relationships between observables would be.

I will be working on this all week; if you guys have any suggestions on how to manage historically found data, it would be amazing. My top question to solve is: how can I correlate an incident with another incident based only on the observables? E.g. IP address 172.16.0.1 was seen and investigated in incident T-0001; a couple of months later we receive a new incident, T-0051, that contains 172.16.0.1 as an observable. I would like to automagically show the correlation of incidents when creating them via the API.

Any feedback would be greatly appreciated.

ag100 commented 6 years ago

Hey @axpatito - How are you looking to show the correlation? We're currently doing that correlation via tags, but we're still figuring things out. For example, if your incident management system created 172.16.0.1, you would have it tagged with t-0001. When it went to re-add the observable when the new incident was created, it would add the t-0051 tag and the second set of context. When you then view the observable, you would see all of the incidents it's been related to by the tags and/or context. Is that along the lines of what you're thinking of, or did I misunderstand?
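A rough sketch of that tag-based flow, assuming a plain dictionary as the store: `add_observable` is a hypothetical helper, not a Yeti or pyeti call, and the incident IDs are the ones from the example earlier in the thread.

```python
def add_observable(store, value, incident_id, context):
    """Create the observable if it is new; otherwise merge the tag and append the context."""
    record = store.setdefault(value, {"tags": set(), "context": []})
    record["tags"].add(incident_id.lower())  # tag named after the incident ID
    record["context"].append(context)
    return record


store = {}
add_observable(store, "172.16.0.1", "T-0001", {"reference": "T-0001"})
# Months later, the same address shows up in a new incident.
add_observable(store, "172.16.0.1", "T-0051", {"reference": "T-0051"})

tags = sorted(store["172.16.0.1"]["tags"])
print(tags)
```

Re-adding the observable never replaces the earlier tag or context, so viewing `172.16.0.1` shows every incident it has been part of.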

axpatito commented 6 years ago

Yes, that sounds about right with the current implementation.

I'm still trying to figure out the best way to achieve this. My personal feeling is that tags are too broad, and I'm looking to automagically find correlations between entities based on observables. I don't feel that I can achieve this with tags alone.

williamsdr commented 6 years ago

@axpatito I like the concept - it parallels the Campaign with a slightly different usage (same concept). Can you clarify your challenge with relationships? If it's as @ag100 described, tags seem to be working for that - an observable shows related entities over on the right-hand side of the view. Are you looking for those relationships to appear on the "Related Observables" tab? I see the "Related Observables" being used differently: a file imported will show alternate hashes (MD5, SHA1, SHA256) for the same object.

axpatito commented 6 years ago

Maybe I'm missing some experience with the tool, but the cases I've tried, IMO, are not very friendly to correlate to other entities.

Still, I have a requirement to create the incident entity and use those entities as starting point for analysis.

Maybe I'm misusing the intended way to correlate past events with new ones?

tomchop commented 6 years ago

@axpatito, my understanding is that you have 127.0.0.1 linked to IncidentA, and when you file IncidentB and add 127.0.0.1 to it you want to show that there is a link between both without leaving the Entity (in this case, Campaign?) page, or when querying info on IncidentA via the API.

I agree that tags are a way to show the link, but it seems a bit artificial to me (we shouldn't need to use tags if the link already exists de facto in the database).

The good news is that this should be fairly easy to implement, since the whole database is a huge graph. The hard questions are:

  1. How do you want this link to be displayed? In, or next to, the "Related {Malware,ExploitKits,etc}" tabs?
  2. If so, how do we distinguish between a "hard" link (i.e. when the entity or observable is one step away, a direct neighbor) and a "soft" link (two or more steps away, an indirect neighbor)?
  3. What "distance" should we stop at? Should it be distance-based only, or should we include types too (this will complicate the linking, but I guess it can work)? I think pivoting e.g. on an IP address would be too noisy.
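To make the "hard" vs "soft" distinction concrete, here is a small sketch using a breadth-first search over a toy undirected graph. It does not reflect how Yeti stores or queries links; the graph contents, `neighbors_by_distance` and `classify` are all assumptions made for illustration.

```python
from collections import deque


def neighbors_by_distance(graph, start, max_distance):
    """Breadth-first search: return {node: distance} for nodes within max_distance of start."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_distance:
            continue  # do not expand past the cut-off distance
        for neighbor in graph.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    del distances[start]
    return distances


def classify(distances):
    """Split nodes into hard links (distance 1) and soft links (distance 2 or more)."""
    hard = sorted(n for n, d in distances.items() if d == 1)
    soft = sorted(n for n, d in distances.items() if d >= 2)
    return hard, soft


# Toy graph: every link is listed in both directions.
graph = {
    "IncidentA": {"127.0.0.1"},
    "IncidentB": {"127.0.0.1", "c2.example"},
    "127.0.0.1": {"IncidentA", "IncidentB"},
    "c2.example": {"IncidentB", "MalwareX"},
    "MalwareX": {"c2.example"},
}

hard, soft = classify(neighbors_by_distance(graph, "IncidentA", max_distance=3))
print("hard:", hard)
print("soft:", soft)
```

With `max_distance=3`, `MalwareX` (four steps from `IncidentA`) is left out, which is the kind of cut-off question 3 is asking about. Filtering by node type as well would mean skipping some neighbors inside the loop.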

I have ideas for all of them but I'd love to hear your suggestions. @gaelmuller might also want to pitch in on this.

axpatito commented 6 years ago

@tomchop I think you made excellent points. I don't have hard answers at the moment, I do have however, a few ideas of where I would like to go.

  1. I would like to show the correlation of entities when adding an observable. Following the 127.0.0.1 example: let's assume that I saw it and created an incident with ID T0001. Later I get another incident that happens to have 127.0.0.1 as an observable; I would like to show correlated incidents by observables. Then it should be relatively easy to understand the connection between incidents. I believe this is the point of this issue's question.

  2. I believe that there is too much noise regardless of the type, even more so when hashes are involved, so IMO it would have to be only for hard connections ATM. Maybe later we can develop some sort of machine learning to cluster data and try to autocorrelate campaigns from related observables, but I see this as a future feature.
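As a sketch of point 1 restricted to hard connections, the snippet below keeps an index from each observable to the incidents containing it, and reports earlier incidents that share an observable whenever a new incident is added. `IncidentIndex` is a hypothetical in-memory class, not something in Yeti today, and the second incident ID (`T0002`) and the `.example` domains are made up.

```python
from collections import defaultdict


class IncidentIndex:
    """Hypothetical index mapping each observable to the incidents that contain it."""

    def __init__(self):
        self._by_observable = defaultdict(set)

    def add_incident(self, incident_id, observables):
        """Register an incident; return {earlier_incident: shared observables}."""
        related = defaultdict(set)
        for value in observables:
            for previous in self._by_observable[value]:
                if previous != incident_id:
                    related[previous].add(value)
            self._by_observable[value].add(incident_id)
        return dict(related)


index = IncidentIndex()
first = index.add_incident("T0001", ["127.0.0.1", "a.example"])
second = index.add_incident("T0002", ["127.0.0.1", "b.example"])
print(first)
print(second)
```

Only identical observables produce a match here, so anything more than one observable away is left for an analyst to connect.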

I think all these points are worth talking about in order to create a roadmap for the product. What do you think?

williamsdr commented 6 years ago

@tomchop @axpatito Applying tags seems to work as detailed by @ag100. I agree it seems a bit "extra" - however, it also worked without any modification. This may become unwieldy as data sets expand and you start to see the observable.

RE: @tomchop 's questions:

1 - How : Yes, I thought "Related entities" would be the right mapping.

2 - #3, Establish relationships between incidents(Campaigns?) for direct / identical observables automatically.

3 - stop at distance between incidents=1.

As @axpatito comments- #2 - Yes, humans should make the association where entity distance > 1 ; ML as an analytic in the future.

tomchop commented 6 years ago

@axpatito

  1. Gotcha
  2. What do you mean by "too much noise especially with hashes"? If a hash is associated with 2+ incidents, isn't it worth highlighting it?

tomchop commented 6 years ago

@williamsdr : so you mean that we should only draw links between Incidents / Campaigns using only observables?

What about a) linking other types of entities using observables or indicators (e.g. two Malware entries that share C2 domains, or have hits on the same Yara rule), b) linking two Incidents by using something else than observables (e.g. Malware / Actor entries, etc).

williamsdr commented 6 years ago

@tomchop Yes, that model is also in my mind. It is very much the graph model you describe. Establishing those relationships is desired. I'm jumping back into these sorts of designs/coding after years away - so I needed to start with the basics.

In the example you give, I'm not sure I would link the INCIDENTS based on secondary artifacts (let a human make that decision). But two incidents with the same domain are related. One domain within two malware samples would make them related - but not necessarily establish a direct relationship between the two associated incidents. This is a likely conclusion, but for today it's one I'd want a person making, with the computer helping to identify candidates.

The "link" collection appears to be part of the structure, except perhaps it lacks the label (and/or direction) to indicate the type/direction of the relationship. (A calls to B), (B contains C)...
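A sketch of what a labeled, directed link could look like, using the two examples above. `Link`, `outgoing` and `incoming` are hypothetical and say nothing about how Yeti's link collection is actually structured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """Hypothetical directed edge: src --label--> dst."""
    src: str
    dst: str
    label: str


def outgoing(links, node):
    """Relationships where node is the source, as (label, target) pairs."""
    return [(link.label, link.dst) for link in links if link.src == node]


def incoming(links, node):
    """Relationships where node is the target, as (label, source) pairs."""
    return [(link.label, link.src) for link in links if link.dst == node]


links = [
    Link("A", "B", "calls to"),
    Link("B", "C", "contains"),
]

print(outgoing(links, "B"))
print(incoming(links, "B"))
```

Storing the direction and label on the edge itself lets the same pair of nodes carry different relationships depending on which way the link is read.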

The 1:1 mapping seems the right starting point for establishing the relationships. I believe it's done today via the tag as a proxy.

I'll work to consolidate some thoughts into a single post overnight. I think many of the bits are in place.

axpatito commented 6 years ago

@williamsdr I would like to +1 this. I've been working on my own to achieve these relationships, with some challenges that I would like someone from the dev team to guide me on, in order to keep in line with the actual development.

My ideal scenario would be to create an incident containing both observables and other entities via the API. Once uploaded, I would love to get a dashboard of new incidents and a screen that shows both the incident and the related incidents. Does this make sense for the project?

tomchop commented 6 years ago

I created #169 with a more appropriate title to track changes in this. I'll close this out in the meantime but feel free to pitch in again or reopen. Cheers!