MISP / MISP

MISP (core software) - Open Source Threat Intelligence and Sharing Platform
https://www.misp-project.org/
GNU Affero General Public License v3.0

More meta information from the event on attributes/restSearch #4935

Closed Aisik00 closed 5 years ago

Aisik00 commented 5 years ago

Hello,

I would like to request a small feature in the PyMISP search method.

First I will try to describe the problem. We currently run our own IP reputation database, which should now get part of its IP address information from our MISP instance. One module listens for updates from our MISP instance via ZMQ; when a new IP address appears, it processes the address and inserts it into the database. Because this module only listens for updates, there is a second script for the data already stored in MISP, which goes through all IP addresses in our instance and processes them all at once. After that, only the first module keeps listening for updates.

The second script downloads all IP addresses it can find in our MISP instance (ip-src, ip-dst, domain|ip, ip-src|port, ip-dst|port) and then tries to store the desired information in a structured way, which requires some basic event information. Each IP address is therefore stored together with information about the event in which it appeared. The information it needs is:

The script asks our MISP instance for all IP addresses via: ip_src = misp_inst.search(controller="attributes", type_attribute="ip-src")

This gives me all "ip-src" attributes, but not all the information that needs to be stored, so for every IP address (currently around 250 000 in our instance) I have to query the MISP instance for the event details of that attribute. Because of these per-address queries, the script currently runs for many hours, even days, before it completes the task.
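The access pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the sample attributes are made up, and `fetch_event_details` is a hypothetical stub standing in for one round trip to the MISP instance, not a PyMISP function. It shows that the number of event lookups grows with the number of attributes, even when several attributes belong to the same event.

```python
from typing import Dict, List

# Hypothetical sample data: three attributes spread over two events.
attributes: List[Dict[str, str]] = [
    {"value": "198.51.100.1", "event_id": "10"},
    {"value": "198.51.100.2", "event_id": "10"},
    {"value": "203.0.113.7", "event_id": "11"},
]

# Records every simulated round trip so the cost can be inspected afterwards.
lookups: List[str] = []


def fetch_event_details(event_id: str) -> Dict[str, str]:
    """Hypothetical stub standing in for one query to the MISP instance."""
    lookups.append(event_id)
    return {"id": event_id}


# One event lookup per attribute, as in the script described above.
records = [
    {"ip": attr["value"], "event": fetch_event_details(attr["event_id"])}
    for attr in attributes
]

print(f"{len(attributes)} attributes -> {len(lookups)} event lookups")
```

With around 250 000 attributes, the same loop means around 250 000 extra requests, which is where the run time goes.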

The search method I mentioned above already returns some basic event info for each attribute in the 'Event' key, such as org_id, distribution, event_id, event_info, orgc_id and event_uuid, but this is not enough for our IP address structure. It would be really nice if the search method returned all metadata about the event (or at least the info mentioned in the structure above, like the list of tags, date, ...), which could then be used directly and would avoid 250 000 additional queries for more detailed info. This would cut the run time of the script from days to minutes.

Could you please consider a small update to the PyMISP search method? It would really help with my problem.

Thank you for any help with this issue, Pavel

Rafiot commented 5 years ago

When you run your query, look at the content of the response. Every attribute in the list has a key "Event" that contains the metadata you're looking for.

Let me know if I'm mistaken.

Aisik00 commented 5 years ago

"Event" key contains only "org_id", "distribution", "id", "info", "orgc_id" and "uuid".

I need more than that, exactly:

I also checked version of PyMISP package and should be present.
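To make the gap concrete, here is a minimal sketch that checks which wanted fields are absent from an attribute's "Event" key. The sample attribute and all of its values are made up; only the six key names inside "Event" come from the comment above. The names "date" and "Tag" are assumptions chosen to stand for the date and tag list mentioned in the original request, and `missing_event_fields` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Hypothetical attribute as an attribute search might return it (values made up).
attribute: Dict[str, Any] = {
    "type": "ip-src",
    "value": "198.51.100.1",
    "event_id": "10",
    "Event": {
        "org_id": "1",
        "distribution": "0",
        "id": "10",
        "info": "Example event",
        "orgc_id": "1",
        "uuid": "00000000-0000-0000-0000-000000000000",
    },
}


def missing_event_fields(attr: Dict[str, Any], wanted: List[str]) -> List[str]:
    """Return the wanted event fields that the attribute's 'Event' key lacks."""
    event = attr.get("Event", {})
    return [field for field in wanted if field not in event]


# "date" and "Tag" are illustrative field names, not confirmed by this thread.
print(missing_event_fields(attribute, ["info", "date", "Tag"]))
```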

Rafiot commented 5 years ago

Alright, changing it to a feature request.

github-germ commented 5 years ago

@Aisik00 -- wondering if you also have an interest in the results including the RelatedEvents, i.e. correlations? See https://github.com/MISP/PyMISP/issues/415

Rafiot commented 5 years ago

Yep, let's keep that as a feature request here.

Aisik00 commented 5 years ago

@Rafiot -- Thank you for accepting the request. @github-germ -- I am not really interested in 'RelatedEvents' now, but in the future may be.

Rafiot commented 5 years ago

No promises on when it will be implemented tho.

We will need a way to filter out the content of the response, because the metadata from the event could end up being huge (especially the RelatedEvents) and most won't want everything all the time.

github-germ commented 5 years ago

> @Rafiot -- Thank you for accepting the request. @github-germ -- I am not really interested in 'RelatedEvents' now, but in the future may be.

I second the motion of appreciation!!

Our use case is to perform specific search queries via PyMISP and deliver rich JSON to downstream internal systems to match with additional threat intel from other platforms, perform algorithmic analysis resulting in protect and detect actions. Having correlated objects is essential for the algorithms to become effective.

Aisik00 commented 5 years ago

> > @Rafiot -- Thank you for accepting the request. @github-germ -- I am not really interested in 'RelatedEvents' now, but in the future may be.

> I second the motion of appreciation!!

> Our use case is to perform specific search queries via PyMISP and deliver rich JSON to downstream internal systems to match with additional threat intel from other platforms, perform algorithmic analysis resulting in protect and detect actions. Having correlated objects is essential for the algorithms to become effective.

I agree that 'RelatedEvents' are important, but it really depends on the use case. In our IP reputation database we use this event metadata as just a basic overview of all the events in which the IP address appeared directly. If someone wants to see RelatedEvents, they are given a direct link to the event in the MISP instance, where they can view all RelatedEvents. MISP implements this very nicely, so in our use case there is no need to store all RelatedEvents in the database if they do not directly contain any IP address.

But thanks for your proposal. :)

github-germ commented 5 years ago

> We will need a way to filter out the content of the response, because the metadata from the event could end up being huge (especially the RelatedEvents) and most won't want everything all the time.

Data size is not an issue for us: accurate, rich, correlated data is the prime goal. Could we make including a RelatedEvent (and perhaps others) selectable via something like:

search(..., includeFilter={'RelatedEvent': True, ...}, ...)

I do see RelatedEvent included in the cache_json output. Is exposing that to PyMISP a lot of work?

Rafiot commented 5 years ago

Well, it's not an issue for you, but it definitely is for some users. So it cannot be the default, and we need a clean switch that supports your use case and other people's use cases. Implementing that part on the MISP side is a lot of work, yes. As usual, PRs or funding proposals are welcome if you want to speed up the implementation of that specific functionality.

Which cache_json are you talking about?

github-germ commented 5 years ago

> As usual, PR or funding proposals are welcome if you want to speed up the implementation of that specific functionality.

I hear you. Believe me I wish I could do the PR, but I already work mega hours on my project. As to funding, this company has not currently agreed to my suggestions on that front... :-(

> Which cache_json are you talking about?

I was looking at /var/www/MISP/app/tmp/cached_exports/json/*.json

Rafiot commented 5 years ago

Oh, yeah, that is the same code path as when you get an event. It is totally different from the search query.

dtclayton commented 5 years ago

I have run into this exact same issue. It would be great if the data included the threat level assigned to each event that an attribute belongs to.

iglocska commented 5 years ago

Should be fixed now. All major event fields can now be included along with event tags and the basic orgc info, by passing "includeContext": 1 in the request.
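As a rough sketch of what such a request could look like: the snippet below builds, but does not send, a POST to attributes/restSearch with "includeContext": 1 in the JSON body. Only the includeContext flag comes from the comment above. The base URL and API key are placeholders, `build_restsearch_request` is a hypothetical helper, and the "returnFormat" and "type" filters and the header layout are assumptions about a typical MISP REST call.

```python
import json
import urllib.request


def build_restsearch_request(
    base_url: str, api_key: str, attribute_type: str
) -> urllib.request.Request:
    """Build (without sending) a POST request for attributes/restSearch."""
    payload = {
        "returnFormat": "json",   # assumed filter, not taken from this thread
        "type": attribute_type,   # assumed filter, e.g. "ip-src"
        "includeContext": 1,      # flag named in the comment above
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/attributes/restSearch",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": api_key,  # placeholder key
            "Accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder URL and key; nothing is sent over the network here.
req = build_restsearch_request("https://misp.example.org", "YOUR_API_KEY", "ip-src")
print(req.full_url)
print(req.data.decode("utf-8"))
# Passing req to urllib.request.urlopen() would actually send it.
```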

github-germ commented 5 years ago

@iglocska -- cool!! Does this include RelatedEvent for correlations? Thank you...

iglocska commented 5 years ago

No, it includes the ones asked for in this issue.

Aisik00 commented 5 years ago

Great, I will test it ASAP. Thanks a lot, I will let you know, if it works as expected. :)

github-germ commented 5 years ago

As per @Rafiot in https://github.com/MISP/MISP/issues/4935#issuecomment-515075219 above, we had merged a separate feature request for correlations to be included into this one. Can we re-open https://github.com/MISP/PyMISP/issues/415 ?

Aisik00 commented 5 years ago

I have a newbie question: where can I obtain that updated version? :)

packet-rat commented 5 years ago

Seconding @github-germ's request for the ability to return 'RelatedEvents'. MISP correlation is one of MISP's high-value propositions, and the ability to leverage it via PyMISP is integral to automating operationalization for "us".

Second motion to re-open Feature Request MISP/PyMISP#415 https://github.com/MISP/PyMISP/issues/415

Rafiot commented 5 years ago

Done.

Aisik00 commented 5 years ago

I tested the updated version; everything listed above is included except one small detail: the sightings of the attribute are missing. Would it be possible to add sightings of the attribute as well?

iglocska commented 5 years ago

Implemented for both. Usage: