cuckoosandbox / cuckoo

Cuckoo Sandbox is an automated dynamic malware analysis system
http://www.cuckoosandbox.org

Rework our Search capabilities #1327

Open jbremer opened 7 years ago

jbremer commented 7 years ago

I.e., search functionality as part of the Cuckoo Core MongoDB abstraction layer, including unit tests for all possible scenarios and use-cases, e.g.:

In fact, Cuckoo also supports ElasticSearch which is specifically aimed at searching capabilities. To really make this feature awesome we could start on a separate Cuckoo Core class (e.g., search.py) that bundles the search capabilities from ElasticSearch and MongoDB. Note that ElasticSearch is disabled by default but if enabled, its results take precedence over MongoDB's as, again, the main focus of ElasticSearch is searching. This will also require an abstraction layer around ElasticSearch.
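A minimal sketch of how such a bundling class might behave, assuming "take precedence" means that ElasticSearch hits are listed first and MongoDB only contributes analyses that ElasticSearch did not return. The names `StubBackend`, `Search`, `find`, and the `enabled` flag are hypothetical stand-ins for illustration, not the existing `mongo.py` / `elastic.py` interfaces.

```python
class StubBackend(object):
    """Hypothetical stand-in for a MongoDB or ElasticSearch abstraction layer."""

    def __init__(self, enabled, results):
        self.enabled = enabled
        # Maps a search term to a list of task IDs (made-up demo data).
        self._results = results

    def find(self, term):
        return list(self._results.get(term, []))


class Search(object):
    """Hypothetical class bundling both backends behind one call."""

    def __init__(self, mongo, elastic):
        self.mongo = mongo
        self.elastic = elastic

    def find(self, term):
        ordered = []
        # Query ElasticSearch first so that its hits take precedence;
        # a disabled backend is skipped entirely.
        for backend in (self.elastic, self.mongo):
            if not backend.enabled:
                continue
            for task_id in backend.find(term):
                if task_id not in ordered:
                    ordered.append(task_id)
        return ordered


mongo = StubBackend(True, {"abc": [1, 2]})
elastic_on = StubBackend(True, {"abc": [3, 1]})
elastic_off = StubBackend(False, {"abc": [3, 1]})

both = Search(mongo, elastic_on).find("abc")
mongo_only = Search(mongo, elastic_off).find("abc")
```

With ElasticSearch disabled (the default), the same call falls back to MongoDB alone, so the frontend would not need to know which backend answered.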

Naturally there are some edge cases, e.g., if a hash is searched we can find all available analyses that match that sample using MongoDB (and that's probably the easiest), but using ElasticSearch we can also search the hashes of dropped files etc.
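To make the hash edge case concrete, here is a rough sketch that guesses the hash type from its length and picks the report fields to search. The field names `target.file.<type>` and `dropped.<type>` are assumptions about the report layout, and `hash_type` / `hash_fields` are hypothetical helpers, not existing Cuckoo functions.

```python
import re

# Hex digest length -> hash type.
HASH_LENGTHS = {32: "md5", 40: "sha1", 64: "sha256", 128: "sha512"}


def hash_type(value):
    """Return the hash type for a hex digest, or None if it isn't one."""
    if not re.match(r"^[0-9a-fA-F]+$", value):
        return None
    return HASH_LENGTHS.get(len(value))


def hash_fields(value, include_dropped=False):
    """Return the (assumed) report fields to search for this hash."""
    kind = hash_type(value)
    if kind is None:
        return []
    # The submitted sample itself; this is the simple MongoDB case.
    fields = ["target.file.%s" % kind]
    if include_dropped:
        # Dropped files as well; the case ElasticSearch would cover.
        fields.append("dropped.%s" % kind)
    return fields


md5_fields = hash_fields("d41d8cd98f00b204e9800998ecf8427e")
sha256_fields = hash_fields("a" * 64, include_dropped=True)
```

A backend abstraction could then turn the returned field list into its own query format, keeping the hash detection in one place.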

There's most certainly weeks of work in the backend layer for this issue, but once the backend is fully finished off, using its functionality in the frontend should be little to no effort. Hence I'd like to really stress the usual principles but here they're even more important:

Finally I'd like to mention that I pretty much completely broke search capabilities in, I think, either 2.0-rc1 or 2.0-rc2. I moved all search functionality from MongoDB to ElasticSearch, which obviously is pointless when nobody enables ElasticSearch, leaving search completely broken. However, this does mean that in either Cuckoo 1.2 or 2.0-rc1 we may find various MongoDB-related search capabilities (e.g., search on file hash) that we can easily reuse / port to the new Cuckoo Core search class(es). In the latest Cuckoo versions you may find the ElasticSearch-related search queries that you can port over.

The base classes for Mongo and Elastic exist nowadays and may be found at the following locations, https://github.com/cuckoosandbox/cuckoo/blob/package/cuckoo/common/mongo.py & https://github.com/cuckoosandbox/cuckoo/blob/package/cuckoo/common/elastic.py.

SparkyNZL commented 7 years ago

@jbremer, I just thought that you had decided to go to ES for your search engine :) To be honest it works well, and again, most of the other integration packages like Moloch use it :)