elastic / detection-rules

https://www.elastic.co/guide/en/security/current/detection-engine-overview.html

[Hunting] Re-factor Hunting Library Code #4085

Closed terrancedejesus closed 5 days ago

terrancedejesus commented 3 weeks ago

Pull Request

Issue link(s):

Summary - What I changed

Re-factors threat hunting library code for modularity and expanded functionality for hunters. Below is an incomplete list of changes:

## Additional Information

### Fixing Okta folder structure

In https://github.com/elastic/detection-rules/pull/4064, the folder structure was merged in as `okta/docs/docs` and `okta/docs/queries`. As a result, when we created the index files, `docs` became the header, and searching broke as well. To fix this, we adjusted the folder structure to `okta/docs` and `okta/queries`, as expected.

### Adding `index.yml`

Prior to this PR, we had `index.md`, a markdown file that stored an index of all of our hunts. This allowed visitors to see a centralized list of available hunting queries grouped by data source. However, for programmatic use, it works better to also have a data-based index that can be easily referenced and loaded, hence the introduction of `index.yml`. Each entry in this index holds the following:

- UUID
- Hunting Analytic Name
- Location (File Path)
- MITRE (Technique & Sub-Technique IDs)

This YAML file is then used for refreshing the index and the markdown file and for identifying file locations. It also makes it easier to search for specific queries based on MITRE information, as the new `search` command does. It will be the source of truth for the hunting queries library.

We have also added a `refresh-index` command, which updates the YAML index from all hunting queries and then updates the markdown.

### Add CLI to Hunting Module

Prior to this PR, `hunting/` was a top-level module. An additional `__main__.py` and `click` module allows us to call the module from the top level of the repo: `python -m hunting`. In that respect it is similar to `rta` and stands as a new feature of the detection rules repository. However, this still allows us to leverage modules or libraries from within `detection_rules`, as we do with `attack.py`, to avoid redundancy. Commands can be added to `__main__.py` if we want to extend features and capabilities.
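As a rough illustration of how a `__main__.py` makes `python -m hunting <command>` work, here is a minimal sketch. It is not the code from this PR: it uses the standard library's `argparse` instead of `click` so that it runs without dependencies, and the function names, options shown, and return strings are assumptions made for the example.

```python
"""Hypothetical stand-in for hunting/__main__.py (argparse here, click in the PR)."""
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="python -m hunting")
    sub = parser.add_subparsers(dest="command", required=True)

    # Subcommand names mirror those in the PR description; behavior is illustrative.
    sub.add_parser("refresh-index", help="rebuild index.yml and index.md")

    search = sub.add_parser("search", help="search hunts by MITRE ID")
    search.add_argument("--technique")
    search.add_argument("--data-source")
    return parser


def main(argv=None) -> str:
    """Parse the command line and dispatch to the chosen subcommand."""
    args = build_parser().parse_args(argv)
    if args.command == "refresh-index":
        return "refreshing index"
    return f"searching technique={args.technique} data_source={args.data_source}"

# A real __main__.py would finish by calling main(), so that
# `python -m hunting search --technique T1078` reaches the dispatcher.
```

Adding a new command in this layout means registering one more subparser and one more branch in `main`, which is the same extension point the description attributes to `__main__.py`.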
There are other OOP changes, such as adding `utils.py` and `definitions.py`, to replicate the structure of the `detection_rules` module while keeping it simple.

### Adjust Markdown Generation

Prior to this PR, we could only generate markdown from TOML files for ALL queries at once, which required loading every TOML hunting file and overwriting every markdown file unnecessarily. Instead, we have added a `generate-markdown` command, which calls `process_toml_files` with parameters for a single TOML file path, a data source string such as `aws`, or simply all docs.

### Added a MITRE `search` command

While the repository is meant to be a simple representation of all our hunting queries, allowing visitors to copy and paste from markdown or programmatically load TOML files, searching for the right hunt may become daunting as we build out more queries and docs. A simple and effective approach is to let visitors search for specific hunting queries based on the following:

- Tactic ID - Returns all hunting queries whose techniques or sub-techniques fall within that tactic
- Technique ID - Returns all hunting queries whose technique IDs match or whose sub-techniques fall under that technique
- Subtechnique ID - Returns all hunting queries whose sub-technique IDs match
- Data Source - An additional filter for data source, such as `aws` or `okta`

This proves useful if you want to search, for example, for all TA0001 (Initial Access) hunts related to Okta.

### Run Query

To run a query, you must add a `.detection-rules-cfg.yaml` config with your Cloud ID and API key for authentication.
`python -m hunting run-query --file-path /Users/tdejesus/code/src/detection-rules/hunting/linux/queries/low_volume_external_network_connections_from_process.toml`

`python -m hunting run-query --uuid ef579900-75ef-11ef-b47f-f661ea17fbcc`

`python -m hunting run-query --uuid 6e57e6a6-f150-405d-b8be-e4e666a3a86d`

### View Hunt

`python -m hunting view-hunt --uuid 12526f14-5e35-4f5f-884c-96c6a353a544 --format json`

`python -m hunting view-hunt --uuid 12526f14-5e35-4f5f-884c-96c6a353a544 --query-only`
## How To Test

### Run Unit-Tests

`python -m unittest tests/test_hunt_data.py`

### Search for a specific tactic, technique or sub-technique ID in MITRE

`python -m hunting search --subtechnique T1078.004 --data-source aws`

`python -m hunting search --tactic TA0001 --data-source okta`

`python -m hunting search --technique T1078`

### Refresh the index (feel free to remove an index entry and see if it populates again)

`python -m hunting refresh-index`

### Generate markdown

`python -m hunting generate-markdown /Users/tdejesus/code/src/detection-rules/hunting/okta/queries/initial_access_higher_than_average_failed_authentication.toml`

`python -m hunting generate-markdown okta`

`python -m hunting generate-markdown`

### Run Query

To run a query, you must add a `.detection-rules-cfg.yaml` config with your Cloud ID and API key for authentication.

`python -m hunting run-query --file-path /Users/tdejesus/code/src/detection-rules/hunting/linux/queries/low_volume_external_network_connections_from_process.toml`

`python -m hunting run-query --uuid ef579900-75ef-11ef-b47f-f661ea17fbcc`

`python -m hunting run-query --uuid 6e57e6a6-f150-405d-b8be-e4e666a3a86d`

### View Hunt

`python -m hunting view-hunt --uuid 12526f14-5e35-4f5f-884c-96c6a353a544 --format json`

`python -m hunting view-hunt --uuid 12526f14-5e35-4f5f-884c-96c6a353a544 --query-only`
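For reviewers who want to reason about what the `search` commands above should return, the matching rules from the description can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the `IndexEntry` class, the `data_source` field, the sample entries, and the small `TECHNIQUE_TACTICS` table are all hypothetical (the PR draws real ATT&CK data from `attack.py`).

```python
from dataclasses import dataclass, field

# Hypothetical excerpt of a technique -> tactics mapping, for illustration only.
TECHNIQUE_TACTICS = {
    "T1078": {"TA0001", "TA0003", "TA0004", "TA0005"},  # Valid Accounts
    "T1110": {"TA0006"},  # Brute Force
}


@dataclass
class IndexEntry:
    """One hunt in the index: UUID, name, file location, and MITRE IDs."""
    uuid: str
    name: str
    path: str
    data_source: str  # assumed field; could also be derived from the path
    mitre: list = field(default_factory=list)


def matches(mitre_id, *, tactic=None, technique=None, subtechnique=None):
    """Check one technique or sub-technique ID against the requested filters."""
    parent = mitre_id.split(".")[0]  # "T1078.004" -> "T1078"
    if tactic is not None and tactic not in TECHNIQUE_TACTICS.get(parent, set()):
        return False
    if technique is not None and parent != technique:
        return False
    if subtechnique is not None and mitre_id != subtechnique:
        return False
    return True


def search(index, *, data_source=None, **mitre_filter):
    """Return entries with at least one MITRE ID matching every given filter."""
    return [
        entry for entry in index
        if (data_source is None or entry.data_source == data_source)
        and any(matches(m, **mitre_filter) for m in entry.mitre)
    ]


# Placeholder entries; UUIDs, names, and paths are made up.
INDEX = [
    IndexEntry("uuid-1", "Example Okta hunt", "okta/queries/example_okta.toml", "okta", ["T1078.004"]),
    IndexEntry("uuid-2", "Example AWS hunt", "aws/queries/example_aws.toml", "aws", ["T1078.004", "T1110"]),
    IndexEntry("uuid-3", "Example Linux hunt", "linux/queries/example_linux.toml", "linux", ["T1110.001"]),
]
```

With these placeholder entries, `search(INDEX, tactic="TA0001", data_source="okta")` returns only the Okta hunt, while `search(INDEX, technique="T1078")` returns both the Okta and AWS hunts because a sub-technique counts as a match for its parent technique.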

Checklist

Contributor checklist

protectionsmachine commented 2 weeks ago

Enhancement - Guidelines

These guidelines serve as a reminder of the considerations to address when adding a new schema feature to the code.

Documentation and Context

Code Standards and Practices

Testing

Additional Schema Related Checks

terrancedejesus commented 2 weeks ago

Putting this back into Draft, going to work on the following to finalize:

imays11 commented 2 weeks ago

Great tool @terrancedejesus, awesome way to make the hunts more easily accessible for users! Something I noticed while using the hunting tool is that the file paths provided by the search command assume the working directory is the hunting directory. However, if you're using the tool from the root directory, detection_rules, then these links aren't as convenient to copy-paste > cat file. Maybe changing the links to be the full path starting at the root directory? This way even if someone is operating in the hunting folder, or another sub-folder, it's easier to copy the portion of the path they need, rather than someone having to cd to the hunting directory before conveniently using the file path.
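One possible way to produce root-relative paths, sketched with `pathlib`. The function name `display_path` and the example locations are hypothetical and not taken from the PR.

```python
from pathlib import Path


def display_path(hunt_file: Path, repo_root: Path) -> str:
    """Return the hunt file's path relative to the repository root."""
    # Resolve both sides so relative inputs and symlinked roots compare cleanly.
    return hunt_file.resolve().relative_to(repo_root.resolve()).as_posix()


# Hypothetical locations, used only to show the shape of the output.
root = Path("/tmp/detection-rules")
hunt = root / "hunting" / "okta" / "queries" / "example.toml"
print(display_path(hunt, root))
```

A path printed this way can be pasted after `cat` from the repository root, and someone working inside `hunting/` can still copy just the trailing portion they need.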

terrancedejesus commented 1 week ago

@Mikaayenson @shashank-elastic @eric-forte-elastic - I have addressed the feedback recorded as noted below. Additionally, Mika had mentioned making the package more modular, so I have updated it to have a class that can be instantiated, with methods to handle most of the logic that is called from within each command. Thanks again for the feedback!

terrancedejesus commented 1 week ago

Nice addition of summary @Mikaayenson !

terrancedejesus commented 6 days ago

@traut

> I think we should stop using click as a logging solution (we do the same in REACT). It's great for parsing CLI arguments (though I'm using typer in cortado for simplicity), but it lacks the flexibility and power a proper logging library would provide -- my favourite here is structlog.

We have discussed logging in detection rules for a while now, and I agree: proper logging would be great for using the repo and its many supported features, such as RTA, Detection Rules, Hunting, etc. This would be a good future improvement but, IMO, not one to introduce in this PR.

> if we're setting type hints we might as well enforce the correctness with pyright. I see it in REACT as well. Without enforcement, they are just there for decoration :)

Agreed on enforcement, but it is out of scope for this PR, if I understand correctly. It is more of an overall repository addition, since it requires introducing pyright to the repo, adding a config, adjusting CI/CD builds, etc.