terrancedejesus commented 3 weeks ago

Pull Request

Issue link(s):

https://github.com/elastic/ia-trade-team/issues/280

Summary - What I changed

Re-factors threat hunting library code for modularity and expanded functionality for hunters. Below is an incomplete list of changes:

Add an index.yml to capture and store available hunts, UUIDs, etc.
Change hunting to be its own package with a CLI
Update generate-markdown functionality to accept a single file path or a folder (e.g. aws)
Add a refresh-index command to update the index
Add a search command to allow MITRE (+ integration)-based searching for hunters
Add a run-query command to run a query against the Elasticsearch search API and let users know whether matches exist
Add a view-hunt command to show the contents of a hunting file in JSON or TOML format
Update the README

## Additional Information

### Fixing Okta folder structure

From https://github.com/elastic/detection-rules/pull/4064, the folder structure was merged in as `okta/docs/docs` and `okta/docs/queries`. As a result, when we created the index files, `docs` became the header, which also broke searching. To fix this, we adjusted the folder structure to `okta/docs` and `okta/queries` as expected.

### Adding `index.yml`

Prior to this PR, we had `index.md`, a markdown file that stored an index of all of our hunts. This allowed visitors to see a centralized list of available hunting queries grouped by data source. However, for programmatic use it works better to also have a data-based index that can be easily referenced and loaded, hence the introduction of `index.yml`. This index stores the following for each hunt:

- UUID
- Hunting Analytic Name
- Location (File Path)
- MITRE (Technique & Sub-Technique IDs)

This YAML file is then used for refreshing the index and the markdown file, identifying file locations, and making it easier to search for specific queries based on MITRE information, as with the new `search` command. It will be the source of truth for the hunting queries library. We have also added a `refresh-index` command, which updates the YAML index from all hunting queries and then updates the markdown.
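To make the index idea concrete, here is a minimal Python sketch of how such an index could be built and queried. It is an illustration, not the code in this PR: the nesting (data source, then UUID), the field names `name`, `path` and `mitre`, the helper names `build_index` and `find_path`, and the example records are all assumptions.

```python
from collections import defaultdict


def build_index(hunts):
    """Group hunt records by data source, keyed by UUID (assumed layout)."""
    index = defaultdict(dict)
    for hunt in hunts:
        # Assumption: the data source is the first folder in the path,
        # e.g. "okta/queries/example.toml" -> "okta".
        data_source = hunt["path"].split("/")[0]
        index[data_source][hunt["uuid"]] = {
            "name": hunt["name"],
            "path": hunt["path"],
            "mitre": sorted(hunt["mitre"]),
        }
    return dict(index)


def find_path(index, uuid):
    """Return the file path for a hunt UUID, or None if it is not indexed."""
    for entries in index.values():
        if uuid in entries:
            return entries[uuid]["path"]
    return None


# Made-up records purely for illustration.
example_hunts = [
    {
        "uuid": "00000000-0000-0000-0000-000000000001",
        "name": "Example Okta hunt",
        "path": "okta/queries/example_okta_hunt.toml",
        "mitre": ["T1078.004", "T1078"],
    },
    {
        "uuid": "00000000-0000-0000-0000-000000000002",
        "name": "Example AWS hunt",
        "path": "aws/queries/example_aws_hunt.toml",
        "mitre": ["T1110"],
    },
]

index = build_index(example_hunts)
okta_path = find_path(index, "00000000-0000-0000-0000-000000000001")
```

Keying by UUID keeps lookups by `--uuid` cheap, while the outer data source key mirrors the grouping already used in `index.md`.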

### Add CLI to Hunting Module

Prior to this PR, `hunting/` was a top-level module. An additional `__main__.py` and the `click` module allow us to call the module from the top level of the repo: `python -m hunting`. In this respect it is similar to `rta` and stands as a new feature of the detection rules repository. However, this still allows us to leverage modules or libraries from within `detection_rules`, as we do with `attack.py`, to avoid redundancy. Commands can be added to `__main__.py` if we want to extend features and capabilities. There are other OOP changes, such as adding `utils.py` and `definitions.py`, to replicate the structure of the `detection_rules` module while keeping it simple.
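As a rough picture of the command layout, the sketch below wires a few of the subcommands and flags shown in this PR into a parser. The PR itself uses `click`; the standard library `argparse` is swapped in here only so the example has no third-party dependency, and `build_parser` is a hypothetical name. In a real package this logic would live in `__main__.py` so that `python -m hunting` reaches it.

```python
import argparse


def build_parser():
    """Build a parser mirroring some of the hunting subcommands (argparse stand-in for click)."""
    parser = argparse.ArgumentParser(prog="hunting")
    subcommands = parser.add_subparsers(dest="command", required=True)

    search = subcommands.add_parser("search", help="Search hunts by MITRE ATT&CK ID")
    search.add_argument("--tactic")
    search.add_argument("--technique")
    search.add_argument("--subtechnique")
    search.add_argument("--data-source")

    subcommands.add_parser("refresh-index", help="Rebuild the YAML index and markdown")

    view = subcommands.add_parser("view-hunt", help="Show the contents of a hunt")
    view.add_argument("--uuid")
    view.add_argument("--format", choices=["json", "toml"])
    view.add_argument("--query-only", action="store_true")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is easy to try.
args = build_parser().parse_args(["search", "--tactic", "TA0001", "--data-source", "okta"])
```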

### Adjust Markdown Generation

Prior to this PR, we could only generate the markdown from TOML files for ALL queries at once, which required loading every TOML hunting file and overwriting every markdown file unnecessarily. Instead, we have added a `generate-markdown` command, which calls `process_toml_files` with parameters for a single TOML file path, a data source string such as `aws`, or simply all docs.
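The three modes (single file, data source, everything) come down to deciding which TOML files to process. The following is a minimal sketch of that decision, not the actual `process_toml_files` implementation; the helper name `resolve_toml_files` and the `<source>/queries/*.toml` layout are assumptions based on the paths shown in this PR.

```python
import tempfile
from pathlib import Path


def resolve_toml_files(hunting_dir, target=None):
    """Return the TOML files a markdown generation run should process."""
    hunting_dir = Path(hunting_dir)
    if target is None:
        # No argument: every hunt under every data source.
        return sorted(hunting_dir.glob("*/queries/*.toml"))
    candidate = Path(target)
    if candidate.suffix == ".toml":
        # A single TOML file path.
        if not candidate.is_file():
            raise FileNotFoundError(f"No such hunt file: {candidate}")
        return [candidate]
    # Otherwise treat the argument as a data source folder such as "aws".
    source_dir = hunting_dir / target / "queries"
    if not source_dir.is_dir():
        raise ValueError(f"Unknown data source: {target}")
    return sorted(source_dir.glob("*.toml"))


# Try the three modes against a throwaway folder layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ("aws/queries/a.toml", "okta/queries/b.toml"):
        (root / rel).parent.mkdir(parents=True, exist_ok=True)
        (root / rel).write_text("")
    all_files = [p.name for p in resolve_toml_files(root)]
    aws_files = [p.name for p in resolve_toml_files(root, "aws")]
    one_file = [p.name for p in resolve_toml_files(root, str(root / "okta/queries/b.toml"))]
```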

### Added a MITRE `search` command

While the repository is meant to be a simple representation of all our hunting queries, allowing visitors to copy and paste from markdown or programmatically load TOML files, searching for the right hunt may become daunting as we build out more queries and docs. A simple and effective approach is to let visitors search for specific hunting queries based on the following:

- Tactic ID - Returns all hunting queries whose techniques or sub-techniques fall within the bucket of that tactic ID
- Technique ID - Returns all hunting queries whose technique IDs match or whose sub-techniques fall into the bucket of that technique
- Subtechnique ID - Returns all hunting queries whose sub-technique IDs match
- Data Source - Allows an additional filter for data source, such as `aws` or `okta`. This proves useful if you want to search, for example, for all TA0001 (Initial Access) hunts related to Okta.
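The matching rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the PR's implementation: the index layout, the function names `matches` and `search`, the example hunts, and the small hard-coded `TECHNIQUE_TACTICS` table are hypothetical (a real tool would derive the technique-to-tactic mapping from ATT&CK data).

```python
# Small hand-written technique -> tactics table, for illustration only.
TECHNIQUE_TACTICS = {
    "T1078": {"TA0001", "TA0003", "TA0004", "TA0005"},  # Valid Accounts
    "T1110": {"TA0006"},  # Brute Force
}


def matches(mitre_id, hunt_ids):
    """Return True if any of a hunt's ATT&CK IDs falls under mitre_id."""
    for hunt_id in hunt_ids:
        technique = hunt_id.split(".")[0]
        if mitre_id.startswith("TA"):
            # Tactic: match hunts whose (parent) technique belongs to it.
            if mitre_id in TECHNIQUE_TACTICS.get(technique, set()):
                return True
        elif "." in mitre_id:
            # Sub-technique: exact match only.
            if hunt_id == mitre_id:
                return True
        elif technique == mitre_id:
            # Technique: match the technique itself or any of its sub-techniques.
            return True
    return False


def search(index, mitre_id, data_source=None):
    """Return (data source, UUID, name) tuples for hunts matching mitre_id."""
    results = []
    for source, entries in index.items():
        if data_source and source != data_source:
            continue
        for uuid, entry in entries.items():
            if matches(mitre_id, entry["mitre"]):
                results.append((source, uuid, entry["name"]))
    return results


# Made-up index entries purely for illustration.
example_index = {
    "okta": {"uuid-1": {"name": "Example Okta hunt", "mitre": ["T1078.004"]}},
    "aws": {"uuid-2": {"name": "Example AWS hunt", "mitre": ["T1078.004"]}},
    "linux": {"uuid-3": {"name": "Example Linux hunt", "mitre": ["T1110"]}},
}

okta_initial_access = search(example_index, "TA0001", data_source="okta")
```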

### Run Query

To run a query, you must add a `.detection-rules-cfg.yaml` config with your Cloud ID and API key for authentication.

`python -m hunting run-query --file-path /Users/tdejesus/code/src/detection-rules/hunting/linux/queries/low_volume_external_network_connections_from_process.toml`

`python -m hunting run-query --uuid ef579900-75ef-11ef-b47f-f661ea17fbcc`

`python -m hunting run-query --uuid 6e57e6a6-f150-405d-b8be-e4e666a3a86d`

### View Hunt

`python -m hunting view-hunt --uuid 12526f14-5e35-4f5f-884c-96c6a353a544 --format json`

`python -m hunting view-hunt --uuid 12526f14-5e35-4f5f-884c-96c6a353a544 --query-only`

## How To Test

### Run Unit-Tests

`python -m unittest tests/test_hunt_data.py`

### Search for a specific tactic, technique or sub-technique ID in MITRE

`python -m hunting search --subtechnique T1078.004 --data-source aws`

`python -m hunting search --tactic TA0001 --data-source okta`

`python -m hunting search --technique T1078`

### Refresh the index (feel free to remove index entry and see if it populates again)

`python -m hunting refresh-index`

### Generate markdown

`python -m hunting generate-markdown /Users/tdejesus/code/src/detection-rules/hunting/okta/queries/initial_access_higher_than_average_failed_authentication.toml`

`python -m hunting generate-markdown okta`

`python -m hunting generate-markdown`

### Run Query

In order to do this, you must add a `.detection-rules-cfg.yaml` config with your Cloud ID and API key for authentication.

`python -m hunting run-query --file-path /Users/tdejesus/code/src/detection-rules/hunting/linux/queries/low_volume_external_network_connections_from_process.toml`

`python -m hunting run-query --uuid ef579900-75ef-11ef-b47f-f661ea17fbcc`

`python -m hunting run-query --uuid 6e57e6a6-f150-405d-b8be-e4e666a3a86d`

### View Hunt

`python -m hunting view-hunt --uuid 12526f14-5e35-4f5f-884c-96c6a353a544 --format json`

`python -m hunting view-hunt --uuid 12526f14-5e35-4f5f-884c-96c6a353a544 --query-only`

Checklist

[x] Added a label for the type of pr: bug, enhancement, schema, Rule: New, Rule: Deprecation, Rule: Tuning, Hunt: New, or Hunt: Tuning so guidelines can be generated
[ ] Added the meta:rapid-merge label if planning to merge within 24 hours
[x] Secret and sensitive material has been managed correctly
[x] Automated testing was updated or added to match the most common scenarios
[x] Documentation and comments were added for features that require explanation

Contributor checklist

Have you signed the contributor license agreement?
Have you followed the contributor guidelines?

protectionsmachine commented 2 weeks ago

Enhancement - Guidelines

These guidelines serve as a reminder of the considerations to address when adding a new schema feature to the code.

Documentation and Context

[ ] Describe the feature enhancement in detail (alternative solutions, description of the solution, etc.) if not already documented in an issue.
[ ] Include additional context or screenshots.
[ ] Ensure the enhancement includes necessary updates to the documentation and versioning.

Code Standards and Practices

[ ] Code follows established design patterns within the repo and avoids duplication.
[ ] Code changes do not introduce new warnings or errors.
[ ] Variables and functions are well-named and descriptive.
[ ] Any unnecessary / commented-out code is removed.
[ ] Ensure that the code is modular and reusable where applicable.
[ ] Check for proper exception handling and messaging.

Testing

[ ] New unit tests have been added to cover the enhancement.
[ ] Existing unit tests have been updated to reflect the changes.
[ ] Provide evidence of testing and validating the enhancement (e.g., test logs, screenshots).
[ ] Validate that any rules affected by the enhancement are correctly updated.
[ ] Ensure that performance is not negatively impacted by the changes.
[ ] Verify that any release artifacts are properly generated and tested.

Additional Schema Related Checks

[ ] Ensure that the enhancement does not break existing functionality. (e.g., run make test-cli)
[ ] Review the enhancement with a peer or team member for additional insights.
[ ] Verify that the enhancement works across all relevant environments (e.g., different OS versions).
[ ] Confirm that all dependencies are up-to-date and compatible with the changes.
[ ] Link to the relevant Kibana PR or issue provided
[ ] Exported detection rule(s) from Kibana to showcase the feature(s)
[ ] Converted the exported ndjson file(s) to toml in the detection-rules repo
[ ] Re-exported the toml rule(s) to ndjson and re-imported into Kibana
[ ] Updated necessary unit tests to accommodate the feature
[ ] Applied min_compat restrictions to limit the feature to a specified minimum stack version
[ ] Executed all unit tests locally with a test toml rule to confirm passing
[ ] Included Kibana PR implementer as an optional reviewer for insights on the feature
[ ] Implemented requisite downgrade functionality
[ ] Cross-referenced the feature with product documentation for consistency
[ ] Incorporated a comprehensive test rule in unit tests for full schema coverage
[ ] Conducted system testing, including fleet, import, and create APIs (e.g., run make test-remote-cli)

terrancedejesus commented 2 weeks ago

Putting this back into Draft, going to work on the following to finalize:

Add a run command to fling queries at the API
Add a disclaimer MD file

imays11 commented 2 weeks ago

Great tool @terrancedejesus, awesome way to make the hunts more easily accessible for users! Something I noticed while using the hunting tool is that the file paths provided by the search command assume the working directory is the hunting directory. However, if you're using the tool from the root directory, detection_rules, then these links aren't as convenient to copy-paste > cat file. Maybe changing the links to be the full path starting at the root directory? This way even if someone is operating in the hunting folder, or another sub-folder, it's easier to copy the portion of the path they need, rather than someone having to cd to the hunting directory before conveniently using the file path.
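One way to picture this suggestion: prefix each path stored in the index with the package folder before printing it. The sketch below assumes the index stores paths relative to the `hunting` folder (for example `./okta/queries/...`); the helper name `repo_relative` and the example file name are hypothetical.

```python
from pathlib import PurePosixPath


def repo_relative(index_path, package_dir="hunting"):
    """Turn a path relative to the hunting folder into one relative to the repo root."""
    # PurePosixPath drops the leading "./" when joining.
    return str(PurePosixPath(package_dir) / index_path)


printed = repo_relative("./okta/queries/example_hunt.toml")
```

With this, the printed path can be pasted after `cat` from the repository root, and someone working inside `hunting/` can still copy the part after the first folder.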

terrancedejesus commented 1 week ago

@Mikaayenson @shashank-elastic @eric-forte-elastic - I have addressed the feedback, as noted below. Additionally, Mika had mentioned making the package more modular, so I have updated the commands to use a class that can be instantiated, with methods that handle most of the logic called from within each command. Thanks again for the feedback!

Additional hunting tests
Check ESQL client
Check if get_hunt_path can be re-used elsewhere
Do we remain consistent with MITRE or not? (May not be required. The naming of the hunts is more important, as the file name is never displayed.)
Bug in search with LLM (fixed)
Add note in README about connecting to ES with config
Address guidelines for hunting
Textwrap tabulate output from running query (removed tabulate from run-query command)
Change the numbering so that it starts at 1
Update README for Detection Rules README
Check if we are introducing regressions
Add test-hunting-cli
Update makefile

terrancedejesus commented 1 week ago

Nice addition of summary @Mikaayenson !

terrancedejesus commented 6 days ago

@traut

> I think we should stop using click as a logging solution (we do the same in REACT). It's great for parsing CLI arguments (though I'm using typer in cortado for simplicity), but it lacks the flexibility and power a proper logging library would provide -- my favourite here is structlog.

We have discussed logging in detection rules for a while now and I agree, proper logging would be great for using the repo or the many supported features such as RTA, Detection Rules, Hunting, etc. This would be a good future improvement, but IMO not to be introduced in this PR.

> if we're setting type hints we might as well enforce the correctness with pyright. I see it in REACT as well. Without enforcement, they are just there for decoration :)

Agreed for enforcement, but out-of-scope for this PR if I understand. More as an overall repository addition since it requires introducing pyright to the repo, adding a config, adjusting CI/CD builds, etc.

elastic / detection-rules

[Hunting] Re-factor Hunting Library Code #4085